* [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells
From: Tariq Toukan @ 2025-09-16 14:11 UTC
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Jiri Pirko, Jonathan Corbet, Leon Romanovsky, Jason Gunthorpe,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
Hi,
This series by Cosmin adds support for using multiple doorbells in the
mlx5e driver.
See detailed description by Cosmin below [1].
Find V1 here:
https://lore.kernel.org/all/1757499891-596641-1-git-send-email-tariqt@nvidia.com/
Regards,
Tariq
V1->V2:
- added numbers to cover letter.
- fixed mlx5.rst nested list error.
- removed newline from NL_SET_ERR_MSG_FMT_MOD.
[1]
mlx5e uses a single MMIO-mapped doorbell per netdevice for all send and
receive operations. Writes to the doorbell go over the PCIe bus directly
to the device, which then services the indicated queues.
On certain architectures and with sufficiently high volume of doorbell
ringing (many cores, many active channels, small MTU, no GSO, etc.), the
MMIO-mapped doorbell address can become contended, leading to delays in
servicing writes to that address and a global slowdown of all traffic
for that netdevice.
mlx5 NICs have supported multiple doorbells for many years, and the
mlx5_ib driver for the same hardware has long been using them.
This patch series extends the mlx5 Ethernet driver to also use multiple
doorbells to solve the MMIO contention issues. By allocating and using
more doorbells for all channel queues (TX and RX), the MMIO contention
on any particular doorbell address is reduced significantly.
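As a rough illustration of the idea only (this is not the driver's
code), the sketch below spreads channels over a small pool of doorbells
and counts how many channels end up sharing the busiest one. The
round-robin policy and the names toy_pick_doorbell(), toy_max_sharers()
and TOY_MAX_DOORBELLS are assumptions made for the example.

```c
/* Hypothetical upper bound on the doorbell pool, for the toy model only. */
#define TOY_MAX_DOORBELLS 64

/* Assumed policy: round-robin by channel index. */
static unsigned int toy_pick_doorbell(unsigned int channel_ix,
				      unsigned int num_doorbells)
{
	/* With zero or one doorbell, everything maps to doorbell 0. */
	if (num_doorbells <= 1)
		return 0;
	return channel_ix % num_doorbells;
}

/* Return how many channels share the most heavily used doorbell. */
static unsigned int toy_max_sharers(unsigned int num_channels,
				    unsigned int num_doorbells)
{
	unsigned int counts[TOY_MAX_DOORBELLS] = { 0 };
	unsigned int i, db, max = 0;

	/* Clamp so the counters array is never indexed out of bounds. */
	if (num_doorbells > TOY_MAX_DOORBELLS)
		num_doorbells = TOY_MAX_DOORBELLS;

	for (i = 0; i < num_channels; i++) {
		db = toy_pick_doorbell(i, num_doorbells);
		if (++counts[db] > max)
			max = counts[db];
	}
	return max;
}
```

Under this assumed policy, 64 channels on one doorbell all write to the
same address, while 64 channels on 8 doorbells leave 8 writers per
address.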
The first patches are cleanups:
net/mlx5: Fix typo of MLX5_EQ_DOORBEL_OFFSET
net/mlx5: Remove unused 'offset' field from mlx5_sq_bfreg
net/mlx5e: Remove unused 'xsk' param of mlx5e_build_xdpsq_param
The next patch separates the global doorbell from Ethernet-specific
resources:
net/mlx5: Store the global doorbell in mlx5_priv
Next, plumbing to allow a different doorbell to be used for channel TX
and RX queues:
net/mlx5e: Prepare for using multiple TX doorbells
net/mlx5e: Prepare for using different CQ doorbells
Then, enable using multiple doorbells for channel queues:
net/mlx5e: Use multiple TX doorbells
net/mlx5e: Use multiple CQ doorbells
Finally, introduce a devlink parameter to control this:
devlink: Add a 'num_doorbells' driverinit param
net/mlx5e: Use the 'num_doorbells' devlink param
Some performance results, measured with the Linux pktgen script running
back-to-back over ConnectX-8 NICs:
samples/pktgen/pktgen_sample02_multiqueue.sh -i $NIC -s 64 -d $DST_IP \
-m $MAC -t 64
Baseline (1 doorbell): 9 Mpps
This series (8 doorbells): 56 Mpps
Note that pktgen without 'burst' rings the doorbell after every packet,
while real packet TX using NAPI usually batches multiple pending packets
with the xmit_more mechanism. So this is in essence a micro-benchmark
showcasing the improvement of using multiple doorbells on platforms
affected by MMIO contention. Real life traffic usually sees little
movement either way.
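To make the batching point concrete, here is a small back-of-the-envelope
helper. It is a simplification, not a model of the real TX path: it
assumes a fixed batch size, where batch == 1 stands for pktgen without
'burst' and larger values loosely stand for xmit_more batching. The name
toy_doorbell_writes() is hypothetical.

```c
/*
 * Number of doorbell writes needed to send num_packets when up to
 * 'batch' packets are posted before each doorbell ring.
 */
static unsigned long toy_doorbell_writes(unsigned long num_packets,
					 unsigned long batch)
{
	/* Treat a zero batch as "ring after every packet". */
	if (batch == 0)
		batch = 1;

	/* Round up: a partial final batch still needs one ring. */
	return (num_packets + batch - 1) / batch;
}
```

For 1000 packets this gives 1000 doorbell writes with batch == 1 and 63
with batch == 16, which is why the per-packet pktgen case stresses the
doorbell far more than typical batched traffic.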
Cosmin Ratiu (10):
net/mlx5: Fix typo of MLX5_EQ_DOORBEL_OFFSET
net/mlx5: Remove unused 'offset' field from mlx5_sq_bfreg
net/mlx5e: Remove unused 'xsk' param of mlx5e_build_xdpsq_param
net/mlx5: Store the global doorbell in mlx5_priv
net/mlx5e: Prepare for using multiple TX doorbells
net/mlx5e: Prepare for using different CQ doorbells
net/mlx5e: Use multiple TX doorbells
net/mlx5e: Use multiple CQ doorbells
devlink: Add a 'num_doorbells' driverinit param
net/mlx5e: Use the 'num_doorbells' devlink param
.../networking/devlink/devlink-params.rst | 3 ++
Documentation/networking/devlink/mlx5.rst | 9 ++++
drivers/infiniband/hw/mlx5/cq.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/cq.c | 1 -
.../net/ethernet/mellanox/mlx5/core/devlink.c | 26 +++++++++++
drivers/net/ethernet/mellanox/mlx5/core/en.h | 3 ++
.../ethernet/mellanox/mlx5/core/en/params.c | 6 +--
.../ethernet/mellanox/mlx5/core/en/params.h | 2 +-
.../net/ethernet/mellanox/mlx5/core/en/ptp.c | 6 ++-
.../net/ethernet/mellanox/mlx5/core/en/ptp.h | 1 +
.../net/ethernet/mellanox/mlx5/core/en/trap.c | 1 +
.../net/ethernet/mellanox/mlx5/core/en/txrx.h | 5 +--
.../mellanox/mlx5/core/en/xsk/setup.c | 2 +-
.../ethernet/mellanox/mlx5/core/en_common.c | 45 ++++++++++++++++---
.../net/ethernet/mellanox/mlx5/core/en_main.c | 37 ++++++++++++---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 8 ++--
.../ethernet/mellanox/mlx5/core/fpga/conn.c | 1 -
.../net/ethernet/mellanox/mlx5/core/lib/aso.c | 8 ++--
.../net/ethernet/mellanox/mlx5/core/main.c | 11 +++--
.../mellanox/mlx5/core/steering/hws/send.c | 8 ++--
.../mellanox/mlx5/core/steering/sws/dr_send.c | 1 -
drivers/net/ethernet/mellanox/mlx5/core/wc.c | 16 ++++---
include/linux/mlx5/cq.h | 1 -
include/linux/mlx5/driver.h | 8 ++--
include/net/devlink.h | 4 ++
net/devlink/param.c | 5 +++
26 files changed, 163 insertions(+), 59 deletions(-)
base-commit: a4ab91f470c5c84ce0c17197c1562d29aa032340
--
2.31.1
* [PATCH net-next V2 01/10] net/mlx5: Fix typo of MLX5_EQ_DOORBEL_OFFSET
From: Tariq Toukan @ 2025-09-16 14:11 UTC
From: Cosmin Ratiu <cratiu@nvidia.com>
Also convert it to a simple define.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 1ab77159409d..f3c714ebd9cb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -32,9 +32,7 @@ enum {
MLX5_EQ_STATE_ALWAYS_ARMED = 0xb,
};
-enum {
- MLX5_EQ_DOORBEL_OFFSET = 0x40,
-};
+#define MLX5_EQ_DOORBELL_OFFSET 0x40
/* budget must be smaller than MLX5_NUM_SPARE_EQE to guarantee that we update
* the ci before we polled all the entries in the EQ. MLX5_NUM_SPARE_EQE is
@@ -322,7 +320,7 @@ create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
eq->eqn = MLX5_GET(create_eq_out, out, eq_number);
eq->irqn = pci_irq_vector(dev->pdev, vecidx);
eq->dev = dev;
- eq->doorbell = priv->uar->map + MLX5_EQ_DOORBEL_OFFSET;
+ eq->doorbell = priv->uar->map + MLX5_EQ_DOORBELL_OFFSET;
err = mlx5_debug_eq_add(dev, eq);
if (err)
--
2.31.1
* [PATCH net-next V2 02/10] net/mlx5: Remove unused 'offset' field from mlx5_sq_bfreg
From: Tariq Toukan @ 2025-09-16 14:11 UTC
From: Cosmin Ratiu <cratiu@nvidia.com>
The 'offset' field was introduced in the original commit [1] and never
used until commit [2], which added an unnecessary use.
Remove the field and refactor the write-combining test to use a local
variable instead.
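The offset alternates between the two halves of the blue flame register
by XOR-ing it with the half-buffer size. A minimal standalone sketch of
that toggle with a caller-owned offset follows; toy_post() is a
hypothetical stand-in, and the actual MMIO copy is left out.

```c
/*
 * Return the offset the current write would target, then flip the
 * caller's offset to the other half for the next write.
 */
static unsigned int toy_post(unsigned int *offset, unsigned int buf_size)
{
	unsigned int used = *offset;

	/* Toggles between 0 and buf_size when starting from 0. */
	*offset ^= buf_size;
	return used;
}
```

Starting from 0 with a half size of 256, successive calls target offsets
0, 256, 0, 256, and so on, without any state kept in the bfreg itself.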
[1] commit a6d51b68611e ("net/mlx5: Introduce blue flame register
allocator")
[2] commit d98995b4bf98 ("net/mlx5: Reimplement write combining test")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/wc.c | 12 +++++++-----
include/linux/mlx5/driver.h | 1 -
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wc.c b/drivers/net/ethernet/mellanox/mlx5/core/wc.c
index 2f0316616fa4..276594586404 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wc.c
@@ -255,7 +255,8 @@ static void mlx5_wc_destroy_sq(struct mlx5_wc_sq *sq)
mlx5_wq_destroy(&sq->wq_ctrl);
}
-static void mlx5_wc_post_nop(struct mlx5_wc_sq *sq, bool signaled)
+static void mlx5_wc_post_nop(struct mlx5_wc_sq *sq, unsigned int *offset,
+ bool signaled)
{
int buf_size = (1 << MLX5_CAP_GEN(sq->cq.mdev, log_bf_reg_size)) / 2;
struct mlx5_wqe_ctrl_seg *ctrl;
@@ -288,10 +289,10 @@ static void mlx5_wc_post_nop(struct mlx5_wc_sq *sq, bool signaled)
*/
wmb();
- __iowrite64_copy(sq->bfreg.map + sq->bfreg.offset, mmio_wqe,
+ __iowrite64_copy(sq->bfreg.map + *offset, mmio_wqe,
sizeof(mmio_wqe) / 8);
- sq->bfreg.offset ^= buf_size;
+ *offset ^= buf_size;
}
static int mlx5_wc_poll_cq(struct mlx5_wc_sq *sq)
@@ -332,6 +333,7 @@ static int mlx5_wc_poll_cq(struct mlx5_wc_sq *sq)
static void mlx5_core_test_wc(struct mlx5_core_dev *mdev)
{
+ unsigned int offset = 0;
unsigned long expires;
struct mlx5_wc_sq *sq;
int i, err;
@@ -358,9 +360,9 @@ static void mlx5_core_test_wc(struct mlx5_core_dev *mdev)
goto err_create_sq;
for (i = 0; i < TEST_WC_NUM_WQES - 1; i++)
- mlx5_wc_post_nop(sq, false);
+ mlx5_wc_post_nop(sq, &offset, false);
- mlx5_wc_post_nop(sq, true);
+ mlx5_wc_post_nop(sq, &offset, true);
expires = jiffies + TEST_WC_POLLING_MAX_TIME_JIFFIES;
do {
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index fcfc18bfeba9..5a85b6d91ba3 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -434,7 +434,6 @@ struct mlx5_sq_bfreg {
struct mlx5_uars_page *up;
bool wc;
u32 index;
- unsigned int offset;
};
struct mlx5_core_health {
--
2.31.1
* [PATCH net-next V2 03/10] net/mlx5e: Remove unused 'xsk' param of mlx5e_build_xdpsq_param
From: Tariq Toukan @ 2025-09-16 14:11 UTC
From: Cosmin Ratiu <cratiu@nvidia.com>
This was added in commit [1], but its only use was removed in commit [2].
The parameter is unused, so remove it from the function parameter list.
[1] commit 9ded70fa1d81 ("net/mlx5e: Don't prefill WQEs in XDP SQ in the
multi buffer mode")
[2] commit 1a9304859b3a ("net/mlx5: XDP, Enable TX side XDP multi-buffer
support")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 3 +--
drivers/net/ethernet/mellanox/mlx5/core/en/params.h | 1 -
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c | 2 +-
3 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 3cca06a74cf9..31e7f59bc19b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -1229,7 +1229,6 @@ static void mlx5e_build_async_icosq_param(struct mlx5_core_dev *mdev,
void mlx5e_build_xdpsq_param(struct mlx5_core_dev *mdev,
struct mlx5e_params *params,
- struct mlx5e_xsk_param *xsk,
struct mlx5e_sq_param *param)
{
void *sqc = param->sqc;
@@ -1256,7 +1255,7 @@ int mlx5e_build_channel_param(struct mlx5_core_dev *mdev,
async_icosq_log_wq_sz = mlx5e_build_async_icosq_log_wq_sz(mdev);
mlx5e_build_sq_param(mdev, params, &cparam->txq_sq);
- mlx5e_build_xdpsq_param(mdev, params, NULL, &cparam->xdp_sq);
+ mlx5e_build_xdpsq_param(mdev, params, &cparam->xdp_sq);
mlx5e_build_icosq_param(mdev, icosq_log_wq_sz, &cparam->icosq);
mlx5e_build_async_icosq_param(mdev, async_icosq_log_wq_sz, &cparam->async_icosq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
index 488ccdbc1e2c..e3edf79dde5f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -132,7 +132,6 @@ void mlx5e_build_tx_cq_param(struct mlx5_core_dev *mdev,
struct mlx5e_cq_param *param);
void mlx5e_build_xdpsq_param(struct mlx5_core_dev *mdev,
struct mlx5e_params *params,
- struct mlx5e_xsk_param *xsk,
struct mlx5e_sq_param *param);
int mlx5e_build_channel_param(struct mlx5_core_dev *mdev,
struct mlx5e_params *params,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
index d743e823362a..dbd88eb5c082 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c
@@ -54,7 +54,7 @@ static void mlx5e_build_xsk_cparam(struct mlx5_core_dev *mdev,
struct mlx5e_channel_param *cparam)
{
mlx5e_build_rq_param(mdev, params, xsk, &cparam->rq);
- mlx5e_build_xdpsq_param(mdev, params, xsk, &cparam->xdp_sq);
+ mlx5e_build_xdpsq_param(mdev, params, &cparam->xdp_sq);
}
static int mlx5e_init_xsk_rq(struct mlx5e_channel *c,
--
2.31.1
* [PATCH net-next V2 04/10] net/mlx5: Store the global doorbell in mlx5_priv
From: Tariq Toukan @ 2025-09-16 14:11 UTC
From: Cosmin Ratiu <cratiu@nvidia.com>
The global doorbell is used for more than just Ethernet resources, so
move it out of mlx5e_hw_objs into a common place (mlx5_priv), to avoid
non-Ethernet modules (e.g. HWS, ASO) depending on Ethernet structs.
Use this opportunity to consolidate it with the 'uar' pointer already
there, which was used as an RX doorbell. Underneath, the 'uar' pointer is
identical to 'bfreg->up', so store a single resource and use that
instead.
For CQ doorbells, care is taken to always use bfreg->up->index instead
of bfreg->index, which may refer to a subsequent UAR page from the same
ALLOC_UAR batch on some NICs.
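The distinction can be pictured with a toy model. The structs below are
simplified stand-ins, not the real mlx5 definitions, and the way the
bfreg index is derived (page index plus an offset within the batch,
with TOY_BFREGS_PER_UAR registers per page) is an assumption made for
illustration.

```c
#include <stdint.h>

/* Assumed number of blue flame registers per UAR page (toy value). */
#define TOY_BFREGS_PER_UAR 4

struct toy_uars_page {
	uint32_t index;			/* first UAR page of the batch */
};

struct toy_sq_bfreg {
	struct toy_uars_page *up;	/* owning UAR page batch */
	uint32_t index;			/* page this bfreg lives on */
};

/* Assumed derivation: a high 'dbi' lands on a later page of the batch. */
static void toy_init_bfreg(struct toy_sq_bfreg *bfreg,
			   struct toy_uars_page *up, unsigned int dbi)
{
	bfreg->up = up;
	bfreg->index = up->index + dbi / TOY_BFREGS_PER_UAR;
}

/* CQ contexts take the page index from the owning page. */
static uint32_t toy_cq_uar_page(const struct toy_sq_bfreg *bfreg)
{
	return bfreg->up->index;
}

/* SQ contexts take the bfreg's own index. */
static uint32_t toy_sq_uar_page(const struct toy_sq_bfreg *bfreg)
{
	return bfreg->index;
}
```

In this model the two values agree for the first registers of a batch
and diverge once 'dbi' crosses into a subsequent page, which is the case
the paragraph above guards against.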
This paves the way for cleanly supporting multiple doorbells in the
Ethernet driver.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/infiniband/hw/mlx5/cq.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/cq.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en_common.c | 11 +----------
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +++++-----
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/lib/aso.c | 8 ++++----
drivers/net/ethernet/mellanox/mlx5/core/main.c | 11 +++++------
.../ethernet/mellanox/mlx5/core/steering/hws/send.c | 8 ++++----
drivers/net/ethernet/mellanox/mlx5/core/wc.c | 4 ++--
include/linux/mlx5/driver.h | 3 +--
12 files changed, 29 insertions(+), 40 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 9c8003a78334..a23b364e24ff 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -648,7 +648,7 @@ int mlx5_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
{
struct mlx5_core_dev *mdev = to_mdev(ibcq->device)->mdev;
struct mlx5_ib_cq *cq = to_mcq(ibcq);
- void __iomem *uar_page = mdev->priv.uar->map;
+ void __iomem *uar_page = mdev->priv.bfreg.up->map;
unsigned long irq_flags;
int ret = 0;
@@ -923,7 +923,7 @@ static int create_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq,
cq->buf.frag_buf.page_shift -
MLX5_ADAPTER_PAGE_SHIFT);
- *index = dev->mdev->priv.uar->index;
+ *index = dev->mdev->priv.bfreg.up->index;
return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cq.c b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
index 1fd403713baf..35039a95dcfd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
@@ -145,7 +145,7 @@ int mlx5_create_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
mlx5_core_dbg(dev, "failed adding CP 0x%x to debug file system\n",
cq->cqn);
- cq->uar = dev->priv.uar;
+ cq->uar = dev->priv.bfreg.up;
cq->irqn = eq->core.irqn;
return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 31e7f59bc19b..b6b4ae7c59fa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -810,7 +810,7 @@ static void mlx5e_build_common_cq_param(struct mlx5_core_dev *mdev,
{
void *cqc = param->cqc;
- MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index);
+ MLX5_SET(cqc, cqc, uar_page, mdev->priv.bfreg.up->index);
if (MLX5_CAP_GEN(mdev, cqe_128_always) && cache_line_size() >= 128)
MLX5_SET(cqc, cqc, cqe_sz, CQE_STRIDE_128_PAD);
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
index 391b4e9c9dc4..7c1d9a9ea464 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
@@ -334,7 +334,7 @@ static int mlx5e_ptp_alloc_txqsq(struct mlx5e_ptp *c, int txq_ix,
sq->mdev = mdev;
sq->ch_ix = MLX5E_PTP_CHANNEL_IX;
sq->txq_ix = txq_ix;
- sq->uar_map = mdev->mlx5e_res.hw_objs.bfreg.map;
+ sq->uar_map = mdev->priv.bfreg.map;
sq->min_inline_mode = params->tx_min_inline_mode;
sq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
sq->stats = &c->priv->ptp_stats.sq[tc];
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index 6ed3a32b7e22..e9e36358c39d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -163,17 +163,11 @@ int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev, bool create_tises)
goto err_dealloc_transport_domain;
}
- err = mlx5_alloc_bfreg(mdev, &res->bfreg, false, false);
- if (err) {
- mlx5_core_err(mdev, "alloc bfreg failed, %d\n", err);
- goto err_destroy_mkey;
- }
-
if (create_tises) {
err = mlx5e_create_tises(mdev, res->tisn);
if (err) {
mlx5_core_err(mdev, "alloc tises failed, %d\n", err);
- goto err_destroy_bfreg;
+ goto err_destroy_mkey;
}
res->tisn_valid = true;
}
@@ -190,8 +184,6 @@ int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev, bool create_tises)
return 0;
-err_destroy_bfreg:
- mlx5_free_bfreg(mdev, &res->bfreg);
err_destroy_mkey:
mlx5_core_destroy_mkey(mdev, res->mkey);
err_dealloc_transport_domain:
@@ -209,7 +201,6 @@ void mlx5e_destroy_mdev_resources(struct mlx5_core_dev *mdev)
mdev->mlx5e_res.dek_priv = NULL;
if (res->tisn_valid)
mlx5e_destroy_tises(mdev, res->tisn);
- mlx5_free_bfreg(mdev, &res->bfreg);
mlx5_core_destroy_mkey(mdev, res->mkey);
mlx5_core_dealloc_transport_domain(mdev, res->td.tdn);
mlx5_core_dealloc_pd(mdev, res->pdn);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 714cce595692..02a538ec2ecb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1532,7 +1532,7 @@ static int mlx5e_alloc_xdpsq(struct mlx5e_channel *c,
sq->pdev = c->pdev;
sq->mkey_be = c->mkey_be;
sq->channel = c;
- sq->uar_map = mdev->mlx5e_res.hw_objs.bfreg.map;
+ sq->uar_map = mdev->priv.bfreg.map;
sq->min_inline_mode = params->tx_min_inline_mode;
sq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu) - ETH_FCS_LEN;
sq->xsk_pool = xsk_pool;
@@ -1617,7 +1617,7 @@ static int mlx5e_alloc_icosq(struct mlx5e_channel *c,
int err;
sq->channel = c;
- sq->uar_map = mdev->mlx5e_res.hw_objs.bfreg.map;
+ sq->uar_map = mdev->priv.bfreg.map;
sq->reserved_room = param->stop_room;
param->wq.db_numa_node = cpu_to_node(c->cpu);
@@ -1702,7 +1702,7 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
sq->priv = c->priv;
sq->ch_ix = c->ix;
sq->txq_ix = txq_ix;
- sq->uar_map = mdev->mlx5e_res.hw_objs.bfreg.map;
+ sq->uar_map = mdev->priv.bfreg.map;
sq->min_inline_mode = params->tx_min_inline_mode;
sq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
sq->max_sq_mpw_wqebbs = mlx5e_get_max_sq_aligned_wqebbs(mdev);
@@ -1778,7 +1778,7 @@ static int mlx5e_create_sq(struct mlx5_core_dev *mdev,
MLX5_SET(sqc, sqc, flush_in_error_en, 1);
MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
- MLX5_SET(wq, wq, uar_page, mdev->mlx5e_res.hw_objs.bfreg.index);
+ MLX5_SET(wq, wq, uar_page, mdev->priv.bfreg.index);
MLX5_SET(wq, wq, log_wq_pg_sz, csp->wq_ctrl->buf.page_shift -
MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(wq, wq, dbr_addr, csp->wq_ctrl->db.dma);
@@ -2273,7 +2273,7 @@ static int mlx5e_create_cq(struct mlx5e_cq *cq, struct mlx5e_cq_param *param)
MLX5_SET(cqc, cqc, cq_period_mode, mlx5e_cq_period_mode(param->cq_period_mode));
MLX5_SET(cqc, cqc, c_eqn_or_apu_element, eqn);
- MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index);
+ MLX5_SET(cqc, cqc, uar_page, mdev->priv.bfreg.up->index);
MLX5_SET(cqc, cqc, log_page_size, cq->wq_ctrl.buf.page_shift -
MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(cqc, cqc, dbr_addr, cq->wq_ctrl.db.dma);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index f3c714ebd9cb..25499da177bc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -307,7 +307,7 @@ create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
eqc = MLX5_ADDR_OF(create_eq_in, in, eq_context_entry);
MLX5_SET(eqc, eqc, log_eq_size, eq->fbc.log_sz);
- MLX5_SET(eqc, eqc, uar_page, priv->uar->index);
+ MLX5_SET(eqc, eqc, uar_page, priv->bfreg.up->index);
MLX5_SET(eqc, eqc, intr, vecidx);
MLX5_SET(eqc, eqc, log_page_size,
eq->frag_buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT);
@@ -320,7 +320,7 @@ create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
eq->eqn = MLX5_GET(create_eq_out, out, eq_number);
eq->irqn = pci_irq_vector(dev->pdev, vecidx);
eq->dev = dev;
- eq->doorbell = priv->uar->map + MLX5_EQ_DOORBELL_OFFSET;
+ eq->doorbell = priv->bfreg.up->map + MLX5_EQ_DOORBELL_OFFSET;
err = mlx5_debug_eq_add(dev, eq);
if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/aso.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/aso.c
index 58bd749b5e4d..129725159a93 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/aso.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/aso.c
@@ -100,7 +100,7 @@ static int create_aso_cq(struct mlx5_aso_cq *cq, void *cqc_data)
MLX5_SET(cqc, cqc, cq_period_mode, MLX5_CQ_PERIOD_MODE_START_FROM_EQE);
MLX5_SET(cqc, cqc, c_eqn_or_apu_element, eqn);
- MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index);
+ MLX5_SET(cqc, cqc, uar_page, mdev->priv.bfreg.up->index);
MLX5_SET(cqc, cqc, log_page_size, cq->wq_ctrl.buf.page_shift -
MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(cqc, cqc, dbr_addr, cq->wq_ctrl.db.dma);
@@ -129,7 +129,7 @@ static int mlx5_aso_create_cq(struct mlx5_core_dev *mdev, int numa_node,
return -ENOMEM;
MLX5_SET(cqc, cqc_data, log_cq_size, 1);
- MLX5_SET(cqc, cqc_data, uar_page, mdev->priv.uar->index);
+ MLX5_SET(cqc, cqc_data, uar_page, mdev->priv.bfreg.up->index);
if (MLX5_CAP_GEN(mdev, cqe_128_always) && cache_line_size() >= 128)
MLX5_SET(cqc, cqc_data, cqe_sz, CQE_STRIDE_128_PAD);
@@ -163,7 +163,7 @@ static int mlx5_aso_alloc_sq(struct mlx5_core_dev *mdev, int numa_node,
struct mlx5_wq_param param;
int err;
- sq->uar_map = mdev->mlx5e_res.hw_objs.bfreg.map;
+ sq->uar_map = mdev->priv.bfreg.map;
param.db_numa_node = numa_node;
param.buf_numa_node = numa_node;
@@ -203,7 +203,7 @@ static int create_aso_sq(struct mlx5_core_dev *mdev, int pdn,
MLX5_SET(sqc, sqc, ts_format, ts_format);
MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
- MLX5_SET(wq, wq, uar_page, mdev->mlx5e_res.hw_objs.bfreg.index);
+ MLX5_SET(wq, wq, uar_page, mdev->priv.bfreg.index);
MLX5_SET(wq, wq, log_wq_pg_sz, sq->wq_ctrl.buf.page_shift -
MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(wq, wq, dbr_addr, sq->wq_ctrl.db.dma);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 0951c7cc1b5f..89b224d76186 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1340,10 +1340,9 @@ static int mlx5_load(struct mlx5_core_dev *dev)
{
int err;
- dev->priv.uar = mlx5_get_uars_page(dev);
- if (IS_ERR(dev->priv.uar)) {
- mlx5_core_err(dev, "Failed allocating uar, aborting\n");
- err = PTR_ERR(dev->priv.uar);
+ err = mlx5_alloc_bfreg(dev, &dev->priv.bfreg, false, false);
+ if (err) {
+ mlx5_core_err(dev, "Failed allocating bfreg, %d\n", err);
return err;
}
@@ -1454,7 +1453,7 @@ static int mlx5_load(struct mlx5_core_dev *dev)
err_irq_table:
mlx5_pagealloc_stop(dev);
mlx5_events_stop(dev);
- mlx5_put_uars_page(dev, dev->priv.uar);
+ mlx5_free_bfreg(dev, &dev->priv.bfreg);
return err;
}
@@ -1479,7 +1478,7 @@ static void mlx5_unload(struct mlx5_core_dev *dev)
mlx5_irq_table_destroy(dev);
mlx5_pagealloc_stop(dev);
mlx5_events_stop(dev);
- mlx5_put_uars_page(dev, dev->priv.uar);
+ mlx5_free_bfreg(dev, &dev->priv.bfreg);
}
int mlx5_init_one_devl_locked(struct mlx5_core_dev *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c
index b0595c9b09e4..24ef7d66fa8a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/hws/send.c
@@ -690,7 +690,7 @@ static int hws_send_ring_alloc_sq(struct mlx5_core_dev *mdev,
size_t buf_sz;
int err;
- sq->uar_map = mdev->mlx5e_res.hw_objs.bfreg.map;
+ sq->uar_map = mdev->priv.bfreg.map;
sq->mdev = mdev;
param.db_numa_node = numa_node;
@@ -764,7 +764,7 @@ static int hws_send_ring_create_sq(struct mlx5_core_dev *mdev, u32 pdn,
MLX5_SET(sqc, sqc, ts_format, ts_format);
MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
- MLX5_SET(wq, wq, uar_page, mdev->mlx5e_res.hw_objs.bfreg.index);
+ MLX5_SET(wq, wq, uar_page, mdev->priv.bfreg.index);
MLX5_SET(wq, wq, log_wq_pg_sz, sq->wq_ctrl.buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(wq, wq, dbr_addr, sq->wq_ctrl.db.dma);
@@ -940,7 +940,7 @@ static int hws_send_ring_create_cq(struct mlx5_core_dev *mdev,
(__be64 *)MLX5_ADDR_OF(create_cq_in, in, pas));
MLX5_SET(cqc, cqc, c_eqn_or_apu_element, eqn);
- MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index);
+ MLX5_SET(cqc, cqc, uar_page, mdev->priv.bfreg.up->index);
MLX5_SET(cqc, cqc, log_page_size, cq->wq_ctrl.buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(cqc, cqc, dbr_addr, cq->wq_ctrl.db.dma);
@@ -963,7 +963,7 @@ static int hws_send_ring_open_cq(struct mlx5_core_dev *mdev,
if (!cqc_data)
return -ENOMEM;
- MLX5_SET(cqc, cqc_data, uar_page, mdev->priv.uar->index);
+ MLX5_SET(cqc, cqc_data, uar_page, mdev->priv.bfreg.up->index);
MLX5_SET(cqc, cqc_data, log_cq_size, ilog2(queue->num_entries));
err = hws_send_ring_alloc_cq(mdev, numa_node, queue, cqc_data, cq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wc.c b/drivers/net/ethernet/mellanox/mlx5/core/wc.c
index 276594586404..999d6216648a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wc.c
@@ -94,7 +94,7 @@ static int create_wc_cq(struct mlx5_wc_cq *cq, void *cqc_data)
MLX5_SET(cqc, cqc, cq_period_mode, MLX5_CQ_PERIOD_MODE_START_FROM_EQE);
MLX5_SET(cqc, cqc, c_eqn_or_apu_element, eqn);
- MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index);
+ MLX5_SET(cqc, cqc, uar_page, mdev->priv.bfreg.up->index);
MLX5_SET(cqc, cqc, log_page_size, cq->wq_ctrl.buf.page_shift -
MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(cqc, cqc, dbr_addr, cq->wq_ctrl.db.dma);
@@ -116,7 +116,7 @@ static int mlx5_wc_create_cq(struct mlx5_core_dev *mdev, struct mlx5_wc_cq *cq)
return -ENOMEM;
MLX5_SET(cqc, cqc, log_cq_size, TEST_WC_LOG_CQ_SZ);
- MLX5_SET(cqc, cqc, uar_page, mdev->priv.uar->index);
+ MLX5_SET(cqc, cqc, uar_page, mdev->priv.bfreg.up->index);
if (MLX5_CAP_GEN(mdev, cqe_128_always) && cache_line_size() >= 128)
MLX5_SET(cqc, cqc, cqe_sz, CQE_STRIDE_128_PAD);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 5a85b6d91ba3..15c434fedff7 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -612,7 +612,7 @@ struct mlx5_priv {
struct mlx5_ft_pool *ft_pool;
struct mlx5_bfreg_data bfregs;
- struct mlx5_uars_page *uar;
+ struct mlx5_sq_bfreg bfreg;
#ifdef CONFIG_MLX5_SF
struct mlx5_vhca_state_notifier *vhca_state_notifier;
struct mlx5_sf_dev_table *sf_dev_table;
@@ -658,7 +658,6 @@ struct mlx5e_resources {
u32 pdn;
struct mlx5_td td;
u32 mkey;
- struct mlx5_sq_bfreg bfreg;
#define MLX5_MAX_NUM_TC 8
u32 tisn[MLX5_MAX_PORTS][MLX5_MAX_NUM_TC];
bool tisn_valid;
--
2.31.1
* [PATCH net-next V2 05/10] net/mlx5e: Prepare for using multiple TX doorbells
From: Tariq Toukan @ 2025-09-16 14:11 UTC
From: Cosmin Ratiu <cratiu@nvidia.com>
The driver allocates a single doorbell per device and uses
it for all Send Queues (SQs). This can become a bottleneck due to the
high number of concurrent MMIO accesses when ringing the same doorbell
from many channels.
This patch makes the doorbells used by channel queues configurable.
mlx5e_channel_pick_doorbell() is added to select the doorbell to be used
for a given channel, picking the default for now.
When opening a channel, the selected doorbell is saved to the channel
struct and used whenever channel-related queues are created.
Finally, 'uar_page' is added to 'struct mlx5e_create_sq_param' to
control which doorbell to use when allocating an SQ, since that can
happen outside channel context (e.g. for PTP).
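As a rough illustration of the plumbing described above (not the driver code
itself), the sketch below models a channel that caches its doorbell when it is
opened and forwards the doorbell's page index through the SQ creation
parameter. All 'toy_*' names are hypothetical stand-ins for the kernel
structures, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mlx5_sq_bfreg: a doorbell with a page index. */
struct toy_bfreg {
	uint32_t index;
};

/* Hypothetical stand-in for the core device, which owns the global doorbell. */
struct toy_dev {
	struct toy_bfreg global_bfreg;
};

/* Hypothetical stand-in for struct mlx5e_channel. */
struct toy_channel {
	struct toy_dev *dev;
	struct toy_bfreg *bfreg; /* selected once, when the channel is opened */
};

/* Hypothetical stand-in for struct mlx5e_create_sq_param. */
struct toy_create_sq_param {
	uint32_t uar_page;
};

/* At this stage of the series every channel still picks the default doorbell. */
static void toy_channel_pick_doorbell(struct toy_channel *c)
{
	c->bfreg = &c->dev->global_bfreg;
}

/* SQ creation takes the doorbell from the parameter, not from the device. */
static void toy_fill_sq_param(const struct toy_channel *c,
			      struct toy_create_sq_param *csp)
{
	assert(c->bfreg != NULL); /* the doorbell must be picked first */
	csp->uar_page = c->bfreg->index;
}
```

Passing the page index explicitly is what lets non-channel users such as PTP
supply their own doorbell without going through a channel.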
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
.../ethernet/mellanox/mlx5/core/en/params.h | 1 +
.../net/ethernet/mellanox/mlx5/core/en/ptp.c | 4 +++-
.../net/ethernet/mellanox/mlx5/core/en/ptp.h | 1 +
.../net/ethernet/mellanox/mlx5/core/en_main.c | 18 ++++++++++++++----
5 files changed, 20 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 0dd3bc0f4caa..9c73165653bf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -788,6 +788,7 @@ struct mlx5e_channel {
int vec_ix;
int sd_ix;
int cpu;
+ struct mlx5_sq_bfreg *bfreg;
/* Sync between icosq recovery and XSK enable/disable. */
struct mutex icosq_recovery_lock;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
index e3edf79dde5f..00617c65fe3c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -51,6 +51,7 @@ struct mlx5e_create_sq_param {
u32 tisn;
u8 tis_lst_sz;
u8 min_inline_mode;
+ u32 uar_page;
};
/* Striding RQ dynamic parameters */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
index 7c1d9a9ea464..a392578a063c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
@@ -334,7 +334,7 @@ static int mlx5e_ptp_alloc_txqsq(struct mlx5e_ptp *c, int txq_ix,
sq->mdev = mdev;
sq->ch_ix = MLX5E_PTP_CHANNEL_IX;
sq->txq_ix = txq_ix;
- sq->uar_map = mdev->priv.bfreg.map;
+ sq->uar_map = c->bfreg->map;
sq->min_inline_mode = params->tx_min_inline_mode;
sq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
sq->stats = &c->priv->ptp_stats.sq[tc];
@@ -486,6 +486,7 @@ static int mlx5e_ptp_open_txqsq(struct mlx5e_ptp *c, u32 tisn,
csp.wq_ctrl = &txqsq->wq_ctrl;
csp.min_inline_mode = txqsq->min_inline_mode;
csp.ts_cqe_to_dest_cqn = ptpsq->ts_cq.mcq.cqn;
+ csp.uar_page = c->bfreg->index;
err = mlx5e_create_sq_rdy(c->mdev, sqp, &csp, 0, &txqsq->sqn);
if (err)
@@ -900,6 +901,7 @@ int mlx5e_ptp_open(struct mlx5e_priv *priv, struct mlx5e_params *params,
c->num_tc = mlx5e_get_dcb_num_tc(params);
c->stats = &priv->ptp_stats.ch;
c->lag_port = lag_port;
+ c->bfreg = &mdev->priv.bfreg;
err = mlx5e_ptp_set_state(c, params);
if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h
index 883c044852f1..1b3c9648220b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h
@@ -66,6 +66,7 @@ struct mlx5e_ptp {
struct mlx5_core_dev *mdev;
struct hwtstamp_config *tstamp;
DECLARE_BITMAP(state, MLX5E_PTP_STATE_NUM_STATES);
+ struct mlx5_sq_bfreg *bfreg;
};
static inline bool mlx5e_use_ptpsq(struct sk_buff *skb)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 02a538ec2ecb..0425f0e3d3a0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1532,7 +1532,7 @@ static int mlx5e_alloc_xdpsq(struct mlx5e_channel *c,
sq->pdev = c->pdev;
sq->mkey_be = c->mkey_be;
sq->channel = c;
- sq->uar_map = mdev->priv.bfreg.map;
+ sq->uar_map = c->bfreg->map;
sq->min_inline_mode = params->tx_min_inline_mode;
sq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu) - ETH_FCS_LEN;
sq->xsk_pool = xsk_pool;
@@ -1617,7 +1617,7 @@ static int mlx5e_alloc_icosq(struct mlx5e_channel *c,
int err;
sq->channel = c;
- sq->uar_map = mdev->priv.bfreg.map;
+ sq->uar_map = c->bfreg->map;
sq->reserved_room = param->stop_room;
param->wq.db_numa_node = cpu_to_node(c->cpu);
@@ -1702,7 +1702,7 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
sq->priv = c->priv;
sq->ch_ix = c->ix;
sq->txq_ix = txq_ix;
- sq->uar_map = mdev->priv.bfreg.map;
+ sq->uar_map = c->bfreg->map;
sq->min_inline_mode = params->tx_min_inline_mode;
sq->hw_mtu = MLX5E_SW2HW_MTU(params, params->sw_mtu);
sq->max_sq_mpw_wqebbs = mlx5e_get_max_sq_aligned_wqebbs(mdev);
@@ -1778,7 +1778,7 @@ static int mlx5e_create_sq(struct mlx5_core_dev *mdev,
MLX5_SET(sqc, sqc, flush_in_error_en, 1);
MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
- MLX5_SET(wq, wq, uar_page, mdev->priv.bfreg.index);
+ MLX5_SET(wq, wq, uar_page, csp->uar_page);
MLX5_SET(wq, wq, log_wq_pg_sz, csp->wq_ctrl->buf.page_shift -
MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(wq, wq, dbr_addr, csp->wq_ctrl->db.dma);
@@ -1882,6 +1882,7 @@ int mlx5e_open_txqsq(struct mlx5e_channel *c, u32 tisn, int txq_ix,
csp.cqn = sq->cq.mcq.cqn;
csp.wq_ctrl = &sq->wq_ctrl;
csp.min_inline_mode = sq->min_inline_mode;
+ csp.uar_page = c->bfreg->index;
err = mlx5e_create_sq_rdy(c->mdev, param, &csp, qos_queue_group_id, &sq->sqn);
if (err)
goto err_free_txqsq;
@@ -2052,6 +2053,7 @@ static int mlx5e_open_icosq(struct mlx5e_channel *c, struct mlx5e_params *params
csp.cqn = sq->cq.mcq.cqn;
csp.wq_ctrl = &sq->wq_ctrl;
csp.min_inline_mode = params->tx_min_inline_mode;
+ csp.uar_page = c->bfreg->index;
err = mlx5e_create_sq_rdy(c->mdev, param, &csp, 0, &sq->sqn);
if (err)
goto err_free_icosq;
@@ -2112,6 +2114,7 @@ int mlx5e_open_xdpsq(struct mlx5e_channel *c, struct mlx5e_params *params,
csp.cqn = sq->cq.mcq.cqn;
csp.wq_ctrl = &sq->wq_ctrl;
csp.min_inline_mode = sq->min_inline_mode;
+ csp.uar_page = c->bfreg->index;
set_bit(MLX5E_SQ_STATE_ENABLED, &sq->state);
err = mlx5e_create_sq_rdy(c->mdev, param, &csp, 0, &sq->sqn);
@@ -2740,6 +2743,11 @@ void mlx5e_trigger_napi_sched(struct napi_struct *napi)
local_bh_enable();
}
+static void mlx5e_channel_pick_doorbell(struct mlx5e_channel *c)
+{
+ c->bfreg = &c->mdev->priv.bfreg;
+}
+
static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
struct mlx5e_params *params,
struct xsk_buff_pool *xsk_pool,
@@ -2794,6 +2802,8 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
c->aff_mask = irq_get_effective_affinity_mask(irq);
c->lag_port = mlx5e_enumerate_lag_port(mdev, ix);
+ mlx5e_channel_pick_doorbell(c);
+
netif_napi_add_config_locked(netdev, &c->napi, mlx5e_napi_poll, ix);
netif_napi_set_irq_locked(&c->napi, irq);
--
2.31.1
* [PATCH net-next V2 06/10] net/mlx5e: Prepare for using different CQ doorbells
2025-09-16 14:11 [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells Tariq Toukan
` (4 preceding siblings ...)
2025-09-16 14:11 ` [PATCH net-next V2 05/10] net/mlx5e: Prepare for using multiple TX doorbells Tariq Toukan
@ 2025-09-16 14:11 ` Tariq Toukan
2025-09-17 12:54 ` Simon Horman
2025-09-16 14:11 ` [PATCH net-next V2 07/10] net/mlx5e: Use multiple TX doorbells Tariq Toukan
` (4 subsequent siblings)
10 siblings, 1 reply; 22+ messages in thread
From: Tariq Toukan @ 2025-09-16 14:11 UTC (permalink / raw)
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Jiri Pirko, Jonathan Corbet, Leon Romanovsky, Jason Gunthorpe,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
From: Cosmin Ratiu <cratiu@nvidia.com>
Completion queues (CQs) in mlx5 use the same global doorbell, which may
become contended when accessed concurrently from many cores.
This patch prepares the CQ management code for supporting different
doorbells per CQ. This will be used in subsequent patches to allow
separate doorbells to be used by channel CQs.
The main change is moving the 'uar' pointer from struct mlx5_core_cq to
struct mlx5e_cq, as the uar page to be used is better off stored
directly there. Other users of mlx5_core_cq also store the UAR to be
used separately and therefore the pointer being removed is dead weight
for them. As evidence, this patch touches two users that set the
mcq.uar pointer but never used it: Software Steering and the old Innova
CQ creation code. Instead, they rang the doorbell directly through
another pointer.
The 'uar' pointer added to struct mlx5e_cq remains in a hot cacheline
(as before), because it may get accessed for each packet.
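The ownership change can be pictured with the simplified model below. It is
only a sketch: the 'toy_*' types are hypothetical stand-ins for
'struct mlx5_uars_page', 'struct mlx5_core_cq' and 'struct mlx5e_cq', and the
"doorbell register" is reduced to a plain field so the example stays
self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mlx5_uars_page. */
struct toy_uars_page {
	uint32_t index;
	uint32_t last_armed_ci; /* toy model of the CQ doorbell register */
};

/* Hypothetical stand-in for struct mlx5_core_cq: no doorbell pointer here. */
struct toy_core_cq {
	uint32_t cqn;
};

/* Hypothetical stand-in for struct mlx5e_cq: the Ethernet CQ owns the pointer. */
struct toy_en_cq {
	struct toy_uars_page *uar;
	struct toy_core_cq mcq;
	uint32_t cc; /* consumer counter */
};

/* The caller decides which UAR page a CQ rings, at allocation time. */
static void toy_alloc_cq(struct toy_en_cq *cq, struct toy_uars_page *uar,
			 uint32_t cqn)
{
	assert(uar != NULL);
	cq->uar = uar;
	cq->mcq.cqn = cqn;
	cq->cc = 0;
}

/* Arming goes through the CQ's own page rather than a device-wide one. */
static void toy_cq_arm(struct toy_en_cq *cq)
{
	cq->uar->last_armed_ci = cq->cc;
}
```

Because the page is an argument of the allocation helper, two CQs can be given
different pages without touching the core CQ structure.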
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/cq.c | 1 -
drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h | 5 +----
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +++++++---
drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c | 1 -
.../ethernet/mellanox/mlx5/core/steering/sws/dr_send.c | 1 -
include/linux/mlx5/cq.h | 1 -
7 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cq.c b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
index 35039a95dcfd..e9f319a9bdd6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
@@ -145,7 +145,6 @@ int mlx5_create_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
mlx5_core_dbg(dev, "failed adding CP 0x%x to debug file system\n",
cq->cqn);
- cq->uar = dev->priv.bfreg.up;
cq->irqn = eq->core.irqn;
return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 9c73165653bf..1cbe3f3037bb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -344,6 +344,7 @@ struct mlx5e_cq {
/* data path - accessed per napi poll */
u16 event_ctr;
struct napi_struct *napi;
+ struct mlx5_uars_page *uar;
struct mlx5_core_cq mcq;
struct mlx5e_ch_stats *ch_stats;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index 5dc04bbfc71b..6760bb0336df 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -309,10 +309,7 @@ mlx5e_notify_hw(struct mlx5_wq_cyc *wq, u16 pc, void __iomem *uar_map,
static inline void mlx5e_cq_arm(struct mlx5e_cq *cq)
{
- struct mlx5_core_cq *mcq;
-
- mcq = &cq->mcq;
- mlx5_cq_arm(mcq, MLX5_CQ_DB_REQ_NOT, mcq->uar->map, cq->wq.cc);
+ mlx5_cq_arm(&cq->mcq, MLX5_CQ_DB_REQ_NOT, cq->uar->map, cq->wq.cc);
}
static inline struct mlx5e_sq_dma *
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 0425f0e3d3a0..ef7598e048b2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2185,6 +2185,7 @@ static void mlx5e_close_xdpredirect_sq(struct mlx5e_xdpsq *xdpsq)
static int mlx5e_alloc_cq_common(struct mlx5_core_dev *mdev,
struct net_device *netdev,
struct workqueue_struct *workqueue,
+ struct mlx5_uars_page *uar,
struct mlx5e_cq_param *param,
struct mlx5e_cq *cq)
{
@@ -2216,6 +2217,7 @@ static int mlx5e_alloc_cq_common(struct mlx5_core_dev *mdev,
cq->mdev = mdev;
cq->netdev = netdev;
cq->workqueue = workqueue;
+ cq->uar = uar;
return 0;
}
@@ -2231,7 +2233,8 @@ static int mlx5e_alloc_cq(struct mlx5_core_dev *mdev,
param->wq.db_numa_node = ccp->node;
param->eq_ix = ccp->ix;
- err = mlx5e_alloc_cq_common(mdev, ccp->netdev, ccp->wq, param, cq);
+ err = mlx5e_alloc_cq_common(mdev, ccp->netdev, ccp->wq,
+ mdev->priv.bfreg.up, param, cq);
cq->napi = ccp->napi;
cq->ch_stats = ccp->ch_stats;
@@ -2276,7 +2279,7 @@ static int mlx5e_create_cq(struct mlx5e_cq *cq, struct mlx5e_cq_param *param)
MLX5_SET(cqc, cqc, cq_period_mode, mlx5e_cq_period_mode(param->cq_period_mode));
MLX5_SET(cqc, cqc, c_eqn_or_apu_element, eqn);
- MLX5_SET(cqc, cqc, uar_page, mdev->priv.bfreg.up->index);
+ MLX5_SET(cqc, cqc, uar_page, cq->uar->index);
MLX5_SET(cqc, cqc, log_page_size, cq->wq_ctrl.buf.page_shift -
MLX5_ADAPTER_PAGE_SHIFT);
MLX5_SET64(cqc, cqc, dbr_addr, cq->wq_ctrl.db.dma);
@@ -3589,7 +3592,8 @@ static int mlx5e_alloc_drop_cq(struct mlx5e_priv *priv,
param->wq.buf_numa_node = dev_to_node(mlx5_core_dma_dev(mdev));
param->wq.db_numa_node = dev_to_node(mlx5_core_dma_dev(mdev));
- return mlx5e_alloc_cq_common(priv->mdev, priv->netdev, priv->wq, param, cq);
+ return mlx5e_alloc_cq_common(priv->mdev, priv->netdev, priv->wq,
+ mdev->priv.bfreg.up, param, cq);
}
int mlx5e_open_drop_rq(struct mlx5e_priv *priv,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c b/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
index c4de6bf8d1b6..cb1319974f83 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.c
@@ -475,7 +475,6 @@ static int mlx5_fpga_conn_create_cq(struct mlx5_fpga_conn *conn, int cq_size)
*conn->cq.mcq.arm_db = 0;
conn->cq.mcq.vector = 0;
conn->cq.mcq.comp = mlx5_fpga_conn_cq_complete;
- conn->cq.mcq.uar = fdev->conn_res.uar;
tasklet_setup(&conn->cq.tasklet, mlx5_fpga_conn_cq_tasklet);
mlx5_fpga_dbg(fdev, "Created CQ #0x%x\n", conn->cq.mcq.cqn);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_send.c
index 4fd4e8483382..077a77fde670 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_send.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/sws/dr_send.c
@@ -1131,7 +1131,6 @@ static struct mlx5dr_cq *dr_create_cq(struct mlx5_core_dev *mdev,
*cq->mcq.arm_db = cpu_to_be32(2 << 28);
cq->mcq.vector = 0;
- cq->mcq.uar = uar;
cq->mdev = mdev;
return cq;
diff --git a/include/linux/mlx5/cq.h b/include/linux/mlx5/cq.h
index 991526039ccb..7ef2c7c7d803 100644
--- a/include/linux/mlx5/cq.h
+++ b/include/linux/mlx5/cq.h
@@ -41,7 +41,6 @@ struct mlx5_core_cq {
int cqe_sz;
__be32 *set_ci_db;
__be32 *arm_db;
- struct mlx5_uars_page *uar;
refcount_t refcount;
struct completion free;
unsigned vector;
--
2.31.1
* [PATCH net-next V2 07/10] net/mlx5e: Use multiple TX doorbells
2025-09-16 14:11 [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells Tariq Toukan
` (5 preceding siblings ...)
2025-09-16 14:11 ` [PATCH net-next V2 06/10] net/mlx5e: Prepare for using different CQ doorbells Tariq Toukan
@ 2025-09-16 14:11 ` Tariq Toukan
2025-09-17 12:55 ` Simon Horman
2025-09-16 14:11 ` [PATCH net-next V2 08/10] net/mlx5e: Use multiple CQ doorbells Tariq Toukan
` (3 subsequent siblings)
10 siblings, 1 reply; 22+ messages in thread
From: Tariq Toukan @ 2025-09-16 14:11 UTC (permalink / raw)
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Jiri Pirko, Jonathan Corbet, Leon Romanovsky, Jason Gunthorpe,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
From: Cosmin Ratiu <cratiu@nvidia.com>
First, allocate more doorbells in mlx5e_create_mdev_resources:
- one doorbell remains 'global' and will be used by all non-channel
associated SQs (e.g. ASO, HWS, PTP, ...).
- allocate additional 'num_doorbells' doorbells. This defaults to the
minimum of 8 and the maximum number of channels.
mlx5e_channel_pick_doorbell() now spreads out channel SQs across
available doorbells.
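The sizing and selection policy described above can be sketched in isolation
as follows. This is a simplified userspace model, not the driver code: the
'toy_*' names are hypothetical, and the doorbells are plain array entries
instead of objects returned by mlx5_alloc_bfreg().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DEFAULT_NUM_DOORBELLS 8u

/* Hypothetical stand-in for struct mlx5_sq_bfreg. */
struct toy_bfreg {
	unsigned int index;
};

/* Hypothetical stand-in for the Ethernet resources holding the doorbells. */
struct toy_hw_objs {
	struct toy_bfreg global_bfreg; /* used by non-channel SQs */
	struct toy_bfreg *bfregs;      /* channel doorbells */
	unsigned int num_bfregs;       /* how many were actually allocated */
};

/* Default count: the minimum of 8 and the maximum number of channels. */
static unsigned int toy_num_doorbells(unsigned int max_channels)
{
	return max_channels < TOY_DEFAULT_NUM_DOORBELLS ?
	       max_channels : TOY_DEFAULT_NUM_DOORBELLS;
}

/* Round-robin by channel vector index, falling back to the global doorbell. */
static struct toy_bfreg *toy_pick_doorbell(struct toy_hw_objs *res,
					   unsigned int vec_ix)
{
	assert(res != NULL);
	if (res->num_bfregs == 0)
		return &res->global_bfreg;
	return &res->bfregs[vec_ix % res->num_bfregs];
}
```

The fallback branch matters because allocation may stop early: if no channel
doorbell could be allocated at all, every channel keeps using the global one.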
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
.../ethernet/mellanox/mlx5/core/en_common.c | 29 ++++++++++++++++++-
.../net/ethernet/mellanox/mlx5/core/en_main.c | 11 ++++++-
include/linux/mlx5/driver.h | 4 +++
3 files changed, 42 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index e9e36358c39d..d13cebbc763a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -143,6 +143,7 @@ static int mlx5e_create_tises(struct mlx5_core_dev *mdev, u32 tisn[MLX5_MAX_PORT
int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev, bool create_tises)
{
struct mlx5e_hw_objs *res = &mdev->mlx5e_res.hw_objs;
+ unsigned int num_doorbells, i;
int err;
err = mlx5_core_alloc_pd(mdev, &res->pdn);
@@ -163,11 +164,30 @@ int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev, bool create_tises)
goto err_dealloc_transport_domain;
}
+ num_doorbells = min(MLX5_DEFAULT_NUM_DOORBELLS,
+ mlx5e_get_max_num_channels(mdev));
+ res->bfregs = kcalloc(num_doorbells, sizeof(*res->bfregs), GFP_KERNEL);
+ if (!res->bfregs) {
+ err = -ENOMEM;
+ goto err_destroy_mkey;
+ }
+
+ for (i = 0; i < num_doorbells; i++) {
+ err = mlx5_alloc_bfreg(mdev, res->bfregs + i, false, false);
+ if (err) {
+ mlx5_core_warn(mdev,
+ "could only allocate %d/%d doorbells, err %d.\n",
+ i, num_doorbells, err);
+ break;
+ }
+ }
+ res->num_bfregs = i;
+
if (create_tises) {
err = mlx5e_create_tises(mdev, res->tisn);
if (err) {
mlx5_core_err(mdev, "alloc tises failed, %d\n", err);
- goto err_destroy_mkey;
+ goto err_destroy_bfregs;
}
res->tisn_valid = true;
}
@@ -184,6 +204,10 @@ int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev, bool create_tises)
return 0;
+err_destroy_bfregs:
+ for (i = 0; i < res->num_bfregs; i++)
+ mlx5_free_bfreg(mdev, res->bfregs + i);
+ kfree(res->bfregs);
err_destroy_mkey:
mlx5_core_destroy_mkey(mdev, res->mkey);
err_dealloc_transport_domain:
@@ -201,6 +225,9 @@ void mlx5e_destroy_mdev_resources(struct mlx5_core_dev *mdev)
mdev->mlx5e_res.dek_priv = NULL;
if (res->tisn_valid)
mlx5e_destroy_tises(mdev, res->tisn);
+ for (unsigned int i = 0; i < res->num_bfregs; i++)
+ mlx5_free_bfreg(mdev, res->bfregs + i);
+ kfree(res->bfregs);
mlx5_core_destroy_mkey(mdev, res->mkey);
mlx5_core_dealloc_transport_domain(mdev, res->td.tdn);
mlx5_core_dealloc_pd(mdev, res->pdn);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index ef7598e048b2..4dee4c6d048d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2748,7 +2748,16 @@ void mlx5e_trigger_napi_sched(struct napi_struct *napi)
static void mlx5e_channel_pick_doorbell(struct mlx5e_channel *c)
{
- c->bfreg = &c->mdev->priv.bfreg;
+ struct mlx5e_hw_objs *hw_objs = &c->mdev->mlx5e_res.hw_objs;
+
+ /* No dedicated Ethernet doorbells, use the global one. */
+ if (hw_objs->num_bfregs == 0) {
+ c->bfreg = &c->mdev->priv.bfreg;
+ return;
+ }
+
+ /* Round-robin between doorbells. */
+ c->bfreg = hw_objs->bfregs + c->vec_ix % hw_objs->num_bfregs;
}
static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 15c434fedff7..99b34e4809ae 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -658,6 +658,8 @@ struct mlx5e_resources {
u32 pdn;
struct mlx5_td td;
u32 mkey;
+ struct mlx5_sq_bfreg *bfregs;
+ unsigned int num_bfregs;
#define MLX5_MAX_NUM_TC 8
u32 tisn[MLX5_MAX_PORTS][MLX5_MAX_NUM_TC];
bool tisn_valid;
@@ -801,6 +803,8 @@ struct mlx5_db {
int index;
};
+#define MLX5_DEFAULT_NUM_DOORBELLS 8
+
enum {
MLX5_COMP_EQ_SIZE = 1024,
};
--
2.31.1
* [PATCH net-next V2 08/10] net/mlx5e: Use multiple CQ doorbells
2025-09-16 14:11 [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells Tariq Toukan
` (6 preceding siblings ...)
2025-09-16 14:11 ` [PATCH net-next V2 07/10] net/mlx5e: Use multiple TX doorbells Tariq Toukan
@ 2025-09-16 14:11 ` Tariq Toukan
2025-09-17 12:55 ` Simon Horman
2025-09-16 14:11 ` [PATCH net-next V2 09/10] devlink: Add a 'num_doorbells' driverinit param Tariq Toukan
` (2 subsequent siblings)
10 siblings, 1 reply; 22+ messages in thread
From: Tariq Toukan @ 2025-09-16 14:11 UTC (permalink / raw)
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Jiri Pirko, Jonathan Corbet, Leon Romanovsky, Jason Gunthorpe,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
From: Cosmin Ratiu <cratiu@nvidia.com>
Channel doorbells are now also used by all channel CQs.
A new 'uar' parameter is added to 'struct mlx5e_create_cq_param',
which is then used in mlx5e_alloc_cq.
A single UAR page has two TX doorbells and a single CQ doorbell, so
every consecutive pair of 'struct mlx5_sq_bfreg' (TX doorbells)
uses the same underlying 'struct mlx5_uars_page' (CQ doorbell).
So by using c->bfreg->up, CQs from every consecutive channel pair will
share the same CQ doorbell.
Non-channel associated CQs keep using the global CQ doorbell.
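To make the sharing pattern concrete, the sketch below computes which CQ
doorbell page a channel ends up on. It assumes, as the description states,
that each consecutive pair of TX doorbells sits on one UAR page, and it
assumes the round-robin channel-to-doorbell mapping from the previous patch.
The 'toy_*' helpers and the slot/page numbering are hypothetical and exist
only for this illustration.

```c
#include <assert.h>

/* Per the description: one UAR page carries two TX doorbells and one CQ doorbell. */
#define TOY_BFREGS_PER_UAR 2u

/* Channel -> TX doorbell slot, using the round-robin spreading. */
static unsigned int toy_tx_doorbell_slot(unsigned int vec_ix,
					 unsigned int num_bfregs)
{
	assert(num_bfregs > 0);
	return vec_ix % num_bfregs;
}

/* TX doorbell slot -> UAR page, i.e. the CQ doorbell that channel's CQs use. */
static unsigned int toy_cq_doorbell_page(unsigned int bfreg_slot)
{
	return bfreg_slot / TOY_BFREGS_PER_UAR;
}
```

With 8 TX doorbells this yields 4 distinct CQ doorbells: channels 0 and 1 land
on page 0, channels 2 and 3 on page 1, and channel 8 wraps back to page 0.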
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c | 2 ++
drivers/net/ethernet/mellanox/mlx5/core/en/trap.c | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 +-
5 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 1cbe3f3037bb..f1aa2b2ce10b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -1062,6 +1062,7 @@ struct mlx5e_create_cq_param {
struct mlx5e_ch_stats *ch_stats;
int node;
int ix;
+ struct mlx5_uars_page *uar;
};
struct mlx5e_cq_param;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index b6b4ae7c59fa..596440c8c364 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -611,6 +611,7 @@ void mlx5e_build_create_cq_param(struct mlx5e_create_cq_param *ccp, struct mlx5e
.ch_stats = c->stats,
.node = cpu_to_node(c->cpu),
.ix = c->vec_ix,
+ .uar = c->bfreg->up,
};
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
index a392578a063c..c93ee969ea64 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
@@ -578,6 +578,7 @@ static int mlx5e_ptp_open_tx_cqs(struct mlx5e_ptp *c,
ccp.ch_stats = c->stats;
ccp.napi = &c->napi;
ccp.ix = MLX5E_PTP_CHANNEL_IX;
+ ccp.uar = c->bfreg->up;
cq_param = &cparams->txq_sq_param.cqp;
@@ -627,6 +628,7 @@ static int mlx5e_ptp_open_rx_cq(struct mlx5e_ptp *c,
ccp.ch_stats = c->stats;
ccp.napi = &c->napi;
ccp.ix = MLX5E_PTP_CHANNEL_IX;
+ ccp.uar = c->bfreg->up;
cq_param = &cparams->rq_param.cqp;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
index b5c19396e096..996fcdb5a29d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/trap.c
@@ -76,6 +76,7 @@ static int mlx5e_open_trap_rq(struct mlx5e_priv *priv, struct mlx5e_trap *t)
ccp.ch_stats = t->stats;
ccp.napi = &t->napi;
ccp.ix = 0;
+ ccp.uar = mdev->priv.bfreg.up;
err = mlx5e_open_cq(priv->mdev, trap_moder, &rq_param->cqp, &ccp, &rq->cq);
if (err)
return err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 4dee4c6d048d..c22dcae9612e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2234,7 +2234,7 @@ static int mlx5e_alloc_cq(struct mlx5_core_dev *mdev,
param->eq_ix = ccp->ix;
err = mlx5e_alloc_cq_common(mdev, ccp->netdev, ccp->wq,
- mdev->priv.bfreg.up, param, cq);
+ ccp->uar, param, cq);
cq->napi = ccp->napi;
cq->ch_stats = ccp->ch_stats;
--
2.31.1
* [PATCH net-next V2 09/10] devlink: Add a 'num_doorbells' driverinit param
2025-09-16 14:11 [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells Tariq Toukan
` (7 preceding siblings ...)
2025-09-16 14:11 ` [PATCH net-next V2 08/10] net/mlx5e: Use multiple CQ doorbells Tariq Toukan
@ 2025-09-16 14:11 ` Tariq Toukan
2025-09-17 12:56 ` Simon Horman
2025-09-16 14:11 ` [PATCH net-next V2 10/10] net/mlx5e: Use the 'num_doorbells' devlink param Tariq Toukan
2025-09-18 1:40 ` [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells patchwork-bot+netdevbpf
10 siblings, 1 reply; 22+ messages in thread
From: Tariq Toukan @ 2025-09-16 14:11 UTC (permalink / raw)
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Jiri Pirko, Jonathan Corbet, Leon Romanovsky, Jason Gunthorpe,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
From: Cosmin Ratiu <cratiu@nvidia.com>
Add a generic 'num_doorbells' devlink parameter. Drivers can use it to
let users configure the number of doorbells a device uses.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
Documentation/networking/devlink/devlink-params.rst | 3 +++
include/net/devlink.h | 4 ++++
net/devlink/param.c | 5 +++++
3 files changed, 12 insertions(+)
diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
index c51da4fba7e7..0a9c20d70122 100644
--- a/Documentation/networking/devlink/devlink-params.rst
+++ b/Documentation/networking/devlink/devlink-params.rst
@@ -148,3 +148,6 @@ own name.
- The max number of Virtual Functions (VFs) exposed by the PF.
after reboot/pci reset, 'sriov_totalvfs' entry under the device's sysfs
directory will report this value.
+ * - ``num_doorbells``
+ - u32
+ - Controls the number of doorbells used by the device.
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 8d4362f010e4..9e824f61e40f 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -531,6 +531,7 @@ enum devlink_param_generic_id {
DEVLINK_PARAM_GENERIC_ID_ENABLE_PHC,
DEVLINK_PARAM_GENERIC_ID_CLOCK_ID,
DEVLINK_PARAM_GENERIC_ID_TOTAL_VFS,
+ DEVLINK_PARAM_GENERIC_ID_NUM_DOORBELLS,
/* add new param generic ids above here*/
__DEVLINK_PARAM_GENERIC_ID_MAX,
@@ -598,6 +599,9 @@ enum devlink_param_generic_id {
#define DEVLINK_PARAM_GENERIC_TOTAL_VFS_NAME "total_vfs"
#define DEVLINK_PARAM_GENERIC_TOTAL_VFS_TYPE DEVLINK_PARAM_TYPE_U32
+#define DEVLINK_PARAM_GENERIC_NUM_DOORBELLS_NAME "num_doorbells"
+#define DEVLINK_PARAM_GENERIC_NUM_DOORBELLS_TYPE DEVLINK_PARAM_TYPE_U32
+
#define DEVLINK_PARAM_GENERIC(_id, _cmodes, _get, _set, _validate) \
{ \
.id = DEVLINK_PARAM_GENERIC_ID_##_id, \
diff --git a/net/devlink/param.c b/net/devlink/param.c
index 33134940c266..70e69523412c 100644
--- a/net/devlink/param.c
+++ b/net/devlink/param.c
@@ -107,6 +107,11 @@ static const struct devlink_param devlink_param_generic[] = {
.name = DEVLINK_PARAM_GENERIC_TOTAL_VFS_NAME,
.type = DEVLINK_PARAM_GENERIC_TOTAL_VFS_TYPE,
},
+ {
+ .id = DEVLINK_PARAM_GENERIC_ID_NUM_DOORBELLS,
+ .name = DEVLINK_PARAM_GENERIC_NUM_DOORBELLS_NAME,
+ .type = DEVLINK_PARAM_GENERIC_NUM_DOORBELLS_TYPE,
+ },
};
static int devlink_param_generic_verify(const struct devlink_param *param)
--
2.31.1
* [PATCH net-next V2 10/10] net/mlx5e: Use the 'num_doorbells' devlink param
2025-09-16 14:11 [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells Tariq Toukan
` (8 preceding siblings ...)
2025-09-16 14:11 ` [PATCH net-next V2 09/10] devlink: Add a 'num_doorbells' driverinit param Tariq Toukan
@ 2025-09-16 14:11 ` Tariq Toukan
2025-09-17 12:56 ` Simon Horman
2025-09-18 1:40 ` [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells patchwork-bot+netdevbpf
10 siblings, 1 reply; 22+ messages in thread
From: Tariq Toukan @ 2025-09-16 14:11 UTC (permalink / raw)
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller
Cc: Jiri Pirko, Jonathan Corbet, Leon Romanovsky, Jason Gunthorpe,
Saeed Mahameed, Tariq Toukan, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
From: Cosmin Ratiu <cratiu@nvidia.com>
Use the new devlink param to control how many doorbells mlx5e devices
allocate and use. The maximum number of doorbells configurable is capped
to the maximum number of channels. This only applies to the Ethernet
part; the RDMA devices using mlx5 manage their own doorbells.
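The capping rule can be modelled outside the kernel as below. This is a
minimal sketch under stated assumptions: the 'toy_*' functions are
hypothetical, the devlink and netlink plumbing is omitted, and the treatment
of 0 follows the documentation added by this patch (no channel-specific
doorbells, the global one is used for everything).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Reject requests above the maximum number of channels; 0 is allowed. */
static int toy_num_doorbells_validate(uint32_t requested,
				      uint32_t max_num_channels)
{
	if (requested > max_num_channels)
		return -EINVAL;
	return 0;
}

/*
 * Number of channel doorbells to allocate for an already validated request.
 * A result of 0 means channels share the global doorbell.
 */
static uint32_t toy_channel_doorbells(uint32_t requested,
				      uint32_t max_num_channels)
{
	assert(toy_num_doorbells_validate(requested, max_num_channels) == 0);
	return requested;
}
```

Validating at set time means a bad value is refused immediately instead of
being discovered when the driver is reinitialized.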
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
Documentation/networking/devlink/mlx5.rst | 9 +++++++
.../net/ethernet/mellanox/mlx5/core/devlink.c | 26 +++++++++++++++++++
.../ethernet/mellanox/mlx5/core/en_common.c | 15 ++++++++++-
3 files changed, 49 insertions(+), 1 deletion(-)
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
index 60cc9fedf1ef..41c9b716699e 100644
--- a/Documentation/networking/devlink/mlx5.rst
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -62,6 +62,15 @@ Note: permanent parameters such as ``enable_sriov`` and ``total_vfs`` require FW
echo 1 >/sys/bus/pci/rescan
grep ^ /sys/bus/pci/devices/0000:01:00.0/sriov_*
+ * - ``num_doorbells``
+ - driverinit
+ - This controls the number of channel doorbells used by the netdev. In all
+ cases, an additional doorbell is allocated and used for non-channel
+ communication (e.g. for PTP, HWS, etc.). Supported values are:
+
+ - 0: No channel-specific doorbells, use the global one for everything.
+ - [1, max_num_channels]: Spread netdev channels equally across these
+ doorbells.
The ``mlx5`` driver also implements the following driver-specific
parameters.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
index a0b68321355a..bd4cb8861218 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c
@@ -535,6 +535,25 @@ mlx5_devlink_hairpin_queue_size_validate(struct devlink *devlink, u32 id,
return 0;
}
+static int mlx5_devlink_num_doorbells_validate(struct devlink *devlink, u32 id,
+ union devlink_param_value val,
+ struct netlink_ext_ack *extack)
+{
+ struct mlx5_core_dev *mdev = devlink_priv(devlink);
+ u32 val32 = val.vu32;
+ u32 max_num_channels;
+
+ max_num_channels = mlx5e_get_max_num_channels(mdev);
+ if (val32 > max_num_channels) {
+ NL_SET_ERR_MSG_FMT_MOD(extack,
+ "Requested num_doorbells (%u) exceeds maximum number of channels (%u)",
+ val32, max_num_channels);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static void mlx5_devlink_hairpin_params_init_values(struct devlink *devlink)
{
struct mlx5_core_dev *dev = devlink_priv(devlink);
@@ -614,6 +633,9 @@ static const struct devlink_param mlx5_devlink_eth_params[] = {
"hairpin_queue_size", DEVLINK_PARAM_TYPE_U32,
BIT(DEVLINK_PARAM_CMODE_DRIVERINIT), NULL, NULL,
mlx5_devlink_hairpin_queue_size_validate),
+ DEVLINK_PARAM_GENERIC(NUM_DOORBELLS,
+ BIT(DEVLINK_PARAM_CMODE_DRIVERINIT), NULL, NULL,
+ mlx5_devlink_num_doorbells_validate),
};
static int mlx5_devlink_eth_params_register(struct devlink *devlink)
@@ -637,6 +659,10 @@ static int mlx5_devlink_eth_params_register(struct devlink *devlink)
mlx5_devlink_hairpin_params_init_values(devlink);
+ value.vu32 = MLX5_DEFAULT_NUM_DOORBELLS;
+ devl_param_driverinit_value_set(devlink,
+ DEVLINK_PARAM_GENERIC_ID_NUM_DOORBELLS,
+ value);
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index d13cebbc763a..96b744ceaf13 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -30,6 +30,7 @@
* SOFTWARE.
*/
+#include "devlink.h"
#include "en.h"
#include "lib/crypto.h"
@@ -140,6 +141,18 @@ static int mlx5e_create_tises(struct mlx5_core_dev *mdev, u32 tisn[MLX5_MAX_PORT
return err;
}
+static unsigned int
+mlx5e_get_devlink_param_num_doorbells(struct mlx5_core_dev *dev)
+{
+ const u32 param_id = DEVLINK_PARAM_GENERIC_ID_NUM_DOORBELLS;
+ struct devlink *devlink = priv_to_devlink(dev);
+ union devlink_param_value val;
+ int err;
+
+ err = devl_param_driverinit_value_get(devlink, param_id, &val);
+ return err ? MLX5_DEFAULT_NUM_DOORBELLS : val.vu32;
+}
+
int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev, bool create_tises)
{
struct mlx5e_hw_objs *res = &mdev->mlx5e_res.hw_objs;
@@ -164,7 +177,7 @@ int mlx5e_create_mdev_resources(struct mlx5_core_dev *mdev, bool create_tises)
goto err_dealloc_transport_domain;
}
- num_doorbells = min(MLX5_DEFAULT_NUM_DOORBELLS,
+ num_doorbells = min(mlx5e_get_devlink_param_num_doorbells(mdev),
mlx5e_get_max_num_channels(mdev));
res->bfregs = kcalloc(num_doorbells, sizeof(*res->bfregs), GFP_KERNEL);
if (!res->bfregs) {
--
2.31.1
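
For readers following the thread, the two checks in this patch can be sketched in plain user-space C. This is only an illustration, not the driver code: the sketch_* names are hypothetical, and the default of 8 is inferred from the patch 07 commit message ("minimum between 8 and max number of channels") rather than from the definition of MLX5_DEFAULT_NUM_DOORBELLS, which is not shown here.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for MLX5_DEFAULT_NUM_DOORBELLS; 8 is an assumption
 * taken from the wording of the patch 07 commit message. */
#define SKETCH_DEFAULT_NUM_DOORBELLS 8u

/* Mirrors the devlink validate callback: a request larger than the
 * maximum number of channels is rejected with -EINVAL. */
static int sketch_validate_num_doorbells(uint32_t requested,
					 uint32_t max_channels)
{
	return requested > max_channels ? -EINVAL : 0;
}

/* Mirrors the read path: if the driverinit value cannot be read
 * (lookup_err != 0), fall back to the default, then clamp the result
 * to the maximum number of channels. */
static uint32_t sketch_effective_num_doorbells(int lookup_err, uint32_t param,
					       uint32_t max_channels)
{
	uint32_t n = lookup_err ? SKETCH_DEFAULT_NUM_DOORBELLS : param;

	return n < max_channels ? n : max_channels;
}
```

The clamp in the read path matters even though the validate callback already rejects oversized requests, because the fallback default may itself exceed the channel limit on small configurations.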
* Re: [PATCH net-next V2 01/10] net/mlx5: Fix typo of MLX5_EQ_DOORBEL_OFFSET
2025-09-16 14:11 ` [PATCH net-next V2 01/10] net/mlx5: Fix typo of MLX5_EQ_DOORBEL_OFFSET Tariq Toukan
@ 2025-09-17 12:53 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:53 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:35PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> Also convert it to a simple define.
>
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net-next V2 02/10] net/mlx5: Remove unused 'offset' field from mlx5_sq_bfreg
2025-09-16 14:11 ` [PATCH net-next V2 02/10] net/mlx5: Remove unused 'offset' field from mlx5_sq_bfreg Tariq Toukan
@ 2025-09-17 12:53 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:53 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:36PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> The 'offset' field was introduced in the original commit [1] and never
> used until commit [2], which added an unnecessary use.
>
> Remove the field and refactor the write-combining test to use a local
> variable instead.
>
> [1] commit a6d51b68611e ("net/mlx5: Introduce blue flame register
> allocator")
> [2] commit d98995b4bf98 ("net/mlx5: Reimplement write combining test")
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net-next V2 03/10] net/mlx5e: Remove unused 'xsk' param of mlx5e_build_xdpsq_param
2025-09-16 14:11 ` [PATCH net-next V2 03/10] net/mlx5e: Remove unused 'xsk' param of mlx5e_build_xdpsq_param Tariq Toukan
@ 2025-09-17 12:53 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:53 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:37PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> This was added in commit [1], but its only use was removed in commit [2].
> The parameter is unused, so remove it from the function parameter list.
>
> [1] commit 9ded70fa1d81 ("net/mlx5e: Don't prefill WQEs in XDP SQ in the
> multi buffer mode")
> [2] commit 1a9304859b3a ("net/mlx5: XDP, Enable TX side XDP multi-buffer
> support")
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net-next V2 04/10] net/mlx5: Store the global doorbell in mlx5_priv
2025-09-16 14:11 ` [PATCH net-next V2 04/10] net/mlx5: Store the global doorbell in mlx5_priv Tariq Toukan
@ 2025-09-17 12:54 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:54 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:38PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> The global doorbell is used for more than just Ethernet resources, so
> move it out of mlx5e_hw_objs into a common place (mlx5_priv), to avoid
> non-Ethernet modules (e.g. HWS, ASO) depending on Ethernet structs.
>
> Use this opportunity to consolidate it with the 'uar' pointer already
> there, which was used as an RX doorbell. Underneath, the 'uar' pointer is
> identical to 'bfreg->up', so store a single resource and use that
> instead.
>
> For CQ doorbells, care is taken to always use bfreg->up->index instead
> of bfreg->index, which may refer to a subsequent UAR page from the same
> ALLOC_UAR batch on some NICs.
>
> This paves the way for cleanly supporting multiple doorbells in the
> Ethernet driver.
>
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net-next V2 05/10] net/mlx5e: Prepare for using multiple TX doorbells
2025-09-16 14:11 ` [PATCH net-next V2 05/10] net/mlx5e: Prepare for using multiple TX doorbells Tariq Toukan
@ 2025-09-17 12:54 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:54 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:39PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> The driver allocates a single doorbell per device and uses
> it for all Send Queues (SQs). This can become a bottleneck due to the
> high number of concurrent MMIO accesses when ringing the same doorbell
> from many channels.
>
> This patch makes the doorbells used by channel queues configurable.
>
> mlx5e_channel_pick_doorbell() is added to select the doorbell to be used
> for a given channel, picking the default for now.
>
> When opening a channel, the selected doorbell is saved to the channel
> struct and used whenever channel-related queues are created.
>
> Finally, 'uar_page' is added to 'struct mlx5e_create_sq_param' to
> control which doorbell to use when allocating an SQ, since that can
> happen outside channel context (e.g. for PTP).
>
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net-next V2 06/10] net/mlx5e: Prepare for using different CQ doorbells
2025-09-16 14:11 ` [PATCH net-next V2 06/10] net/mlx5e: Prepare for using different CQ doorbells Tariq Toukan
@ 2025-09-17 12:54 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:54 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:40PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> Completion queues (CQs) in mlx5 use the same global doorbell, which may
> become contended when accessed concurrently from many cores.
>
> This patch prepares the CQ management code for supporting different
> doorbells per CQ. This will be used in downstream patches to allow
> separate doorbells to be used by channel CQs.
>
> The main change is moving the 'uar' pointer from struct mlx5_core_cq to
> struct mlx5e_cq, as the uar page to be used is better off stored
> directly there. Other users of mlx5_core_cq also store the UAR to be
> used separately and therefore the pointer being removed is dead weight
> for them. As evidence, in this patch there are two users which set the
> mcq.uar pointer but didn't use it, Software Steering and old Innova CQ
> creation code. Instead, they rang the doorbell directly from another
> pointer.
>
> The 'uar' pointer added to struct mlx5e_cq remains in a hot cacheline
> (as before), because it may get accessed for each packet.
>
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net-next V2 07/10] net/mlx5e: Use multiple TX doorbells
2025-09-16 14:11 ` [PATCH net-next V2 07/10] net/mlx5e: Use multiple TX doorbells Tariq Toukan
@ 2025-09-17 12:55 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:55 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:41PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> First, allocate more doorbells in mlx5e_create_mdev_resources:
> - one doorbell remains 'global' and will be used by all non-channel
> associated SQs (e.g. ASO, HWS, PTP, ...).
> - allocate additional 'num_doorbells' doorbells. This defaults to the
> minimum of 8 and the maximum number of channels.
>
> mlx5e_channel_pick_doorbell() now spreads out channel SQs across
> available doorbells.
>
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
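
As an aside on the commit message quoted above, the default and the spreading of channels can be sketched in user-space C as follows. This is a rough model under stated assumptions, not the body of mlx5e_channel_pick_doorbell(): the round-robin (modulo) policy is assumed, the sketch_* names and the -1 sentinel for the global doorbell are hypothetical, and the zero case follows the num_doorbells documentation added in patch 10.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "use the global doorbell". */
#define SKETCH_GLOBAL_DOORBELL (-1)

/* Default count: the minimum of 8 and the maximum number of channels,
 * as stated in the commit message. */
static uint32_t sketch_default_num_doorbells(uint32_t max_channels)
{
	return max_channels < 8 ? max_channels : 8;
}

/* Assumed policy: round-robin by channel index. With zero channel
 * doorbells, every channel falls back to the global doorbell. */
static int sketch_pick_doorbell(uint32_t channel_ix, uint32_t num_doorbells)
{
	if (num_doorbells == 0)
		return SKETCH_GLOBAL_DOORBELL;

	return (int)(channel_ix % num_doorbells);
}
```

With 8 doorbells and this assumed policy, channels 0 and 8 would share doorbell 0, channels 1 and 9 doorbell 1, and so on, so each doorbell sees roughly 1/8 of the channel traffic.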
* Re: [PATCH net-next V2 08/10] net/mlx5e: Use multiple CQ doorbells
2025-09-16 14:11 ` [PATCH net-next V2 08/10] net/mlx5e: Use multiple CQ doorbells Tariq Toukan
@ 2025-09-17 12:55 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:55 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:42PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> Channel doorbells are now also used by all channel CQs.
>
> A new 'uar' parameter is added to 'struct mlx5e_create_cq_param',
> which is then used in mlx5e_alloc_cq.
>
> A single UAR page has two TX doorbells and a single CQ doorbell, so
> every consecutive pair of 'struct mlx5_sq_bfreg' (TX doorbells)
> uses the same underlying 'struct mlx5_uars_page' (CQ doorbell).
> So by using c->bfreg->up, CQs from every consecutive channel pair will
> share the same CQ doorbell.
>
> Non-channel associated CQs keep using the global CQ doorbell.
>
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
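
The pairing described in the quoted commit message (two TX doorbells and one CQ doorbell per UAR page) can be illustrated with a small user-space C sketch. It assumes the TX doorbells are numbered consecutively and that pairs start at even indices; the sketch_* names are hypothetical and do not appear in the driver.

```c
#include <stdint.h>

/* Per the commit message: each UAR page carries two TX doorbells
 * and a single CQ doorbell. */
#define SKETCH_TX_DOORBELLS_PER_UAR_PAGE 2u

/* Index of the UAR page (and hence the CQ doorbell) behind a given
 * TX doorbell, assuming consecutive, pair-aligned numbering. */
static uint32_t sketch_cq_doorbell_page(uint32_t tx_doorbell_ix)
{
	return tx_doorbell_ix / SKETCH_TX_DOORBELLS_PER_UAR_PAGE;
}

/* Number of distinct CQ doorbells backing num_tx TX doorbells,
 * rounded up so that an odd count still gets its own page. */
static uint32_t sketch_num_cq_doorbells(uint32_t num_tx)
{
	return (num_tx + SKETCH_TX_DOORBELLS_PER_UAR_PAGE - 1) /
	       SKETCH_TX_DOORBELLS_PER_UAR_PAGE;
}
```

Under these assumptions, 8 TX doorbells map onto 4 CQ doorbells, which is why CQs from every consecutive pair of channel doorbells end up sharing one CQ doorbell.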
* Re: [PATCH net-next V2 09/10] devlink: Add a 'num_doorbells' driverinit param
2025-09-16 14:11 ` [PATCH net-next V2 09/10] devlink: Add a 'num_doorbells' driverinit param Tariq Toukan
@ 2025-09-17 12:56 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:56 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:43PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> This parameter can be used by drivers to configure a different number of
> doorbells.
>
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Jiri Pirko <jiri@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Thank you for exposing this via devlink.
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net-next V2 10/10] net/mlx5e: Use the 'num_doorbells' devlink param
2025-09-16 14:11 ` [PATCH net-next V2 10/10] net/mlx5e: Use the 'num_doorbells' devlink param Tariq Toukan
@ 2025-09-17 12:56 ` Simon Horman
0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2025-09-17 12:56 UTC (permalink / raw)
To: Tariq Toukan
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
David S. Miller, Jiri Pirko, Jonathan Corbet, Leon Romanovsky,
Jason Gunthorpe, Saeed Mahameed, Mark Bloch, Alexei Starovoitov,
Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, Gal Pressman,
Cosmin Ratiu, Dragos Tatulea, Jiri Pirko, Jason Gunthorpe
On Tue, Sep 16, 2025 at 05:11:44PM +0300, Tariq Toukan wrote:
> From: Cosmin Ratiu <cratiu@nvidia.com>
>
> Use the new devlink param to control how many doorbells mlx5e devices
> allocate and use. The maximum number of doorbells configurable is capped
> to the maximum number of channels. This only applies to the Ethernet
> part; the RDMA devices using mlx5 manage their own doorbells.
>
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
* Re: [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells
2025-09-16 14:11 [PATCH net-next V2 00/10] net/mlx5e: Use multiple doorbells Tariq Toukan
` (9 preceding siblings ...)
2025-09-16 14:11 ` [PATCH net-next V2 10/10] net/mlx5e: Use the 'num_doorbells' devlink param Tariq Toukan
@ 2025-09-18 1:40 ` patchwork-bot+netdevbpf
10 siblings, 0 replies; 22+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-09-18 1:40 UTC (permalink / raw)
To: Tariq Toukan
Cc: edumazet, kuba, pabeni, andrew+netdev, davem, jiri, corbet, leon,
jgg, saeedm, mbloch, ast, daniel, hawk, john.fastabend, netdev,
linux-doc, linux-kernel, linux-rdma, bpf, gal, cratiu, dtatulea,
jiri, jgg
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Tue, 16 Sep 2025 17:11:34 +0300 you wrote:
> Hi,
>
> This series by Cosmin adds multiple doorbells usage in mlx5e driver.
> See detailed description by Cosmin below [1].
>
> Find V1 here:
> https://lore.kernel.org/all/1757499891-596641-1-git-send-email-tariqt@nvidia.com/
>
> [...]
Here is the summary with links:
- [net-next,V2,01/10] net/mlx5: Fix typo of MLX5_EQ_DOORBEL_OFFSET
https://git.kernel.org/netdev/net-next/c/917449e7c3cd
- [net-next,V2,02/10] net/mlx5: Remove unused 'offset' field from mlx5_sq_bfreg
https://git.kernel.org/netdev/net-next/c/05dfe654b593
- [net-next,V2,03/10] net/mlx5e: Remove unused 'xsk' param of mlx5e_build_xdpsq_param
https://git.kernel.org/netdev/net-next/c/913d28f8a71c
- [net-next,V2,04/10] net/mlx5: Store the global doorbell in mlx5_priv
https://git.kernel.org/netdev/net-next/c/aa4595d0ada6
- [net-next,V2,05/10] net/mlx5e: Prepare for using multiple TX doorbells
https://git.kernel.org/netdev/net-next/c/673d7ab7563e
- [net-next,V2,06/10] net/mlx5e: Prepare for using different CQ doorbells
https://git.kernel.org/netdev/net-next/c/a315b723e87b
- [net-next,V2,07/10] net/mlx5e: Use multiple TX doorbells
https://git.kernel.org/netdev/net-next/c/71fb4832d50b
- [net-next,V2,08/10] net/mlx5e: Use multiple CQ doorbells
https://git.kernel.org/netdev/net-next/c/325db9c6f69b
- [net-next,V2,09/10] devlink: Add a 'num_doorbells' driverinit param
https://git.kernel.org/netdev/net-next/c/6bdcb735fec6
- [net-next,V2,10/10] net/mlx5e: Use the 'num_doorbells' devlink param
https://git.kernel.org/netdev/net-next/c/11bbcfb7668c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html