* [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes
@ 2022-12-28 12:56 Leon Romanovsky
2022-12-28 12:56 ` [PATCH RESEND rdma-next 1/2] RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device Leon Romanovsky
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Leon Romanovsky @ 2022-12-28 12:56 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, linux-rdma, Maor Gottlieb, Patrisious Haddad,
Shay Drory
From: Leon Romanovsky <leonro@nvidia.com>
Hi,
This was already posted to ML, but too late to be included in last pull
request to Linus, so simply resending them.
Thanks
Maor Gottlieb (1):
RDMA/mlx5: Fix validation of max_rd_atomic caps for DC
Shay Drory (1):
RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device
drivers/infiniband/hw/mlx5/counters.c | 6 ++--
drivers/infiniband/hw/mlx5/qp.c | 49 +++++++++++++++++++--------
2 files changed, 38 insertions(+), 17 deletions(-)
--
2.38.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH RESEND rdma-next 1/2] RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device
2022-12-28 12:56 [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes Leon Romanovsky
@ 2022-12-28 12:56 ` Leon Romanovsky
2022-12-28 12:56 ` [PATCH RESEND rdma-next 2/2] RDMA/mlx5: Fix validation of max_rd_atomic caps for DC Leon Romanovsky
2023-01-01 8:59 ` [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes Leon Romanovsky
2 siblings, 0 replies; 4+ messages in thread
From: Leon Romanovsky @ 2022-12-28 12:56 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Shay Drory, linux-rdma, Patrisious Haddad
From: Shay Drory <shayd@nvidia.com>
Currently, when mlx5_ib_get_hw_stats() is used for device (port_num = 0),
there is a special handling in order to use the correct counters, but,
port_num is being passed down the stack without any change.
Also, some functions assume that port_num >=1. As a result, the
following oops can occur.
BUG: unable to handle page fault for address: ffff89510294f1a8
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP
CPU: 8 PID: 1382 Comm: devlink Tainted: G W 6.1.0-rc4_for_upstream_base_2022_11_10_16_12 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:_raw_spin_lock+0xc/0x20
Call Trace:
<TASK>
mlx5_ib_get_native_port_mdev+0x73/0xe0 [mlx5_ib]
do_get_hw_stats.constprop.0+0x109/0x160 [mlx5_ib]
mlx5_ib_get_hw_stats+0xad/0x180 [mlx5_ib]
ib_setup_device_attrs+0xf0/0x290 [ib_core]
ib_register_device+0x3bb/0x510 [ib_core]
? atomic_notifier_chain_register+0x67/0x80
__mlx5_ib_add+0x2b/0x80 [mlx5_ib]
mlx5r_probe+0xb8/0x150 [mlx5_ib]
? auxiliary_match_id+0x6a/0x90
auxiliary_bus_probe+0x3c/0x70
? driver_sysfs_add+0x6b/0x90
really_probe+0xcd/0x380
__driver_probe_device+0x80/0x170
driver_probe_device+0x1e/0x90
__device_attach_driver+0x7d/0x100
? driver_allows_async_probing+0x60/0x60
? driver_allows_async_probing+0x60/0x60
bus_for_each_drv+0x7b/0xc0
__device_attach+0xbc/0x200
bus_probe_device+0x87/0xa0
device_add+0x404/0x940
? dev_set_name+0x53/0x70
__auxiliary_device_add+0x43/0x60
add_adev+0x99/0xe0 [mlx5_core]
mlx5_attach_device+0xc8/0x120 [mlx5_core]
mlx5_load_one_devl_locked+0xb2/0xe0 [mlx5_core]
devlink_reload+0x133/0x250
devlink_nl_cmd_reload+0x480/0x570
? devlink_nl_pre_doit+0x44/0x2b0
genl_family_rcv_msg_doit.isra.0+0xc2/0x110
genl_rcv_msg+0x180/0x2b0
? devlink_nl_cmd_region_read_dumpit+0x540/0x540
? devlink_reload+0x250/0x250
? devlink_put+0x50/0x50
? genl_family_rcv_msg_doit.isra.0+0x110/0x110
netlink_rcv_skb+0x54/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x1f6/0x2c0
netlink_sendmsg+0x237/0x490
sock_sendmsg+0x33/0x40
__sys_sendto+0x103/0x160
? handle_mm_fault+0x10e/0x290
? do_user_addr_fault+0x1c0/0x5f0
__x64_sys_sendto+0x25/0x30
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Fix it by setting port_num to 1 in order to get device status and
remove unused variable.
Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://lore.kernel.org/r/ab402f83f04f4a41ffb177583609909c86cef52a.1670749789.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
drivers/infiniband/hw/mlx5/counters.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/counters.c b/drivers/infiniband/hw/mlx5/counters.c
index 945758f39523..3e1272695d99 100644
--- a/drivers/infiniband/hw/mlx5/counters.c
+++ b/drivers/infiniband/hw/mlx5/counters.c
@@ -278,7 +278,6 @@ static int do_get_hw_stats(struct ib_device *ibdev,
const struct mlx5_ib_counters *cnts = get_counters(dev, port_num - 1);
struct mlx5_core_dev *mdev;
int ret, num_counters;
- u32 mdev_port_num;
if (!stats)
return -EINVAL;
@@ -299,8 +298,9 @@ static int do_get_hw_stats(struct ib_device *ibdev,
}
if (MLX5_CAP_GEN(dev->mdev, cc_query_allowed)) {
- mdev = mlx5_ib_get_native_port_mdev(dev, port_num,
- &mdev_port_num);
+ if (!port_num)
+ port_num = 1;
+ mdev = mlx5_ib_get_native_port_mdev(dev, port_num, NULL);
if (!mdev) {
/* If port is not affiliated yet, its in down state
* which doesn't have any counters yet, so it would be
--
2.38.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH RESEND rdma-next 2/2] RDMA/mlx5: Fix validation of max_rd_atomic caps for DC
2022-12-28 12:56 [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes Leon Romanovsky
2022-12-28 12:56 ` [PATCH RESEND rdma-next 1/2] RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device Leon Romanovsky
@ 2022-12-28 12:56 ` Leon Romanovsky
2023-01-01 8:59 ` [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes Leon Romanovsky
2 siblings, 0 replies; 4+ messages in thread
From: Leon Romanovsky @ 2022-12-28 12:56 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Maor Gottlieb, linux-rdma, Patrisious Haddad, Shay Drory
From: Maor Gottlieb <maorg@nvidia.com>
Currently, when modifying DC, we validate max_rd_atomic
user attribute against the RC cap, fix it to validate against DC.
Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Link: https://lore.kernel.org/r/193aa04bce4609df7d86250da3e2886f26f266cf.1670749789.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
drivers/infiniband/hw/mlx5/qp.c | 49 +++++++++++++++++++++++----------
1 file changed, 35 insertions(+), 14 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 40d9410ec303..cf953d23d18d 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4502,6 +4502,40 @@ static bool mlx5_ib_modify_qp_allowed(struct mlx5_ib_dev *dev,
return false;
}
+static int validate_rd_atomic(struct mlx5_ib_dev *dev, struct ib_qp_attr *attr,
+ int attr_mask, enum ib_qp_type qp_type)
+{
+ int log_max_ra_res;
+ int log_max_ra_req;
+
+ if (qp_type == MLX5_IB_QPT_DCI) {
+ log_max_ra_res = 1 << MLX5_CAP_GEN(dev->mdev,
+ log_max_ra_res_dc);
+ log_max_ra_req = 1 << MLX5_CAP_GEN(dev->mdev,
+ log_max_ra_req_dc);
+ } else {
+ log_max_ra_res = 1 << MLX5_CAP_GEN(dev->mdev,
+ log_max_ra_res_qp);
+ log_max_ra_req = 1 << MLX5_CAP_GEN(dev->mdev,
+ log_max_ra_req_qp);
+ }
+
+ if (attr_mask & IB_QP_MAX_QP_RD_ATOMIC &&
+ attr->max_rd_atomic > log_max_ra_res) {
+ mlx5_ib_dbg(dev, "invalid max_rd_atomic value %d\n",
+ attr->max_rd_atomic);
+ return false;
+ }
+
+ if (attr_mask & IB_QP_MAX_DEST_RD_ATOMIC &&
+ attr->max_dest_rd_atomic > log_max_ra_req) {
+ mlx5_ib_dbg(dev, "invalid max_dest_rd_atomic value %d\n",
+ attr->max_dest_rd_atomic);
+ return false;
+ }
+ return true;
+}
+
int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
int attr_mask, struct ib_udata *udata)
{
@@ -4589,21 +4623,8 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
goto out;
}
- if (attr_mask & IB_QP_MAX_QP_RD_ATOMIC &&
- attr->max_rd_atomic >
- (1 << MLX5_CAP_GEN(dev->mdev, log_max_ra_res_qp))) {
- mlx5_ib_dbg(dev, "invalid max_rd_atomic value %d\n",
- attr->max_rd_atomic);
- goto out;
- }
-
- if (attr_mask & IB_QP_MAX_DEST_RD_ATOMIC &&
- attr->max_dest_rd_atomic >
- (1 << MLX5_CAP_GEN(dev->mdev, log_max_ra_req_qp))) {
- mlx5_ib_dbg(dev, "invalid max_dest_rd_atomic value %d\n",
- attr->max_dest_rd_atomic);
+ if (!validate_rd_atomic(dev, attr, attr_mask, qp_type))
goto out;
- }
if (cur_state == new_state && cur_state == IB_QPS_RESET) {
err = 0;
--
2.38.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes
2022-12-28 12:56 [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes Leon Romanovsky
2022-12-28 12:56 ` [PATCH RESEND rdma-next 1/2] RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device Leon Romanovsky
2022-12-28 12:56 ` [PATCH RESEND rdma-next 2/2] RDMA/mlx5: Fix validation of max_rd_atomic caps for DC Leon Romanovsky
@ 2023-01-01 8:59 ` Leon Romanovsky
2 siblings, 0 replies; 4+ messages in thread
From: Leon Romanovsky @ 2023-01-01 8:59 UTC (permalink / raw)
To: Leon Romanovsky, Jason Gunthorpe
Cc: Shay Drory, Leon Romanovsky, Patrisious Haddad, linux-rdma,
Maor Gottlieb
On Wed, 28 Dec 2022 14:56:08 +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> Hi,
>
> This was already posted to ML, but too late to be included in last pull
> request to Linus, so simply resending them.
>
> [...]
Applied, thanks!
[1/2] RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device
(no commit info)
[2/2] RDMA/mlx5: Fix validation of max_rd_atomic caps for DC
(no commit info)
Best regards,
--
Leon Romanovsky <leon@kernel.org>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-01-01 8:59 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-12-28 12:56 [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes Leon Romanovsky
2022-12-28 12:56 ` [PATCH RESEND rdma-next 1/2] RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device Leon Romanovsky
2022-12-28 12:56 ` [PATCH RESEND rdma-next 2/2] RDMA/mlx5: Fix validation of max_rd_atomic caps for DC Leon Romanovsky
2023-01-01 8:59 ` [PATCH RESEND rdma-next 0/2] Two mlx5_ib fixes Leon Romanovsky
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.