* [PATCH for-rc 0/3] Bugfixes for HNS RoCE
@ 2023-05-12 9:22 Junxian Huang
2023-05-12 9:22 ` [PATCH for-rc 1/3] RDMA/hns: Fix timeout attr in query qp for HIP08 Junxian Huang
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Junxian Huang @ 2023-05-12 9:22 UTC
To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6
1. #1: The first patch fixes the timeout attr returned by query QP on
   HIP08.
2. #2: The second patch checks and adjusts the BT page size to ensure
   successful resource allocation.
3. #3: The third patch modifies the value of the long message loopback
   slice to improve traffic balance.
Chengchang Tang (2):
RDMA/hns: Fix timeout attr in query qp for HIP08
RDMA/hns: Fix base address table allocation
Yangyang Li (1):
RDMA/hns: Modify the value of long message loopback slice
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 25 +++++++++----
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 +
drivers/infiniband/hw/hns/hns_roce_mr.c | 43 ++++++++++++++++++++++
3 files changed, 62 insertions(+), 8 deletions(-)
--
2.30.0
* [PATCH for-rc 1/3] RDMA/hns: Fix timeout attr in query qp for HIP08
2023-05-12 9:22 [PATCH for-rc 0/3] Bugfixes for HNS RoCE Junxian Huang
@ 2023-05-12 9:22 ` Junxian Huang
2023-05-12 9:22 ` [PATCH for-rc 2/3] RDMA/hns: Fix base address table allocation Junxian Huang
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Junxian Huang @ 2023-05-12 9:22 UTC
To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6
From: Chengchang Tang <tangchengchang@huawei.com>
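On HIP08, the timeout attr returned by query QP differs from the
timeout attr configured by the user: the driver adds an offset of 10
when writing the timeout to the QPC, but does not subtract it when
reading the value back.
The problem was found by the rdma-core testcase test_rdmacm_async_traffic: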
======================================================================
FAIL: test_rdmacm_async_traffic (tests.test_rdmacm.CMTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "./tests/test_rdmacm.py", line 33, in test_rdmacm_async_traffic
self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_traffic,
File "./tests/base.py", line 382, in two_nodes_rdmacm_traffic
raise(res)
AssertionError
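The mismatch comes down to an offset that is applied on the modify path
and has to be removed again on the query path. A minimal user-space
sketch of that round trip follows. The offset and limits are the values
that appear in the diff below; the enum and the two helper names are
hypothetical and exist only for illustration, they are not the driver's
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch; all names here are illustrative only. */
#define ACK_TIMEOUT_OFS_HIP08 10
#define ACK_TIMEOUT_MAX_HIP08 20
#define ACK_TIMEOUT_MAX       31

enum hw_rev { REV_HIP08, REV_HIP09 };

/* Modify path: validate the user value and convert it to the stored value. */
static bool timeout_to_hw(enum hw_rev rev, uint8_t user, uint8_t *hw)
{
	if (rev == REV_HIP08) {
		if (user > ACK_TIMEOUT_MAX_HIP08)
			return false;
		*hw = (uint8_t)(user + ACK_TIMEOUT_OFS_HIP08);
	} else {
		if (user > ACK_TIMEOUT_MAX)
			return false;
		*hw = user;
	}
	return true;
}

/* Query path: undo the HIP08 offset so the caller sees what it configured. */
static uint8_t timeout_from_hw(enum hw_rev rev, uint8_t hw)
{
	/* A stored HIP08 value below the offset would indicate a bug. */
	assert(rev != REV_HIP08 || hw >= ACK_TIMEOUT_OFS_HIP08);
	return rev == REV_HIP08 ? (uint8_t)(hw - ACK_TIMEOUT_OFS_HIP08) : hw;
}
```

In this sketch a HIP08 user value of 14 is stored as 24, and the query
path maps 24 back to 14. Before the fix, the query path returned the
stored value unchanged.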
Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 17 ++++++++++++++---
drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 2 ++
2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 84f1167de1d9..3a1c90406ed9 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -5012,7 +5012,6 @@ static int hns_roce_v2_set_abs_fields(struct ib_qp *ibqp,
static bool check_qp_timeout_cfg_range(struct hns_roce_dev *hr_dev, u8 *timeout)
{
#define QP_ACK_TIMEOUT_MAX_HIP08 20
-#define QP_ACK_TIMEOUT_OFFSET 10
#define QP_ACK_TIMEOUT_MAX 31
if (hr_dev->pci_dev->revision == PCI_REVISION_ID_HIP08) {
@@ -5021,7 +5020,7 @@ static bool check_qp_timeout_cfg_range(struct hns_roce_dev *hr_dev, u8 *timeout)
"local ACK timeout shall be 0 to 20.\n");
return false;
}
- *timeout += QP_ACK_TIMEOUT_OFFSET;
+ *timeout += HNS_ROCE_V2_QP_ACK_TIMEOUT_OFS_HIP08;
} else if (hr_dev->pci_dev->revision > PCI_REVISION_ID_HIP08) {
if (*timeout > QP_ACK_TIMEOUT_MAX) {
ibdev_warn(&hr_dev->ib_dev,
@@ -5307,6 +5306,18 @@ static int hns_roce_v2_query_qpc(struct hns_roce_dev *hr_dev, u32 qpn,
return ret;
}
+static u8 get_qp_timeout_attr(struct hns_roce_dev *hr_dev,
+ struct hns_roce_v2_qp_context *context)
+{
+ u8 timeout;
+
+ timeout = (u8)hr_reg_read(context, QPC_AT);
+ if (hr_dev->pci_dev->revision == PCI_REVISION_ID_HIP08)
+ timeout -= HNS_ROCE_V2_QP_ACK_TIMEOUT_OFS_HIP08;
+
+ return timeout;
+}
+
static int hns_roce_v2_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr)
@@ -5384,7 +5395,7 @@ static int hns_roce_v2_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
qp_attr->max_dest_rd_atomic = 1 << hr_reg_read(&context, QPC_RR_MAX);
qp_attr->min_rnr_timer = (u8)hr_reg_read(&context, QPC_MIN_RNR_TIME);
- qp_attr->timeout = (u8)hr_reg_read(&context, QPC_AT);
+ qp_attr->timeout = get_qp_timeout_attr(hr_dev, &context);
qp_attr->retry_cnt = hr_reg_read(&context, QPC_RETRY_NUM_INIT);
qp_attr->rnr_retry = hr_reg_read(&context, QPC_RNR_NUM_INIT);
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
index 1b44d2434ab4..7033eae2407c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.h
@@ -44,6 +44,8 @@
#define HNS_ROCE_V2_MAX_XRCD_NUM 0x1000000
#define HNS_ROCE_V2_RSV_XRCD_NUM 0
+#define HNS_ROCE_V2_QP_ACK_TIMEOUT_OFS_HIP08 10
+
#define HNS_ROCE_V3_SCCC_SZ 64
#define HNS_ROCE_V3_GMV_ENTRY_SZ 32
--
2.30.0
* [PATCH for-rc 2/3] RDMA/hns: Fix base address table allocation
2023-05-12 9:22 [PATCH for-rc 0/3] Bugfixes for HNS RoCE Junxian Huang
2023-05-12 9:22 ` [PATCH for-rc 1/3] RDMA/hns: Fix timeout attr in query qp for HIP08 Junxian Huang
@ 2023-05-12 9:22 ` Junxian Huang
2023-05-12 9:22 ` [PATCH for-rc 3/3] RDMA/hns: Modify the value of long message loopback slice Junxian Huang
2023-05-17 19:11 ` [PATCH for-rc 0/3] Bugfixes for HNS RoCE Jason Gunthorpe
3 siblings, 0 replies; 5+ messages in thread
From: Junxian Huang @ 2023-05-12 9:22 UTC
To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6
From: Chengchang Tang <tangchengchang@huawei.com>
For hns, the specification of an entry-like resource (e.g. WQE/CQE/EQE)
depends on the BT page size, the buf page size and hopnum. For user mode,
the buf page size depends on UMEM. Therefore, the actual specification is
controlled by the BT page size and hopnum.
The current BT page size and hopnum are obtained from firmware. This
makes the driver inflexible and introduces unnecessary constraints, so
resource allocation fails in many scenarios.
Before allocating the BT, check whether the BT page size set by firmware
is sufficient, and increase it to a larger supported page size if it is
not.
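The check described above can be sketched in user space as follows. This
is only an approximation of the idea, not the driver code: struct region
and both function names are hypothetical, BA_BYTE_LEN is assumed to be 8
bytes per base address, and overflow of the power computation is not
handled.

```c
#include <assert.h>
#include <stdint.h>

#define BA_BYTE_LEN 8 /* assumption: one base address entry is 8 bytes */

struct region {
	unsigned int count;  /* number of buffer pages in the region */
	unsigned int hopnum; /* addressing levels; 0 means no BT is needed */
};

/* Buffer pages reachable through one level-1 BA: ba_per_bt^(hopnum - 1). */
static uint64_t pages_per_l1ba(uint64_t ba_per_bt, unsigned int hopnum)
{
	uint64_t pages = 1;

	assert(hopnum >= 1);
	for (unsigned int i = 1; i < hopnum; i++)
		pages *= ba_per_bt;
	return pages;
}

/*
 * Return the smallest supported page shift >= pg_shift whose BT page can
 * hold every level-1 base address of all regions, or 0 if none fits.
 */
static unsigned int best_bt_pg_shift(unsigned long cap, unsigned int pg_shift,
				     const struct region *re, int n)
{
	for (; pg_shift < sizeof(cap) * 8; pg_shift++) {
		if (!(cap & (1UL << pg_shift)))
			continue; /* page size not supported by the device */

		uint64_t ba_per_bt = (1UL << pg_shift) / BA_BYTE_LEN;
		uint64_t ba_num = 0;

		if (ba_per_bt == 0)
			continue; /* page too small to hold a single BA */

		for (int i = 0; i < n; i++) {
			if (re[i].hopnum == 0)
				continue;
			uint64_t per = pages_per_l1ba(ba_per_bt, re[i].hopnum);
			ba_num += (re[i].count + per - 1) / per; /* round up */
		}
		if (ba_num <= ba_per_bt)
			return pg_shift;
	}
	return 0;
}
```

For example, if only 4 KB and 64 KB pages are supported, a single
one-hop region of 1000 pages does not fit in a 4 KB BT page (512
entries), so the sketch moves on to the 64 KB page (8192 entries). The
same region with hopnum 2 needs only two level-1 entries and fits in the
4 KB page.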
Fixes: 1133401412a9 ("RDMA/hns: Optimize base address table config flow for qp buffer")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
drivers/infiniband/hw/hns/hns_roce_mr.c | 43 +++++++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c
index 37a5cf62f88b..14376490ac22 100644
--- a/drivers/infiniband/hw/hns/hns_roce_mr.c
+++ b/drivers/infiniband/hw/hns/hns_roce_mr.c
@@ -33,6 +33,7 @@
#include <linux/vmalloc.h>
#include <rdma/ib_umem.h>
+#include <linux/math.h>
#include "hns_roce_device.h"
#include "hns_roce_cmd.h"
#include "hns_roce_hem.h"
@@ -909,6 +910,44 @@ static int mtr_init_buf_cfg(struct hns_roce_dev *hr_dev,
return page_cnt;
}
+static u64 cal_pages_per_l1ba(unsigned int ba_per_bt, unsigned int hopnum)
+{
+ return int_pow(ba_per_bt, hopnum - 1);
+}
+
+static unsigned int cal_best_bt_pg_sz(struct hns_roce_dev *hr_dev,
+ struct hns_roce_mtr *mtr,
+ unsigned int pg_shift)
+{
+ unsigned long cap = hr_dev->caps.page_size_cap;
+ struct hns_roce_buf_region *re;
+ unsigned int pgs_per_l1ba;
+ unsigned int ba_per_bt;
+ unsigned int ba_num;
+ int i;
+
+ for_each_set_bit_from(pg_shift, &cap, sizeof(cap) * BITS_PER_BYTE) {
+ if (!(BIT(pg_shift) & cap))
+ continue;
+
+ ba_per_bt = BIT(pg_shift) / BA_BYTE_LEN;
+ ba_num = 0;
+ for (i = 0; i < mtr->hem_cfg.region_count; i++) {
+ re = &mtr->hem_cfg.region[i];
+ if (re->hopnum == 0)
+ continue;
+
+ pgs_per_l1ba = cal_pages_per_l1ba(ba_per_bt, re->hopnum);
+ ba_num += DIV_ROUND_UP(re->count, pgs_per_l1ba);
+ }
+
+ if (ba_num <= ba_per_bt)
+ return pg_shift;
+ }
+
+ return 0;
+}
+
static int mtr_alloc_mtt(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr,
unsigned int ba_page_shift)
{
@@ -917,6 +956,10 @@ static int mtr_alloc_mtt(struct hns_roce_dev *hr_dev, struct hns_roce_mtr *mtr,
hns_roce_hem_list_init(&mtr->hem_list);
if (!cfg->is_direct) {
+ ba_page_shift = cal_best_bt_pg_sz(hr_dev, mtr, ba_page_shift);
+ if (!ba_page_shift)
+ return -ERANGE;
+
ret = hns_roce_hem_list_request(hr_dev, &mtr->hem_list,
cfg->region, cfg->region_count,
ba_page_shift);
--
2.30.0
* [PATCH for-rc 3/3] RDMA/hns: Modify the value of long message loopback slice
2023-05-12 9:22 [PATCH for-rc 0/3] Bugfixes for HNS RoCE Junxian Huang
2023-05-12 9:22 ` [PATCH for-rc 1/3] RDMA/hns: Fix timeout attr in query qp for HIP08 Junxian Huang
2023-05-12 9:22 ` [PATCH for-rc 2/3] RDMA/hns: Fix base address table allocation Junxian Huang
@ 2023-05-12 9:22 ` Junxian Huang
2023-05-17 19:11 ` [PATCH for-rc 0/3] Bugfixes for HNS RoCE Jason Gunthorpe
3 siblings, 0 replies; 5+ messages in thread
From: Junxian Huang @ 2023-05-12 9:22 UTC
To: jgg, leon; +Cc: linux-rdma, linuxarm, linux-kernel, huangjunxian6
From: Yangyang Li <liyangyang20@huawei.com>
The long message loopback slice is used to balance traffic between QPs.
It prevents QPs with heavy traffic from occupying the hardware pipeline
for a long time while QPs with light traffic cannot be scheduled.
Currently, its maximum value is set to 16K, which means the next QP is
scheduled only after the current QP has sent 16K. This value is too
large and leads to unbalanced traffic scheduling, so it needs to be
reduced.
The long message loopback slice is now set in the range from 1024 (the
lower limit supported by hardware) to mtu, i.e. to the larger of the two
values. Actual testing shows that this significantly reduces the error
in hardware traffic scheduling.
This solution is compatible with both HIP08 and HIP09. The modified
lp_pktn_ini has a maximum value of 2 (when mtu is 256), so the range
check for lp_pktn_ini is no longer necessary and is deleted.
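The arithmetic behind the new lp_pktn_ini value can be checked with a
small user-space sketch. The expression mirrors the one added by the diff
below; ilog2_u() is a hypothetical stand-in for the kernel's ilog2(), and
lp_slice_len() is an illustrative helper that is not part of the patch.

```c
#include <assert.h>

#define MIN_LP_MSG_LEN 1024 /* lower limit supported by hardware, per the patch */

/* Floor of log2(v) for v > 0; user-space stand-in for the kernel's ilog2(). */
static unsigned int ilog2_u(unsigned int v)
{
	unsigned int r = 0;

	assert(v > 0);
	while (v >>= 1)
		r++;
	return r;
}

/* lp_pktn_ini as computed after the patch: ilog2(max(mtu, 1024) / mtu). */
static unsigned int lp_pktn_ini(unsigned int mtu)
{
	assert(mtu > 0);
	unsigned int len = mtu > MIN_LP_MSG_LEN ? mtu : MIN_LP_MSG_LEN;

	return ilog2_u(len / mtu);
}

/* Resulting slice length in bytes: mtu * 2^lp_pktn_ini. */
static unsigned int lp_slice_len(unsigned int mtu)
{
	return mtu << lp_pktn_ini(mtu);
}
```

For the IB MTU values 256, 512, 1024, 2048 and 4096 this gives
lp_pktn_ini values of 2, 1, 0, 0 and 0, so the slice is 1024 bytes for
MTUs up to 1024 and equal to the MTU above that. With the old 16384
limit, an MTU of 256 gave ilog2(16384 / 256) = 6.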
Fixes: 0e60778efb07 ("RDMA/hns: Modify the value of MAX_LP_MSG_LEN to meet hardware compatibility")
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
---
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 3a1c90406ed9..d4c6b9bc0a4e 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -4583,11 +4583,9 @@ static int modify_qp_init_to_rtr(struct ib_qp *ibqp,
mtu = ib_mtu_enum_to_int(ib_mtu);
if (WARN_ON(mtu <= 0))
return -EINVAL;
-#define MAX_LP_MSG_LEN 16384
- /* MTU * (2 ^ LP_PKTN_INI) shouldn't be bigger than 16KB */
- lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / mtu);
- if (WARN_ON(lp_pktn_ini >= 0xF))
- return -EINVAL;
+#define MIN_LP_MSG_LEN 1024
+ /* mtu * (2 ^ lp_pktn_ini) should be in the range of 1024 to mtu */
+ lp_pktn_ini = ilog2(max(mtu, MIN_LP_MSG_LEN) / mtu);
if (attr_mask & IB_QP_PATH_MTU) {
hr_reg_write(context, QPC_MTU, ib_mtu);
--
2.30.0
* Re: [PATCH for-rc 0/3] Bugfixes for HNS RoCE
2023-05-12 9:22 [PATCH for-rc 0/3] Bugfixes for HNS RoCE Junxian Huang
` (2 preceding siblings ...)
2023-05-12 9:22 ` [PATCH for-rc 3/3] RDMA/hns: Modify the value of long message loopback slice Junxian Huang
@ 2023-05-17 19:11 ` Jason Gunthorpe
3 siblings, 0 replies; 5+ messages in thread
From: Jason Gunthorpe @ 2023-05-17 19:11 UTC
To: Junxian Huang; +Cc: leon, linux-rdma, linuxarm, linux-kernel
On Fri, May 12, 2023 at 05:22:42PM +0800, Junxian Huang wrote:
> 1. #1: The first patch fixes the timeout attr returned by query QP on
>    HIP08.
>
> 2. #2: The second patch checks and adjusts the BT page size to ensure
>    successful resource allocation.
>
> 3. #3: The third patch modifies the value of the long message loopback
>    slice to improve traffic balance.
>
> Chengchang Tang (2):
> RDMA/hns: Fix timeout attr in query qp for HIP08
> RDMA/hns: Fix base address table allocation
>
> Yangyang Li (1):
> RDMA/hns: Modify the value of long message loopback slice
Applied to for-rc, thanks
Jason