* [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE)
@ 2020-03-05 15:00 Leon Romanovsky
2020-03-05 15:00 ` [PATCH rdma-next 1/9] RDMA/cm: Add Enhanced Connection Establishment (ECE) bits Leon Romanovsky
` (8 more replies)
0 siblings, 9 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:00 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe
Cc: Leon Romanovsky, Gal Pressman, linux-kernel, linux-rdma,
Mark Zhang, Yishai Hadas
From: Leon Romanovsky <leonro@mellanox.com>
Enhanced Connection Establishment (ECE) is a new negotiation scheme
introduced in IBTA v1.4 that lets nodes exchange extra information about
their capabilities and negotiate it during the connection establishment
phase.
The RDMA-CM messages (REQ, REP, SIDR_REQ and SIDR_REP) were extended
to carry two fields, one new field and one existing field that gains new
functionality:
* VendorID is a new field indicating that a common subset of vendor
option bits, identified by that VendorID, is supported.
* AttributeModifier already exists, but is overloaded to indicate which
vendor options are supported for this VendorID.
This series is the kernel part of that functionality; it is responsible
for receiving the data from librdmacm and for properly creating and
handling the RDMA-CM messages.
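As a rough illustration of the negotiation idea (not code from this
series; the option semantics are defined by the vendor identified by the
VendorID and the negotiation policy lives in user space), the exchange
boils down to agreeing on a vendor and intersecting option bitmasks:

#include <linux/types.h>

/*
 * Illustrative sketch only: the struct, helper name and the
 * "intersection" policy are hypothetical, chosen just to show how
 * VendorID and AttributeModifier are meant to work together.
 */
struct ece_example {
        u32 vendor_id;  /* identifies whose option bits these are */
        u32 attr_mod;   /* bitmask of that vendor's option bits */
};

static u32 ece_example_reply(const struct ece_example *req,
                             const struct ece_example *local)
{
        /* Different vendors: no common options can be negotiated */
        if (req->vendor_id != local->vendor_id)
                return 0;

        /* Advertise only the options both sides support */
        return req->attr_mod & local->attr_mod;
}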
Thanks
Leon Romanovsky (9):
RDMA/cm: Add Enhanced Connection Establishment (ECE) bits
RDMA: Promote field_avail() macro to be general code
RDMA/cm: Delete not implemented CM peer to peer communication
RDMA/uapi: Add ECE definitions to UCMA
RDMA/ucma: Extend ucma_connect to receive ECE parameters
RDMA/ucma: Deliver ECE parameters through UCMA events
RDMA/cm: Send and receive ECE parameter over the wire
RDMA/cma: Connect ECE to rdma_accept
RDMA/cma: Provide ECE reject reason
drivers/infiniband/core/cm.c | 48 ++++++++++++++++++------
drivers/infiniband/core/cma.c | 54 ++++++++++++++++++++++++---
drivers/infiniband/core/cma_priv.h | 1 +
drivers/infiniband/core/ucma.c | 40 ++++++++++++++++----
drivers/infiniband/hw/efa/efa_verbs.c | 3 --
drivers/infiniband/hw/mlx4/main.c | 7 +---
drivers/infiniband/hw/mlx5/main.c | 41 ++++++++++----------
drivers/infiniband/hw/mlx5/mlx5_ib.h | 18 ++++-----
include/linux/kernel.h | 18 +++++++++
include/rdma/ib_cm.h | 11 +++++-
include/rdma/ibta_vol1_c12.h | 6 +++
include/rdma/rdma_cm.h | 28 ++++++++++++--
include/uapi/rdma/rdma_user_cm.h | 15 +++++++-
13 files changed, 219 insertions(+), 71 deletions(-)
--
2.24.1
* [PATCH rdma-next 1/9] RDMA/cm: Add Enhanced Connection Establishment (ECE) bits
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
@ 2020-03-05 15:00 ` Leon Romanovsky
2020-03-05 15:00 ` [PATCH rdma-next 2/9] RDMA: Promote field_avail() macro to be general code Leon Romanovsky
` (7 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:00 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang
From: Leon Romanovsky <leonro@mellanox.com>
Extend the REQ (request for communication), REP (reply to request
for communication), reject reason and SIDR_REP (service ID
resolution response) structures with hardware vendor ID bits
according to approved IBA Comment #9434.
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
include/rdma/ibta_vol1_c12.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/rdma/ibta_vol1_c12.h b/include/rdma/ibta_vol1_c12.h
index 269904425d3f..960c86bec76c 100644
--- a/include/rdma/ibta_vol1_c12.h
+++ b/include/rdma/ibta_vol1_c12.h
@@ -38,6 +38,7 @@
/* Table 106 REQ Message Contents */
#define CM_REQ_LOCAL_COMM_ID CM_FIELD32_LOC(struct cm_req_msg, 0, 32)
+#define CM_REQ_VENDOR_ID CM_FIELD32_LOC(struct cm_req_msg, 5, 24)
#define CM_REQ_SERVICE_ID CM_FIELD64_LOC(struct cm_req_msg, 8)
#define CM_REQ_LOCAL_CA_GUID CM_FIELD64_LOC(struct cm_req_msg, 16)
#define CM_REQ_LOCAL_Q_KEY CM_FIELD32_LOC(struct cm_req_msg, 28, 32)
@@ -119,8 +120,11 @@ CM_STRUCT(struct cm_rej_msg, 84 * 8 + 1184);
#define CM_REP_REMOTE_COMM_ID CM_FIELD32_LOC(struct cm_rep_msg, 4, 32)
#define CM_REP_LOCAL_Q_KEY CM_FIELD32_LOC(struct cm_rep_msg, 8, 32)
#define CM_REP_LOCAL_QPN CM_FIELD32_LOC(struct cm_rep_msg, 12, 24)
+#define CM_REP_VENDOR_ID_H CM_FIELD8_LOC(struct cm_rep_msg, 15, 8)
#define CM_REP_LOCAL_EE_CONTEXT_NUMBER CM_FIELD32_LOC(struct cm_rep_msg, 16, 24)
+#define CM_REP_VENDOR_ID_M CM_FIELD8_LOC(struct cm_rep_msg, 19, 8)
#define CM_REP_STARTING_PSN CM_FIELD32_LOC(struct cm_rep_msg, 20, 24)
+#define CM_REP_VENDOR_ID_L CM_FIELD8_LOC(struct cm_rep_msg, 23, 8)
#define CM_REP_RESPONDER_RESOURCES CM_FIELD8_LOC(struct cm_rep_msg, 24, 8)
#define CM_REP_INITIATOR_DEPTH CM_FIELD8_LOC(struct cm_rep_msg, 25, 8)
#define CM_REP_TARGET_ACK_DELAY CM_FIELD8_LOC(struct cm_rep_msg, 26, 5)
@@ -201,7 +205,9 @@ CM_STRUCT(struct cm_sidr_req_msg, 16 * 8 + 1728);
#define CM_SIDR_REP_STATUS CM_FIELD8_LOC(struct cm_sidr_rep_msg, 4, 8)
#define CM_SIDR_REP_ADDITIONAL_INFORMATION_LENGTH \
CM_FIELD8_LOC(struct cm_sidr_rep_msg, 5, 8)
+#define CM_SIDR_REP_VENDOR_ID_H CM_FIELD16_LOC(struct cm_sidr_rep_msg, 6, 16)
#define CM_SIDR_REP_QPN CM_FIELD32_LOC(struct cm_sidr_rep_msg, 8, 24)
+#define CM_SIDR_REP_VENDOR_ID_L CM_FIELD8_LOC(struct cm_sidr_rep_msg, 11, 8)
#define CM_SIDR_REP_SERVICEID CM_FIELD64_LOC(struct cm_sidr_rep_msg, 12)
#define CM_SIDR_REP_Q_KEY CM_FIELD32_LOC(struct cm_sidr_rep_msg, 20, 32)
#define CM_SIDR_REP_ADDITIONAL_INFORMATION \
--
2.24.1
* [PATCH rdma-next 2/9] RDMA: Promote field_avail() macro to be general code
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
2020-03-05 15:00 ` [PATCH rdma-next 1/9] RDMA/cm: Add Enhanced Connection Establishment (ECE) bits Leon Romanovsky
@ 2020-03-05 15:00 ` Leon Romanovsky
2020-03-05 15:18 ` Jason Gunthorpe
2020-03-05 15:00 ` [PATCH rdma-next 3/9] RDMA/cm: Delete not implemented CM peer to peer communication Leon Romanovsky
` (6 subsequent siblings)
8 siblings, 1 reply; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:00 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe
Cc: Leon Romanovsky, Gal Pressman, linux-rdma, Mark Zhang,
Yishai Hadas
From: Leon Romanovsky <leonro@mellanox.com>
The main use of this macro is in user <-> kernel communication, to
ensure that the user supplied enough space to read/write the tested
field.
The field_avail() macro is used by several RDMA drivers and will be used
by the RDMA-CM code in the following patches, so it is general enough to
be promoted to kernel.h.
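A minimal usage sketch (struct demo_resp and the values written here are
hypothetical, only to show the intended pattern of filling an extensible
response without overrunning the user-provided buffer):

#include <linux/kernel.h>
#include <linux/types.h>

/* Hypothetical extensible response layout, for illustration only */
struct demo_resp {
        u32 response_length;
        u32 old_field;
        u32 new_field;  /* added later; old user space may not know it */
};

static void demo_fill_resp(struct demo_resp *resp, size_t udata_outlen)
{
        resp->old_field = 1;
        resp->response_length = offsetof(struct demo_resp, new_field);

        /* Only report new_field if the user buffer is large enough for it */
        if (field_avail(*resp, new_field, udata_outlen)) {
                resp->new_field = 2;
                resp->response_length += sizeof(resp->new_field);
        }
}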
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
---
drivers/infiniband/hw/efa/efa_verbs.c | 3 --
drivers/infiniband/hw/mlx4/main.c | 7 ++---
drivers/infiniband/hw/mlx5/main.c | 41 +++++++++++++--------------
drivers/infiniband/hw/mlx5/mlx5_ib.h | 18 ++++++------
include/linux/kernel.h | 18 ++++++++++++
5 files changed, 48 insertions(+), 39 deletions(-)
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index bf3120f140f7..ce89428f5af5 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -144,9 +144,6 @@ static inline bool is_rdma_read_cap(struct efa_dev *dev)
return dev->dev_attr.device_caps & EFA_ADMIN_FEATURE_DEVICE_ATTR_DESC_RDMA_READ_MASK;
}
-#define field_avail(x, fld, sz) (offsetof(typeof(x), fld) + \
- sizeof_field(typeof(x), fld) <= (sz))
-
#define is_reserved_cleared(reserved) \
!memchr_inv(reserved, 0, sizeof(reserved))
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 2f5d9b181848..d63743d96196 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -434,9 +434,6 @@ int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
return real_index;
}
-#define field_avail(type, fld, sz) (offsetof(type, fld) + \
- sizeof(((type *)0)->fld) <= (sz))
-
static int mlx4_ib_query_device(struct ib_device *ibdev,
struct ib_device_attr *props,
struct ib_udata *uhw)
@@ -602,7 +599,7 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
sizeof(struct mlx4_wqe_data_seg);
}
- if (field_avail(typeof(resp), rss_caps, uhw->outlen)) {
+ if (field_avail(resp, rss_caps, uhw->outlen)) {
if (props->rss_caps.supported_qpts) {
resp.rss_caps.rx_hash_function =
MLX4_IB_RX_HASH_FUNC_TOEPLITZ;
@@ -626,7 +623,7 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
sizeof(resp.rss_caps);
}
- if (field_avail(typeof(resp), tso_caps, uhw->outlen)) {
+ if (field_avail(resp, tso_caps, uhw->outlen)) {
if (dev->dev->caps.max_gso_sz &&
((mlx4_ib_port_link_layer(ibdev, 1) ==
IB_LINK_LAYER_ETHERNET) ||
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 709ef3f57a06..879664797a80 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -898,7 +898,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
props->raw_packet_caps |=
IB_RAW_PACKET_CAP_CVLAN_STRIPPING;
- if (field_avail(typeof(resp), tso_caps, uhw_outlen)) {
+ if (field_avail(resp, tso_caps, uhw_outlen)) {
max_tso = MLX5_CAP_ETH(mdev, max_lso_cap);
if (max_tso) {
resp.tso_caps.max_tso = 1 << max_tso;
@@ -908,7 +908,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
}
}
- if (field_avail(typeof(resp), rss_caps, uhw_outlen)) {
+ if (field_avail(resp, rss_caps, uhw_outlen)) {
resp.rss_caps.rx_hash_function =
MLX5_RX_HASH_FUNC_TOEPLITZ;
resp.rss_caps.rx_hash_fields_mask =
@@ -928,9 +928,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
resp.response_length += sizeof(resp.rss_caps);
}
} else {
- if (field_avail(typeof(resp), tso_caps, uhw_outlen))
+ if (field_avail(resp, tso_caps, uhw_outlen))
resp.response_length += sizeof(resp.tso_caps);
- if (field_avail(typeof(resp), rss_caps, uhw_outlen))
+ if (field_avail(resp, rss_caps, uhw_outlen))
resp.response_length += sizeof(resp.rss_caps);
}
@@ -1072,7 +1072,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
MLX5_MAX_CQ_PERIOD;
}
- if (field_avail(typeof(resp), cqe_comp_caps, uhw_outlen)) {
+ if (field_avail(resp, cqe_comp_caps, uhw_outlen)) {
resp.response_length += sizeof(resp.cqe_comp_caps);
if (MLX5_CAP_GEN(dev->mdev, cqe_compression)) {
@@ -1090,8 +1090,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
}
}
- if (field_avail(typeof(resp), packet_pacing_caps, uhw_outlen) &&
- raw_support) {
+ if (field_avail(resp, packet_pacing_caps, uhw_outlen) && raw_support) {
if (MLX5_CAP_QOS(mdev, packet_pacing) &&
MLX5_CAP_GEN(mdev, qos)) {
resp.packet_pacing_caps.qp_rate_limit_max =
@@ -1108,7 +1107,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
resp.response_length += sizeof(resp.packet_pacing_caps);
}
- if (field_avail(typeof(resp), mlx5_ib_support_multi_pkt_send_wqes,
+ if (field_avail(resp, mlx5_ib_support_multi_pkt_send_wqes,
uhw_outlen)) {
if (MLX5_CAP_ETH(mdev, multi_pkt_send_wqe))
resp.mlx5_ib_support_multi_pkt_send_wqes =
@@ -1122,7 +1121,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
sizeof(resp.mlx5_ib_support_multi_pkt_send_wqes);
}
- if (field_avail(typeof(resp), flags, uhw_outlen)) {
+ if (field_avail(resp, flags, uhw_outlen)) {
resp.response_length += sizeof(resp.flags);
if (MLX5_CAP_GEN(mdev, cqe_compression_128))
@@ -1138,7 +1137,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
resp.flags |= MLX5_IB_QUERY_DEV_RESP_FLAGS_SCAT2CQE_DCT;
}
- if (field_avail(typeof(resp), sw_parsing_caps, uhw_outlen)) {
+ if (field_avail(resp, sw_parsing_caps, uhw_outlen)) {
resp.response_length += sizeof(resp.sw_parsing_caps);
if (MLX5_CAP_ETH(mdev, swp)) {
resp.sw_parsing_caps.sw_parsing_offloads |=
@@ -1158,7 +1157,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
}
}
- if (field_avail(typeof(resp), striding_rq_caps, uhw_outlen) &&
+ if (field_avail(resp, striding_rq_caps, uhw_outlen) &&
raw_support) {
resp.response_length += sizeof(resp.striding_rq_caps);
if (MLX5_CAP_GEN(mdev, striding_rq)) {
@@ -1181,7 +1180,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
}
}
- if (field_avail(typeof(resp), tunnel_offloads_caps, uhw_outlen)) {
+ if (field_avail(resp, tunnel_offloads_caps, uhw_outlen)) {
resp.response_length += sizeof(resp.tunnel_offloads_caps);
if (MLX5_CAP_ETH(mdev, tunnel_stateless_vxlan))
resp.tunnel_offloads_caps |=
@@ -1901,16 +1900,16 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx,
resp.tot_bfregs = req.total_num_bfregs;
resp.num_ports = dev->num_ports;
- if (field_avail(typeof(resp), cqe_version, udata->outlen))
+ if (field_avail(resp, cqe_version, udata->outlen))
resp.response_length += sizeof(resp.cqe_version);
- if (field_avail(typeof(resp), cmds_supp_uhw, udata->outlen)) {
+ if (field_avail(resp, cmds_supp_uhw, udata->outlen)) {
resp.cmds_supp_uhw |= MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE |
MLX5_USER_CMDS_SUPP_UHW_CREATE_AH;
resp.response_length += sizeof(resp.cmds_supp_uhw);
}
- if (field_avail(typeof(resp), eth_min_inline, udata->outlen)) {
+ if (field_avail(resp, eth_min_inline, udata->outlen)) {
if (mlx5_ib_port_link_layer(ibdev, 1) == IB_LINK_LAYER_ETHERNET) {
mlx5_query_min_inline(dev->mdev, &resp.eth_min_inline);
resp.eth_min_inline++;
@@ -1918,7 +1917,7 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx,
resp.response_length += sizeof(resp.eth_min_inline);
}
- if (field_avail(typeof(resp), clock_info_versions, udata->outlen)) {
+ if (field_avail(resp, clock_info_versions, udata->outlen)) {
if (mdev->clock_info)
resp.clock_info_versions = BIT(MLX5_IB_CLOCK_INFO_V1);
resp.response_length += sizeof(resp.clock_info_versions);
@@ -1930,7 +1929,7 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx,
* pretend we don't support reading the HCA's core clock. This is also
* forced by mmap function.
*/
- if (field_avail(typeof(resp), hca_core_clock_offset, udata->outlen)) {
+ if (field_avail(resp, hca_core_clock_offset, udata->outlen)) {
if (PAGE_SIZE <= 4096) {
resp.comp_mask |=
MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_CORE_CLOCK_OFFSET;
@@ -1940,18 +1939,18 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx,
resp.response_length += sizeof(resp.hca_core_clock_offset);
}
- if (field_avail(typeof(resp), log_uar_size, udata->outlen))
+ if (field_avail(resp, log_uar_size, udata->outlen))
resp.response_length += sizeof(resp.log_uar_size);
- if (field_avail(typeof(resp), num_uars_per_page, udata->outlen))
+ if (field_avail(resp, num_uars_per_page, udata->outlen))
resp.response_length += sizeof(resp.num_uars_per_page);
- if (field_avail(typeof(resp), num_dyn_bfregs, udata->outlen)) {
+ if (field_avail(resp, num_dyn_bfregs, udata->outlen)) {
resp.num_dyn_bfregs = bfregi->num_dyn_bfregs;
resp.response_length += sizeof(resp.num_dyn_bfregs);
}
- if (field_avail(typeof(resp), dump_fill_mkey, udata->outlen)) {
+ if (field_avail(resp, dump_fill_mkey, udata->outlen)) {
if (MLX5_CAP_GEN(dev->mdev, dump_fill_mkey)) {
resp.dump_fill_mkey = dump_fill_mkey;
resp.comp_mask |=
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 3976071a5dc9..347e7dfa6060 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -64,8 +64,6 @@
dev_warn(&(_dev)->ib_dev.dev, "%s:%d:(pid %d): " format, __func__, \
__LINE__, current->pid, ##arg)
-#define field_avail(type, fld, sz) (offsetof(type, fld) + \
- sizeof(((type *)0)->fld) <= (sz))
#define MLX5_IB_DEFAULT_UIDX 0xffffff
#define MLX5_USER_ASSIGNED_UIDX_MASK __mlx5_mask(qpc, user_index)
@@ -1471,13 +1469,13 @@ static inline int get_qp_user_index(struct mlx5_ib_ucontext *ucontext,
u32 *user_index)
{
u8 cqe_version = ucontext->cqe_version;
+ struct mlx5_ib_create_qp qp;
- if (field_avail(struct mlx5_ib_create_qp, uidx, inlen) &&
- !cqe_version && (ucmd->uidx == MLX5_IB_DEFAULT_UIDX))
+ if (field_avail(qp, uidx, inlen) && !cqe_version &&
+ (ucmd->uidx == MLX5_IB_DEFAULT_UIDX))
return 0;
- if (!!(field_avail(struct mlx5_ib_create_qp, uidx, inlen) !=
- !!cqe_version))
+ if (field_avail(qp, uidx, inlen) != !!cqe_version)
return -EINVAL;
return verify_assign_uidx(cqe_version, ucmd->uidx, user_index);
@@ -1489,13 +1487,13 @@ static inline int get_srq_user_index(struct mlx5_ib_ucontext *ucontext,
u32 *user_index)
{
u8 cqe_version = ucontext->cqe_version;
+ struct mlx5_ib_create_srq srq;
- if (field_avail(struct mlx5_ib_create_srq, uidx, inlen) &&
- !cqe_version && (ucmd->uidx == MLX5_IB_DEFAULT_UIDX))
+ if (field_avail(srq, uidx, inlen) && !cqe_version &&
+ (ucmd->uidx == MLX5_IB_DEFAULT_UIDX))
return 0;
- if (!!(field_avail(struct mlx5_ib_create_srq, uidx, inlen) !=
- !!cqe_version))
+ if (field_avail(srq, uidx, inlen) != !!cqe_version)
return -EINVAL;
return verify_assign_uidx(cqe_version, ucmd->uidx, user_index);
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 0d9db2a14f44..699648837750 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -79,6 +79,24 @@
*/
#define round_down(x, y) ((x) & ~__round_mask(x, y))
+/**
+ * FIELD_SIZEOF - get the size of a struct's field
+ * @t: the target struct
+ * @f: the target struct's field
+ * Return: the size of @f in the struct definition without having a
+ * declared instance of @t.
+ */
+#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
+
+/**
+ * field_avail - check if specific field exists in provided data
+ * @x: the source data, usually struct received from the user
+ * @fld: field name
+ * @sz: size of the data
+ */
+#define field_avail(x, fld, sz) \
+ (offsetof(typeof(x), fld) + FIELD_SIZEOF(typeof(x), fld) <= (sz))
+
#define typeof_member(T, m) typeof(((T*)0)->m)
#define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
--
2.24.1
* [PATCH rdma-next 3/9] RDMA/cm: Delete not implemented CM peer to peer communication
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
2020-03-05 15:00 ` [PATCH rdma-next 1/9] RDMA/cm: Add Enhanced Connection Establishment (ECE) bits Leon Romanovsky
2020-03-05 15:00 ` [PATCH rdma-next 2/9] RDMA: Promote field_avail() macro to be general code Leon Romanovsky
@ 2020-03-05 15:00 ` Leon Romanovsky
2020-03-05 15:01 ` [PATCH rdma-next 4/9] RDMA/uapi: Add ECE definitions to UCMA Leon Romanovsky
` (5 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:00 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang
From: Leon Romanovsky <leonro@mellanox.com>
Peer-to-peer support was never implemented, so delete it to make the
code less cluttered.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
---
drivers/infiniband/core/cm.c | 7 -------
include/rdma/ib_cm.h | 1 -
2 files changed, 8 deletions(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index aec6867f0ed2..77190704e81b 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -261,7 +261,6 @@ struct cm_id_private {
__be16 pkey;
u8 private_data_len;
u8 max_cm_retries;
- u8 peer_to_peer;
u8 responder_resources;
u8 initiator_depth;
u8 retry_count;
@@ -1380,10 +1379,6 @@ static void cm_format_req(struct cm_req_msg *req_msg,
static int cm_validate_req_param(struct ib_cm_req_param *param)
{
- /* peer-to-peer not supported */
- if (param->peer_to_peer)
- return -EINVAL;
-
if (!param->primary_path)
return -EINVAL;
@@ -2436,8 +2431,6 @@ static int cm_rep_handler(struct cm_work *work)
cm_ack_timeout(cm_id_priv->target_ack_delay,
cm_id_priv->alt_av.timeout - 1);
- /* todo: handle peer_to_peer */
-
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
ret = atomic_inc_and_test(&cm_id_priv->work_count);
if (!ret)
diff --git a/include/rdma/ib_cm.h b/include/rdma/ib_cm.h
index 8ec482e391aa..058cfbc2b37f 100644
--- a/include/rdma/ib_cm.h
+++ b/include/rdma/ib_cm.h
@@ -360,7 +360,6 @@ struct ib_cm_req_param {
u32 starting_psn;
const void *private_data;
u8 private_data_len;
- u8 peer_to_peer;
u8 responder_resources;
u8 initiator_depth;
u8 remote_cm_response_timeout;
--
2.24.1
* [PATCH rdma-next 4/9] RDMA/uapi: Add ECE definitions to UCMA
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
` (2 preceding siblings ...)
2020-03-05 15:00 ` [PATCH rdma-next 3/9] RDMA/cm: Delete not implemented CM peer to peer communication Leon Romanovsky
@ 2020-03-05 15:01 ` Leon Romanovsky
2020-03-05 15:01 ` [PATCH rdma-next 5/9] RDMA/ucma: Extend ucma_connect to receive ECE parameters Leon Romanovsky
` (4 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:01 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang
From: Leon Romanovsky <leonro@mellanox.com>
ECE parameters are used to perform a handshake between CMID nodes in
order to enable extra connection setup that is supported by both nodes.
The data is provided through rdma_connect() on the client side and
rdma_get_events() on the server side.
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
include/uapi/rdma/rdma_user_cm.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index e42940a215a3..150b3f075f99 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -206,10 +206,16 @@ struct rdma_ucm_ud_param {
__u8 reserved[7];
};
+struct rdma_ucm_ece {
+ __u32 vendor_id;
+ __u32 attr_mod;
+};
+
struct rdma_ucm_connect {
struct rdma_ucm_conn_param conn_param;
__u32 id;
__u32 reserved;
+ struct rdma_ucm_ece ece;
};
struct rdma_ucm_listen {
@@ -287,6 +293,7 @@ struct rdma_ucm_event_resp {
struct rdma_ucm_ud_param ud;
} param;
__u32 reserved;
+ struct rdma_ucm_ece ece;
};
/* Option levels */
--
2.24.1
* [PATCH rdma-next 5/9] RDMA/ucma: Extend ucma_connect to receive ECE parameters
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
` (3 preceding siblings ...)
2020-03-05 15:01 ` [PATCH rdma-next 4/9] RDMA/uapi: Add ECE definitions to UCMA Leon Romanovsky
@ 2020-03-05 15:01 ` Leon Romanovsky
2020-03-05 15:01 ` [PATCH rdma-next 6/9] RDMA/ucma: Deliver ECE parameters through UCMA events Leon Romanovsky
` (3 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:01 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang
From: Leon Romanovsky <leonro@mellanox.com>
The active side of a CMID initiates the connection through librdmacm's
rdma_connect() and the kernel's ucma_connect(). Extend the UCMA
interface so that ucma_connect() receives the new ECE parameters as
well.
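From user space the extended command could be issued roughly as below.
This is only a sketch of what librdmacm might do (error handling and the
library plumbing are omitted; demo_connect_ece() and its parameters are
illustrative, while the ABI structures come from rdma_user_cm.h):

#include <stdint.h>
#include <unistd.h>
#include <rdma/rdma_user_cm.h>

/*
 * Sketch: issue RDMA_USER_CM_CMD_CONNECT with the ECE values appended.
 * 'fd' is the open rdma_cm device file, 'id' the kernel context id.
 */
static int demo_connect_ece(int fd, uint32_t id,
                            const struct rdma_ucm_conn_param *conn,
                            uint32_t vendor_id, uint32_t attr_mod)
{
        struct {
                struct rdma_ucm_cmd_hdr hdr;
                struct rdma_ucm_connect cmd;
        } msg = {};

        msg.hdr.cmd = RDMA_USER_CM_CMD_CONNECT;
        msg.hdr.in = sizeof(msg.cmd);   /* old binaries send a shorter 'in' */
        msg.cmd.id = id;
        msg.cmd.conn_param = *conn;
        msg.cmd.ece.vendor_id = vendor_id;
        msg.cmd.ece.attr_mod = attr_mod;

        return write(fd, &msg, sizeof(msg)) == (ssize_t)sizeof(msg) ? 0 : -1;
}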
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
drivers/infiniband/core/cma.c | 13 +++++++++++++
drivers/infiniband/core/cma_priv.h | 1 +
drivers/infiniband/core/ucma.c | 14 +++++++++++---
include/rdma/rdma_cm.h | 11 +++++++++++
4 files changed, 36 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 5165158a7aaa..b16f74c7be10 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -4000,6 +4000,19 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
}
EXPORT_SYMBOL(rdma_connect);
+int rdma_connect_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
+ struct rdma_ucm_ece *ece)
+{
+ struct rdma_id_private *id_priv =
+ container_of(id, struct rdma_id_private, id);
+
+ id_priv->ece.vendor_id = ece->vendor_id;
+ id_priv->ece.attr_mod = ece->attr_mod;
+
+ return rdma_connect(id, conn_param);
+}
+EXPORT_SYMBOL(rdma_connect_ece);
+
static int cma_accept_ib(struct rdma_id_private *id_priv,
struct rdma_conn_param *conn_param)
{
diff --git a/drivers/infiniband/core/cma_priv.h b/drivers/infiniband/core/cma_priv.h
index 5edcf44a9307..caece96ebcf5 100644
--- a/drivers/infiniband/core/cma_priv.h
+++ b/drivers/infiniband/core/cma_priv.h
@@ -95,6 +95,7 @@ struct rdma_id_private {
* Internal to RDMA/core, don't use in the drivers
*/
struct rdma_restrack_entry res;
+ struct rdma_ucm_ece ece;
};
#if IS_ENABLED(CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS)
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 16b6cf57fa85..c06394350fed 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1070,12 +1070,15 @@ static void ucma_copy_conn_param(struct rdma_cm_id *id,
static ssize_t ucma_connect(struct ucma_file *file, const char __user *inbuf,
int in_len, int out_len)
{
- struct rdma_ucm_connect cmd;
struct rdma_conn_param conn_param;
+ struct rdma_ucm_ece ece = {};
+ struct rdma_ucm_connect cmd;
struct ucma_context *ctx;
+ size_t in_size;
int ret;
- if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+ in_size = min_t(size_t, in_len, sizeof(cmd));
+ if (copy_from_user(&cmd, inbuf, in_size))
return -EFAULT;
if (!cmd.conn_param.valid)
@@ -1086,8 +1089,13 @@ static ssize_t ucma_connect(struct ucma_file *file, const char __user *inbuf,
return PTR_ERR(ctx);
ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
+ if (field_avail(cmd, ece, in_size)) {
+ ece.vendor_id = cmd.ece.vendor_id;
+ ece.attr_mod = cmd.ece.attr_mod;
+ }
+
mutex_lock(&ctx->mutex);
- ret = rdma_connect(ctx->cm_id, &conn_param);
+ ret = rdma_connect_ece(ctx->cm_id, &conn_param, &ece);
mutex_unlock(&ctx->mutex);
ucma_put_ctx(ctx);
return ret;
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 71f48cfdc24c..86a849214c84 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -264,6 +264,17 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
*/
int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param);
+/**
+ * rdma_connect_ece - Initiate an active connection request with ECE data.
+ * @id: Connection identifier to connect.
+ * @conn_param: Connection information used for connected QPs.
+ * @ece: ECE parameters
+ *
+ * See rdma_connect() explanation.
+ */
+int rdma_connect_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
+ struct rdma_ucm_ece *ece);
+
/**
* rdma_listen - This function is called by the passive side to
* listen for incoming connection requests.
--
2.24.1
* [PATCH rdma-next 6/9] RDMA/ucma: Deliver ECE parameters through UCMA events
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
` (4 preceding siblings ...)
2020-03-05 15:01 ` [PATCH rdma-next 5/9] RDMA/ucma: Extend ucma_connect to receive ECE parameters Leon Romanovsky
@ 2020-03-05 15:01 ` Leon Romanovsky
2020-03-05 15:01 ` [PATCH rdma-next 7/9] RDMA/cm: Send and receive ECE parameter over the wire Leon Romanovsky
` (2 subsequent siblings)
8 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:01 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang
From: Leon Romanovsky <leonro@mellanox.com>
The passive side of a CMID connection receives the ECE request through
the REQ message and needs to respond with the relevant REP message,
which is then forwarded to the active side.
The UCMA events interface is responsible for this communication with
user space (librdmacm). Extend it to deliver the ECE wire data.
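Once the event has been fetched (the GET_EVENT plumbing itself is
unchanged), the new values are simply read out of the response. A small
sketch, assuming 'resp' is the rdma_ucm_event_resp the kernel copied
back for an RDMA_CM_EVENT_CONNECT_REQUEST event (demo_read_peer_ece()
is an illustrative name):

#include <stdint.h>
#include <rdma/rdma_user_cm.h>

static void demo_read_peer_ece(const struct rdma_ucm_event_resp *resp,
                               uint32_t *peer_vendor_id,
                               uint32_t *peer_attr_mod)
{
        /* ECE values taken from the peer's REQ; the reply chosen by the
         * application is passed back later through the accept command.
         */
        *peer_vendor_id = resp->ece.vendor_id;
        *peer_attr_mod = resp->ece.attr_mod;
}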
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
drivers/infiniband/core/ucma.c | 6 +++++-
include/rdma/rdma_cm.h | 1 +
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index c06394350fed..d2aeb23e6a3c 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -360,6 +360,9 @@ static int ucma_event_handler(struct rdma_cm_id *cm_id,
ucma_copy_conn_event(&uevent->resp.param.conn,
&event->param.conn);
+ uevent->resp.ece.vendor_id = event->ece.vendor_id;
+ uevent->resp.ece.attr_mod = event->ece.attr_mod;
+
if (event->event == RDMA_CM_EVENT_CONNECT_REQUEST) {
if (!ctx->backlog) {
ret = -ENOMEM;
@@ -404,7 +407,8 @@ static ssize_t ucma_get_event(struct ucma_file *file, const char __user *inbuf,
* Old 32 bit user space does not send the 4 byte padding in the
* reserved field. We don't care, allow it to keep working.
*/
- if (out_len < sizeof(uevent->resp) - sizeof(uevent->resp.reserved))
+ if (out_len < sizeof(uevent->resp) - sizeof(uevent->resp.reserved) -
+ sizeof(uevent->resp.ece))
return -ENOSPC;
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 86a849214c84..761168c41848 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -111,6 +111,7 @@ struct rdma_cm_event {
struct rdma_conn_param conn;
struct rdma_ud_param ud;
} param;
+ struct rdma_ucm_ece ece;
};
struct rdma_cm_id;
--
2.24.1
* [PATCH rdma-next 7/9] RDMA/cm: Send and receive ECE parameter over the wire
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
` (5 preceding siblings ...)
2020-03-05 15:01 ` [PATCH rdma-next 6/9] RDMA/ucma: Deliver ECE parameters through UCMA events Leon Romanovsky
@ 2020-03-05 15:01 ` Leon Romanovsky
2020-03-05 15:01 ` [PATCH rdma-next 8/9] RDMA/cma: Connect ECE to rdma_accept Leon Romanovsky
2020-03-05 15:01 ` [PATCH rdma-next 9/9] RDMA/cma: Provide ECE reject reason Leon Romanovsky
8 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:01 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang
From: Leon Romanovsky <leonro@mellanox.com>
ECE parameters are exchanged through the REQ->REP/SIDR_REP messages;
this patch adds the data needed to provide them to the other side of
the CMID communication channel.
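Because the REP message only has spare byte-sized holes for the 24-bit
VendorID, the value is split across three fields and reassembled around
the IBA_SET()/IBA_GET() calls. A standalone sketch of the same byte
arithmetic used in the diff below (helper names are illustrative):

#include <linux/types.h>

/* Split a 24-bit vendor ID into the three REP byte fields... */
static void demo_vendor_id_split(u32 vendor_id, u8 *hi, u8 *mid, u8 *lo)
{
        *lo  = vendor_id & 0xFF;
        *mid = (vendor_id >> 8) & 0xFF;
        *hi  = (vendor_id >> 16) & 0xFF;
}

/* ...and put it back together on the receive side */
static u32 demo_vendor_id_join(u8 hi, u8 mid, u8 lo)
{
        return ((u32)hi << 16) | ((u32)mid << 8) | lo;
}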
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
drivers/infiniband/core/cm.c | 41 +++++++++++++++++++++++++++++-----
drivers/infiniband/core/cma.c | 8 +++++++
drivers/infiniband/core/ucma.c | 1 -
include/rdma/ib_cm.h | 9 +++++++-
4 files changed, 52 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 77190704e81b..9d226f228606 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -66,6 +66,8 @@ static const char * const ibcm_rej_reason_strs[] = {
[IB_CM_REJ_INVALID_CLASS_VERSION] = "invalid class version",
[IB_CM_REJ_INVALID_FLOW_LABEL] = "invalid flow label",
[IB_CM_REJ_INVALID_ALT_FLOW_LABEL] = "invalid alt flow label",
+ [IB_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED] =
+ "vendor option is not supported",
};
const char *__attribute_const__ ibcm_reject_msg(int reason)
@@ -276,6 +278,8 @@ struct cm_id_private {
struct list_head work_list;
atomic_t work_count;
+
+ struct rdma_ucm_ece ece;
};
static void cm_work_handler(struct work_struct *work);
@@ -1235,6 +1239,13 @@ static void cm_format_mad_hdr(struct ib_mad_hdr *hdr,
hdr->tid = tid;
}
+static void cm_format_mad_ece_hdr(struct ib_mad_hdr *hdr, __be16 attr_id,
+ __be64 tid, u32 attr_mod)
+{
+ cm_format_mad_hdr(hdr, attr_id, tid);
+ hdr->attr_mod = cpu_to_be32(attr_mod);
+}
+
static void cm_format_req(struct cm_req_msg *req_msg,
struct cm_id_private *cm_id_priv,
struct ib_cm_req_param *param)
@@ -1247,8 +1258,8 @@ static void cm_format_req(struct cm_req_msg *req_msg,
pri_ext = opa_is_extended_lid(pri_path->opa.dlid,
pri_path->opa.slid);
- cm_format_mad_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
- cm_form_tid(cm_id_priv));
+ cm_format_mad_ece_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
+ cm_form_tid(cm_id_priv), param->ece.attr_mod);
IBA_SET(CM_REQ_LOCAL_COMM_ID, req_msg,
be32_to_cpu(cm_id_priv->id.local_id));
@@ -1371,6 +1382,7 @@ static void cm_format_req(struct cm_req_msg *req_msg,
cm_ack_timeout(cm_id_priv->av.port->cm_dev->ack_delay,
alt_path->packet_life_time));
}
+ IBA_SET(CM_REQ_VENDOR_ID, req_msg, param->ece.vendor_id);
if (param->private_data && param->private_data_len)
IBA_SET_MEM(CM_REQ_PRIVATE_DATA, req_msg, param->private_data,
@@ -1728,6 +1740,9 @@ static void cm_format_req_event(struct cm_work *work,
param->rnr_retry_count = IBA_GET(CM_REQ_RNR_RETRY_COUNT, req_msg);
param->srq = IBA_GET(CM_REQ_SRQ, req_msg);
param->ppath_sgid_attr = cm_id_priv->av.ah_attr.grh.sgid_attr;
+ param->ece.vendor_id = IBA_GET(CM_REQ_VENDOR_ID, req_msg);
+ param->ece.attr_mod = be32_to_cpu(req_msg->hdr.attr_mod);
+
work->cm_event.private_data =
IBA_GET_MEM_PTR(CM_REQ_PRIVATE_DATA, req_msg);
}
@@ -2107,7 +2122,8 @@ static void cm_format_rep(struct cm_rep_msg *rep_msg,
struct cm_id_private *cm_id_priv,
struct ib_cm_rep_param *param)
{
- cm_format_mad_hdr(&rep_msg->hdr, CM_REP_ATTR_ID, cm_id_priv->tid);
+ cm_format_mad_ece_hdr(&rep_msg->hdr, CM_REP_ATTR_ID, cm_id_priv->tid,
+ param->ece.attr_mod);
IBA_SET(CM_REP_LOCAL_COMM_ID, rep_msg,
be32_to_cpu(cm_id_priv->id.local_id));
IBA_SET(CM_REP_REMOTE_COMM_ID, rep_msg,
@@ -2134,6 +2150,12 @@ static void cm_format_rep(struct cm_rep_msg *rep_msg,
IBA_SET(CM_REP_LOCAL_EE_CONTEXT_NUMBER, rep_msg, param->qp_num);
}
+ IBA_SET(CM_REP_VENDOR_ID_L, rep_msg, param->ece.vendor_id & 0xFF);
+ IBA_SET(CM_REP_VENDOR_ID_M, rep_msg,
+ (param->ece.vendor_id >> 8) & 0xFF);
+ IBA_SET(CM_REP_VENDOR_ID_H, rep_msg,
+ (param->ece.vendor_id >> 16) & 0xFF);
+
if (param->private_data && param->private_data_len)
IBA_SET_MEM(CM_REP_PRIVATE_DATA, rep_msg, param->private_data,
param->private_data_len);
@@ -2281,6 +2303,11 @@ static void cm_format_rep_event(struct cm_work *work, enum ib_qp_type qp_type)
param->flow_control = IBA_GET(CM_REP_END_TO_END_FLOW_CONTROL, rep_msg);
param->rnr_retry_count = IBA_GET(CM_REP_RNR_RETRY_COUNT, rep_msg);
param->srq = IBA_GET(CM_REP_SRQ, rep_msg);
+ param->ece.vendor_id = IBA_GET(CM_REP_VENDOR_ID_H, rep_msg) << 16;
+ param->ece.vendor_id |= IBA_GET(CM_REP_VENDOR_ID_M, rep_msg) << 8;
+ param->ece.vendor_id |= IBA_GET(CM_REP_VENDOR_ID_L, rep_msg);
+ param->ece.attr_mod = be32_to_cpu(rep_msg->hdr.attr_mod);
+
work->cm_event.private_data =
IBA_GET_MEM_PTR(CM_REP_PRIVATE_DATA, rep_msg);
}
@@ -3565,8 +3592,8 @@ static void cm_format_sidr_rep(struct cm_sidr_rep_msg *sidr_rep_msg,
struct cm_id_private *cm_id_priv,
struct ib_cm_sidr_rep_param *param)
{
- cm_format_mad_hdr(&sidr_rep_msg->hdr, CM_SIDR_REP_ATTR_ID,
- cm_id_priv->tid);
+ cm_format_mad_ece_hdr(&sidr_rep_msg->hdr, CM_SIDR_REP_ATTR_ID,
+ cm_id_priv->tid, param->ece.attr_mod);
IBA_SET(CM_SIDR_REP_REQUESTID, sidr_rep_msg,
be32_to_cpu(cm_id_priv->id.remote_id));
IBA_SET(CM_SIDR_REP_STATUS, sidr_rep_msg, param->status);
@@ -3574,6 +3601,10 @@ static void cm_format_sidr_rep(struct cm_sidr_rep_msg *sidr_rep_msg,
IBA_SET(CM_SIDR_REP_SERVICEID, sidr_rep_msg,
be64_to_cpu(cm_id_priv->id.service_id));
IBA_SET(CM_SIDR_REP_Q_KEY, sidr_rep_msg, param->qkey);
+ IBA_SET(CM_SIDR_REP_VENDOR_ID_L, sidr_rep_msg,
+ param->ece.vendor_id & 0xFF);
+ IBA_SET(CM_SIDR_REP_VENDOR_ID_H, sidr_rep_msg,
+ (param->ece.vendor_id >> 8) & 0xFF);
if (param->info && param->info_length)
IBA_SET_MEM(CM_SIDR_REP_ADDITIONAL_INFORMATION, sidr_rep_msg,
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index b16f74c7be10..e0c444f9f28d 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1906,6 +1906,9 @@ static void cma_set_rep_event_data(struct rdma_cm_event *event,
event->param.conn.rnr_retry_count = rep_data->rnr_retry_count;
event->param.conn.srq = rep_data->srq;
event->param.conn.qp_num = rep_data->remote_qpn;
+
+ event->ece.vendor_id = rep_data->ece.vendor_id;
+ event->ece.attr_mod = rep_data->ece.attr_mod;
}
static int cma_cm_event_handler(struct rdma_id_private *id_priv,
@@ -2124,6 +2127,9 @@ static void cma_set_req_event_data(struct rdma_cm_event *event,
event->param.conn.rnr_retry_count = req_data->rnr_retry_count;
event->param.conn.srq = req_data->srq;
event->param.conn.qp_num = req_data->remote_qpn;
+
+ event->ece.vendor_id = req_data->ece.vendor_id;
+ event->ece.attr_mod = req_data->ece.attr_mod;
}
static int cma_ib_check_req_qp_type(const struct rdma_cm_id *id,
@@ -3911,6 +3917,8 @@ static int cma_connect_ib(struct rdma_id_private *id_priv,
req.local_cm_response_timeout = CMA_CM_RESPONSE_TIMEOUT;
req.max_cm_retries = CMA_MAX_CM_RETRIES;
req.srq = id_priv->srq ? 1 : 0;
+ req.ece.vendor_id = id_priv->ece.vendor_id;
+ req.ece.attr_mod = id_priv->ece.attr_mod;
trace_cm_send_req(id_priv);
ret = ib_send_cm_req(id_priv->cm_id.ib, &req);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index d2aeb23e6a3c..3a54fd9d941f 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -362,7 +362,6 @@ static int ucma_event_handler(struct rdma_cm_id *cm_id,
uevent->resp.ece.vendor_id = event->ece.vendor_id;
uevent->resp.ece.attr_mod = event->ece.attr_mod;
-
if (event->event == RDMA_CM_EVENT_CONNECT_REQUEST) {
if (!ctx->backlog) {
ret = -ENOMEM;
diff --git a/include/rdma/ib_cm.h b/include/rdma/ib_cm.h
index 058cfbc2b37f..0f1ea5f2d01c 100644
--- a/include/rdma/ib_cm.h
+++ b/include/rdma/ib_cm.h
@@ -11,6 +11,7 @@
#include <rdma/ib_mad.h>
#include <rdma/ib_sa.h>
+#include <rdma/rdma_cm.h>
/* ib_cm and ib_user_cm modules share /sys/class/infiniband_cm */
extern struct class cm_class;
@@ -115,6 +116,7 @@ struct ib_cm_req_event_param {
unsigned int retry_count:3;
unsigned int rnr_retry_count:3;
unsigned int srq:1;
+ struct rdma_ucm_ece ece;
};
struct ib_cm_rep_event_param {
@@ -129,6 +131,7 @@ struct ib_cm_rep_event_param {
unsigned int flow_control:1;
unsigned int rnr_retry_count:3;
unsigned int srq:1;
+ struct rdma_ucm_ece ece;
};
enum ib_cm_rej_reason {
@@ -164,7 +167,8 @@ enum ib_cm_rej_reason {
IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID = 30,
IB_CM_REJ_INVALID_CLASS_VERSION = 31,
IB_CM_REJ_INVALID_FLOW_LABEL = 32,
- IB_CM_REJ_INVALID_ALT_FLOW_LABEL = 33
+ IB_CM_REJ_INVALID_ALT_FLOW_LABEL = 33,
+ IB_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED = 35,
};
struct ib_cm_rej_event_param {
@@ -369,6 +373,7 @@ struct ib_cm_req_param {
u8 rnr_retry_count;
u8 max_cm_retries;
u8 srq;
+ struct rdma_ucm_ece ece;
};
/**
@@ -392,6 +397,7 @@ struct ib_cm_rep_param {
u8 flow_control;
u8 rnr_retry_count;
u8 srq;
+ struct rdma_ucm_ece ece;
};
/**
@@ -546,6 +552,7 @@ struct ib_cm_sidr_rep_param {
u8 info_length;
const void *private_data;
u8 private_data_len;
+ struct rdma_ucm_ece ece;
};
/**
--
2.24.1
* [PATCH rdma-next 8/9] RDMA/cma: Connect ECE to rdma_accept
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
` (6 preceding siblings ...)
2020-03-05 15:01 ` [PATCH rdma-next 7/9] RDMA/cm: Send and receive ECE parameter over the wire Leon Romanovsky
@ 2020-03-05 15:01 ` Leon Romanovsky
2020-03-05 15:01 ` [PATCH rdma-next 9/9] RDMA/cma: Provide ECE reject reason Leon Romanovsky
8 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:01 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang
From: Leon Romanovsky <leonro@mellanox.com>
rdma_accept() is called by both the passive and the active side of a
CMID connection to mark readiness to start data transfer. The passive
side calls it explicitly; on the active side it is called implicitly
while receiving the REP message.
Provide the ECE data to the rdma_accept() path, which the passive side
needs in order to send that REP message.
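The intended passive-side flow looks roughly like the sketch below
(demo_accept_with_ece() and its arguments are illustrative; only
__rdma_accept_ece() and struct rdma_ucm_ece come from this series):

#include <rdma/rdma_cm.h>

/*
 * Sketch: the passive side decides on its ECE reply (typically computed
 * in user space from the values seen in the REQ) and passes it when
 * accepting, so it ends up in the REP built by cma_accept_ib().
 */
static int demo_accept_with_ece(struct rdma_cm_id *id,
                                struct rdma_conn_param *conn_param,
                                u32 vendor_id, u32 attr_mod)
{
        struct rdma_ucm_ece ece = {
                .vendor_id = vendor_id,
                .attr_mod = attr_mod,
        };

        return __rdma_accept_ece(id, conn_param, NULL, &ece);
}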
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
drivers/infiniband/core/cma.c | 19 +++++++++++++++++++
drivers/infiniband/core/ucma.c | 14 +++++++++++---
include/rdma/rdma_cm.h | 3 +++
include/uapi/rdma/rdma_user_cm.h | 1 +
4 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index e0c444f9f28d..f1f0d51667b7 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -4046,6 +4046,8 @@ static int cma_accept_ib(struct rdma_id_private *id_priv,
rep.flow_control = conn_param->flow_control;
rep.rnr_retry_count = min_t(u8, 7, conn_param->rnr_retry_count);
rep.srq = id_priv->srq ? 1 : 0;
+ rep.ece.vendor_id = id_priv->ece.vendor_id;
+ rep.ece.attr_mod = id_priv->ece.attr_mod;
trace_cm_send_rep(id_priv);
ret = ib_send_cm_rep(id_priv->cm_id.ib, &rep);
@@ -4093,7 +4095,11 @@ static int cma_send_sidr_rep(struct rdma_id_private *id_priv,
return ret;
rep.qp_num = id_priv->qp_num;
rep.qkey = id_priv->qkey;
+
+ rep.ece.vendor_id = id_priv->ece.vendor_id;
+ rep.ece.attr_mod = id_priv->ece.attr_mod;
}
+
rep.private_data = private_data;
rep.private_data_len = private_data_len;
@@ -4151,6 +4157,19 @@ int __rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
}
EXPORT_SYMBOL(__rdma_accept);
+int __rdma_accept_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
+ const char *caller, struct rdma_ucm_ece *ece)
+{
+ struct rdma_id_private *id_priv =
+ container_of(id, struct rdma_id_private, id);
+
+ id_priv->ece.vendor_id = ece->vendor_id;
+ id_priv->ece.attr_mod = ece->attr_mod;
+
+ return __rdma_accept(id, conn_param, caller);
+}
+EXPORT_SYMBOL(__rdma_accept_ece);
+
int rdma_notify(struct rdma_cm_id *id, enum ib_event_type event)
{
struct rdma_id_private *id_priv;
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 3a54fd9d941f..135453f75733 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1132,28 +1132,36 @@ static ssize_t ucma_accept(struct ucma_file *file, const char __user *inbuf,
{
struct rdma_ucm_accept cmd;
struct rdma_conn_param conn_param;
+ struct rdma_ucm_ece ece = {};
struct ucma_context *ctx;
+ size_t in_size;
int ret;
- if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
+ in_size = min_t(size_t, in_len, sizeof(cmd));
+ if (copy_from_user(&cmd, inbuf, in_size))
return -EFAULT;
ctx = ucma_get_ctx_dev(file, cmd.id);
if (IS_ERR(ctx))
return PTR_ERR(ctx);
+ if (field_avail(cmd, ece, in_size)) {
+ ece.vendor_id = cmd.ece.vendor_id;
+ ece.attr_mod = cmd.ece.attr_mod;
+ }
+
if (cmd.conn_param.valid) {
ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
mutex_lock(&file->mut);
mutex_lock(&ctx->mutex);
- ret = __rdma_accept(ctx->cm_id, &conn_param, NULL);
+ ret = __rdma_accept_ece(ctx->cm_id, &conn_param, NULL, &ece);
mutex_unlock(&ctx->mutex);
if (!ret)
ctx->uid = cmd.uid;
mutex_unlock(&file->mut);
} else {
mutex_lock(&ctx->mutex);
- ret = __rdma_accept(ctx->cm_id, NULL, NULL);
+ ret = __rdma_accept_ece(ctx->cm_id, NULL, NULL, &ece);
mutex_unlock(&ctx->mutex);
}
ucma_put_ctx(ctx);
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 761168c41848..8d961d8b7cdb 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -288,6 +288,9 @@ int rdma_listen(struct rdma_cm_id *id, int backlog);
int __rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
const char *caller);
+int __rdma_accept_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
+ const char *caller, struct rdma_ucm_ece *ece);
+
/**
* rdma_accept - Called to accept a connection request or response.
* @id: Connection identifier associated with the request.
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index 150b3f075f99..c4ca1412bcf9 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -228,6 +228,7 @@ struct rdma_ucm_accept {
struct rdma_ucm_conn_param conn_param;
__u32 id;
__u32 reserved;
+ struct rdma_ucm_ece ece;
};
struct rdma_ucm_reject {
--
2.24.1
* [PATCH rdma-next 9/9] RDMA/cma: Provide ECE reject reason
2020-03-05 15:00 [PATCH rdma-next 0/9] Add Enhanced Connection Established (ECE) Leon Romanovsky
` (7 preceding siblings ...)
2020-03-05 15:01 ` [PATCH rdma-next 8/9] RDMA/cma: Connect ECE to rdma_accept Leon Romanovsky
@ 2020-03-05 15:01 ` Leon Romanovsky
8 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-05 15:01 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Mark Zhang
From: Leon Romanovsky <leonro@mellanox.com>
IBTA defines a "vendor option not supported" reject reason for REJ
messages, to be used when the passive side does not want to accept the
proposed ECE options.
Because ECE is managed from user space, users need a way to provide
this reject reason.
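Once user space has decided that the proposed vendor options cannot be
honoured, the rejection could look roughly like this (a sketch that
mirrors what the ucma_reject() change below does; the helper name is
illustrative):

#include <rdma/rdma_cm.h>

/* Sketch: reject an ECE proposal with the IBTA-defined reason instead
 * of the default "consumer defined" one.
 */
static int demo_reject_ece_proposal(struct rdma_cm_id *id)
{
        return rdma_reject_ece(id, NULL, 0,
                               RDMA_USER_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED);
}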
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
drivers/infiniband/core/cma.c | 14 ++++++++------
drivers/infiniband/core/ucma.c | 7 ++++++-
include/rdma/ib_cm.h | 3 ++-
include/rdma/rdma_cm.h | 13 ++++++++++---
include/uapi/rdma/rdma_user_cm.h | 7 ++++++-
5 files changed, 32 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index f1f0d51667b7..0b57c15139cf 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -4191,8 +4191,8 @@ int rdma_notify(struct rdma_cm_id *id, enum ib_event_type event)
}
EXPORT_SYMBOL(rdma_notify);
-int rdma_reject(struct rdma_cm_id *id, const void *private_data,
- u8 private_data_len)
+int rdma_reject_ece(struct rdma_cm_id *id, const void *private_data,
+ u8 private_data_len, enum rdma_ucm_reject_reason reason)
{
struct rdma_id_private *id_priv;
int ret;
@@ -4206,10 +4206,12 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
private_data, private_data_len);
} else {
+ enum ib_cm_rej_reason r =
+ (reason) ?: IB_CM_REJ_CONSUMER_DEFINED;
+
trace_cm_send_rej(id_priv);
- ret = ib_send_cm_rej(id_priv->cm_id.ib,
- IB_CM_REJ_CONSUMER_DEFINED, NULL,
- 0, private_data, private_data_len);
+ ret = ib_send_cm_rej(id_priv->cm_id.ib, r, NULL, 0,
+ private_data, private_data_len);
}
} else if (rdma_cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_reject(id_priv->cm_id.iw,
@@ -4219,7 +4221,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
return ret;
}
-EXPORT_SYMBOL(rdma_reject);
+EXPORT_SYMBOL(rdma_reject_ece);
int rdma_disconnect(struct rdma_cm_id *id)
{
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 135453f75733..1b5fd3020bcb 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1178,12 +1178,17 @@ static ssize_t ucma_reject(struct ucma_file *file, const char __user *inbuf,
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
return -EFAULT;
+ if (cmd.reason &&
+ cmd.reason != RDMA_USER_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED)
+ return -EINVAL;
+
ctx = ucma_get_ctx_dev(file, cmd.id);
if (IS_ERR(ctx))
return PTR_ERR(ctx);
mutex_lock(&ctx->mutex);
- ret = rdma_reject(ctx->cm_id, cmd.private_data, cmd.private_data_len);
+ ret = rdma_reject_ece(ctx->cm_id, cmd.private_data,
+ cmd.private_data_len, cmd.reason);
mutex_unlock(&ctx->mutex);
ucma_put_ctx(ctx);
return ret;
diff --git a/include/rdma/ib_cm.h b/include/rdma/ib_cm.h
index 0f1ea5f2d01c..ed328a99ed0a 100644
--- a/include/rdma/ib_cm.h
+++ b/include/rdma/ib_cm.h
@@ -168,7 +168,8 @@ enum ib_cm_rej_reason {
IB_CM_REJ_INVALID_CLASS_VERSION = 31,
IB_CM_REJ_INVALID_FLOW_LABEL = 32,
IB_CM_REJ_INVALID_ALT_FLOW_LABEL = 33,
- IB_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED = 35,
+ IB_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED =
+ RDMA_USER_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED,
};
struct ib_cm_rej_event_param {
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 8d961d8b7cdb..56d85d30e55d 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -324,11 +324,18 @@ int __rdma_accept_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
*/
int rdma_notify(struct rdma_cm_id *id, enum ib_event_type event);
+
/**
- * rdma_reject - Called to reject a connection request or response.
+ * rdma_reject_ece - Called to reject a connection request or response.
*/
-int rdma_reject(struct rdma_cm_id *id, const void *private_data,
- u8 private_data_len);
+int rdma_reject_ece(struct rdma_cm_id *id, const void *private_data,
+ u8 private_data_len, enum rdma_ucm_reject_reason reason);
+
+static inline int rdma_reject(struct rdma_cm_id *id, const void *private_data,
+ u8 private_data_len)
+{
+ return rdma_reject_ece(id, private_data, private_data_len, 0);
+}
/**
* rdma_disconnect - This function disconnects the associated QP and
diff --git a/include/uapi/rdma/rdma_user_cm.h b/include/uapi/rdma/rdma_user_cm.h
index c4ca1412bcf9..e545f2de1e13 100644
--- a/include/uapi/rdma/rdma_user_cm.h
+++ b/include/uapi/rdma/rdma_user_cm.h
@@ -78,6 +78,10 @@ enum rdma_ucm_port_space {
RDMA_PS_UDP = 0x0111,
};
+enum rdma_ucm_reject_reason {
+ RDMA_USER_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED = 35
+};
+
/*
* command ABI structures.
*/
@@ -234,7 +238,8 @@ struct rdma_ucm_accept {
struct rdma_ucm_reject {
__u32 id;
__u8 private_data_len;
- __u8 reserved[3];
+ __u8 reason; /* enum rdma_ucm_reject_reason */
+ __u8 reserved[2];
__u8 private_data[RDMA_MAX_PRIVATE_DATA];
};
--
2.24.1
* Re: [PATCH rdma-next 2/9] RDMA: Promote field_avail() macro to be general code
2020-03-05 15:00 ` [PATCH rdma-next 2/9] RDMA: Promote field_avail() macro to be general code Leon Romanovsky
@ 2020-03-05 15:18 ` Jason Gunthorpe
2020-03-08 7:18 ` Leon Romanovsky
0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2020-03-05 15:18 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Doug Ledford, Leon Romanovsky, Gal Pressman, linux-rdma,
Mark Zhang, Yishai Hadas
On Thu, Mar 05, 2020 at 05:00:58PM +0200, Leon Romanovsky wrote:
>
> +/**
> + * FIELD_SIZEOF - get the size of a struct's field
> + * @t: the target struct
> + * @f: the target struct's field
> + * Return: the size of @f in the struct definition without having a
> + * declared instance of @t.
> + */
> +#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
This is already sizeof_field
> +/**
> + * field_avail - check if specific field exists in provided data
> + * @x: the source data, usually struct received from the user
> + * @fld: field name
> + * @sz: size of the data
> + */
> +#define field_avail(x, fld, sz) \
> + (offsetof(typeof(x), fld) + FIELD_SIZEOF(typeof(x), fld) <= (sz))
This is just offsetofend, I'm not sure there is such a reason to even
have this field_avail macro really..
Jason
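For reference, the equivalence being pointed out can be written with the
existing helpers from <linux/stddef.h> as a one-line sketch
(field_avail_equiv() is just an illustrative name):

/* field_avail(x, fld, sz) is the same check as: */
#define field_avail_equiv(x, fld, sz) \
        (offsetofend(typeof(x), fld) <= (sz))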
* Re: [PATCH rdma-next 2/9] RDMA: Promote field_avail() macro to be general code
2020-03-05 15:18 ` Jason Gunthorpe
@ 2020-03-08 7:18 ` Leon Romanovsky
0 siblings, 0 replies; 12+ messages in thread
From: Leon Romanovsky @ 2020-03-08 7:18 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Doug Ledford, Gal Pressman, linux-rdma, Mark Zhang, Yishai Hadas
On Thu, Mar 05, 2020 at 11:18:50AM -0400, Jason Gunthorpe wrote:
> On Thu, Mar 05, 2020 at 05:00:58PM +0200, Leon Romanovsky wrote:
> >
> > +/**
> > + * FIELD_SIZEOF - get the size of a struct's field
> > + * @t: the target struct
> > + * @f: the target struct's field
> > + * Return: the size of @f in the struct definition without having a
> > + * declared instance of @t.
> > + */
> > +#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
>
> This is already sizeof_field
Ohh, thanks
>
> > +/**
> > + * field_avail - check if specific field exists in provided data
> > + * @x: the source data, usually struct received from the user
> > + * @fld: field name
> > + * @sz: size of the data
> > + */
> > +#define field_avail(x, fld, sz) \
> > + (offsetof(typeof(x), fld) + FIELD_SIZEOF(typeof(x), fld) <= (sz))
>
> This is just offsetofend, I'm not sure there is such a reason to even
> have this field_avail macro really..
Even better.
Thanks
>
> Jason