* [PATCH V1 libibverbs 0/5] Add cross-channel support
@ 2016-01-16 15:53 Leon Romanovsky
[not found] ` <1452959624-29454-1-git-send-email-leon-2ukJVAZIZ/Y@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Leon Romanovsky @ 2016-01-16 15:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Leon Romanovsky
From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
The following set of patches adds cross-channel (CC) support
to libibverbs.
The cross-channel feature allows execution of WQEs that involve
synchronization of IO operations across different QPs.
Complex applications usually require synchronizations for IO
operations from multiple sources before continuing their
execution.
In order to implement this, the host software usually needs
to handle completions from each one of the receive queues
(arriving in arbitrary order), process the data after the last
message arrives and only then post a work request on the send
queue to send the combined data to its destination.
Executing such an operation generates multiple interrupts at
unpredictable times, with the overhead of interrupt handling
and context switches.
This patchset adds synchronization primitives which make it
possible to perform conditional operations; a following
submission will introduce arithmetic calculation offload.
Synchronization abilities combined with arithmetic calculations
will allow us to program complex scenarios with a single function
call, thereby significantly reducing the overhead associated with IO
processing.
Patch #1 adds CQ ignore overrun flag.
Patch #2 implements QP flags to configure master/slave properties
of queue.
Patch #3 expands work request structure to have cross-channel
specific primitives.
Patch #4 exports cross-channel device capability flag.
Patch #5 provides an example of cross-channel synchronizations.
These patches were added on top of "Completion timestamping" [1]
and "Expose QP block self multicast loopback creation flag" [2]
series.
[1] https://www.mail-archive.com/linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg28895.html
[2] http://comments.gmane.org/gmane.linux.drivers.rdma/30158
Changes from v0:
* Enrich cover message and commit messages.
* Add ibv_cc_pingpong example.
* Add man page entries for all newly created flags.
Leon Romanovsky (5):
Add CQ ignore overrun flag
Add cross-channel QP flags
Add cross-channel work requests primitives
Export cross-channel capability flag
Add an example of cross-channel synchronization
Makefile.am | 6 +-
examples/.gitignore | 1 +
examples/cc_pingpong.c | 991 +++++++++++++++++++++++++++++++++++++++++++++
include/infiniband/verbs.h | 23 +-
man/ibv_cc_pingpong.1 | 66 +++
man/ibv_create_cq_ex.3 | 18 +-
man/ibv_create_qp_ex.3 | 18 +-
man/ibv_post_send.3 | 25 +-
src/cmd.c | 5 +-
9 files changed, 1141 insertions(+), 12 deletions(-)
create mode 100644 examples/cc_pingpong.c
create mode 100644 man/ibv_cc_pingpong.1
--
1.7.12.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH libibverbs V1 1/5] Add CQ ignore overrun flag
[not found] ` <1452959624-29454-1-git-send-email-leon-2ukJVAZIZ/Y@public.gmane.org>
@ 2016-01-16 15:53 ` Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 2/5] Add cross-channel QP flags Leon Romanovsky
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2016-01-16 15:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Leon Romanovsky
From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
A CQ overrun is checked while posting a completion, and if
encountered, the QP is transferred to the appropriate error
state.
CQ updates (and error discovery) are not synchronized with
WQE execution. In cross-channel mode, the send/receive
queues forward their completions to the managing QP.
The new IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN flag tells the CQ
to ignore overrun errors.
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
include/infiniband/verbs.h | 1 +
man/ibv_create_cq_ex.3 | 18 ++++++++++++++----
2 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index c3e863850d4e..466d779592bf 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -1207,6 +1207,7 @@ enum ibv_create_cq_attr {
enum ibv_create_cq_attr_flags {
IBV_CREATE_CQ_ATTR_COMPLETION_TIMESTAMP = 1 << 0,
+ IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN = 1 << 1
};
struct ibv_create_cq_attr_ex {
diff --git a/man/ibv_create_cq_ex.3 b/man/ibv_create_cq_ex.3
index 9f9e049b0d43..f01a5513926b 100644
--- a/man/ibv_create_cq_ex.3
+++ b/man/ibv_create_cq_ex.3
@@ -1,6 +1,6 @@
.\" -*- nroff -*-
.\"
-.TH IBV_CREATE_CQ_EX 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
+.TH IBV_CREATE_CQ_EX 3 2015-12-27 libibverbs "Libibverbs Programmer's Manual"
.SH "NAME"
ibv_create_cq_ex \- create a completion queue (CQ)
.SH "SYNOPSIS"
@@ -42,13 +42,19 @@ enum ibv_wc_flags_ex {
IBV_WC_EX_WITH_SLID = 1 << 7, /* Require slid in WC */
IBV_WC_EX_WITH_SL = 1 << 8, /* Require sl in WC */
IBV_WC_EX_WITH_DLID_PATH_BITS = 1 << 9, /* Require dlid path bits in WC */
- IBV_WC_EX_WITH_COMPLETION_TIMESTAMP = 1 << 10, /* Require completion timestamp in WC /*
+ IBV_WC_EX_WITH_COMPLETION_TIMESTAMP = 1 << 10, /* Require completion timestamp in WC */
};
+enum ibv_create_cq_attr_flags {
+ IBV_CREATE_CQ_ATTR_COMPLETION_TIMESTAMP = 1 << 0,
+ IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN = 1 << 1 /* Ignore completion queue overrun errors */
+};
enum ibv_create_cq_attr {
- IBV_CREATE_CQ_ATTR_FLAGS = 1 << 0,
+ IBV_CREATE_CQ_ATTR_FLAGS = 1 << 0,
+ IBV_CREATE_CQ_ATTR_RESERVED = 1 << 1
};
+
.SH "RETURN VALUE"
.B ibv_create_cq_ex()
returns a pointer to the CQ, or NULL if the request fails.
@@ -68,4 +74,8 @@ CQ should be destroyed with ibv_destroy_cq.
.BR ibv_create_qp (3)
.SH "AUTHORS"
.TP
-Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
+Matan Barak
+.RI < matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org >
+.TP
+Leon Romanovsky
+.RI < leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org >
--
1.7.12.4
* [PATCH libibverbs V1 2/5] Add cross-channel QP flags
[not found] ` <1452959624-29454-1-git-send-email-leon-2ukJVAZIZ/Y@public.gmane.org>
2016-01-16 15:53 ` [PATCH libibverbs V1 1/5] Add CQ ignore overrun flag Leon Romanovsky
@ 2016-01-16 15:53 ` Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 3/5] Add cross-channel work requests primitives Leon Romanovsky
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2016-01-16 15:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Leon Romanovsky
From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
The cross-channel feature allows execution of WQEs that involve
synchronization of IO operations across different QPs.
This capability makes it possible to program complex flows with a single
function call, thereby significantly reducing the overhead associated
with IO processing.
The queue pairs can be configured to work as a “sync master queue”
or as “sync slave queues”.
QP property flags that indicate whether a queue is a slave or a master:
* IBV_QP_CREATE_MANAGED_SEND indicates that posted send work requests
will not be executed immediately and require enabling.
* IBV_QP_CREATE_MANAGED_RECV indicates that posted receive work
requests will not be executed immediately and require enabling.
* IBV_QP_CREATE_CROSS_CHANNEL declares the QP to work in cross-channel
mode. If neither IBV_QP_CREATE_MANAGED_SEND nor IBV_QP_CREATE_MANAGED_RECV
is provided, this QP will be a sync master queue; otherwise it will be a
sync slave.
* IBV_QP_CREATE_IGNORE_SQ_OVERFLOW configures the QP to discard overflow
indications on the send queue.
* IBV_QP_CREATE_IGNORE_RQ_OVERFLOW configures the QP to discard overflow
indications on the receive queue.
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
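A minimal sketch of the master/slave rule described in the commit message, assuming a QP counts as a slave when either managed flag is set. The constants mirror the values this patch adds to enum ibv_qp_create_flags; the cc_role enum and the cc_role_from_flags() helper are hypothetical names used only for illustration and are not part of libibverbs. Treating a QP without the cross-channel flag as having no role is likewise an assumption.

```c
#include <stdint.h>

/* Local copies of the bit values this patch adds to
 * enum ibv_qp_create_flags (illustration only). */
enum {
	CC_QP_CREATE_CROSS_CHANNEL = 1 << 2,
	CC_QP_CREATE_MANAGED_SEND  = 1 << 3,
	CC_QP_CREATE_MANAGED_RECV  = 1 << 4,
};

/* Hypothetical classification, not a libibverbs type. */
enum cc_role {
	CC_ROLE_NONE,   /* QP is not in cross-channel mode */
	CC_ROLE_MASTER, /* sync master queue */
	CC_ROLE_SLAVE,  /* sync slave queue */
};

/* Derive the role from the create flags, following the commit message:
 * cross-channel without any managed flag is a master, cross-channel
 * with a managed flag is a slave. */
static enum cc_role cc_role_from_flags(uint32_t create_flags)
{
	if (!(create_flags & CC_QP_CREATE_CROSS_CHANNEL))
		return CC_ROLE_NONE; /* assumption: managed flags alone give no role */

	if (create_flags & (CC_QP_CREATE_MANAGED_SEND | CC_QP_CREATE_MANAGED_RECV))
		return CC_ROLE_SLAVE;

	return CC_ROLE_MASTER;
}
```

With the flag combinations used by ibv_cc_pingpong in patch 5, this would classify the data QP (cross-channel plus managed send) as a slave and the managing QP (cross-channel only) as a master.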
include/infiniband/verbs.h | 5 +++++
man/ibv_create_qp_ex.3 | 18 ++++++++++++++++--
src/cmd.c | 5 ++++-
3 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index 466d779592bf..7e85de08bfc9 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -769,6 +769,11 @@ enum ibv_qp_init_attr_mask {
enum ibv_qp_create_flags {
IBV_QP_CREATE_BLOCK_SELF_MCAST_LB = 1 << 1,
+ IBV_QP_CREATE_CROSS_CHANNEL = 1 << 2,
+ IBV_QP_CREATE_MANAGED_SEND = 1 << 3,
+ IBV_QP_CREATE_MANAGED_RECV = 1 << 4,
+ IBV_QP_CREATE_IGNORE_SQ_OVERFLOW = 1 << 5,
+ IBV_QP_CREATE_IGNORE_RQ_OVERFLOW = 1 << 6
};
struct ibv_qp_init_attr_ex {
diff --git a/man/ibv_create_qp_ex.3 b/man/ibv_create_qp_ex.3
index f772a57b4c55..b80a586306c7 100644
--- a/man/ibv_create_qp_ex.3
+++ b/man/ibv_create_qp_ex.3
@@ -1,6 +1,6 @@
.\" -*- nroff -*-
.\"
-.TH IBV_CREATE_QP_EX 3 2013-06-26 libibverbs "Libibverbs Programmer's Manual"
+.TH IBV_CREATE_QP_EX 3 2015-12-27 libibverbs "Libibverbs Programmer's Manual"
.SH "NAME"
ibv_create_qp_ex, ibv_destroy_qp \- create or destroy a queue pair (QP)
.SH "SYNOPSIS"
@@ -47,6 +47,16 @@ uint32_t max_recv_sge; /* Requested max number of s/g elements
uint32_t max_inline_data;/* Requested max number of data (bytes) that can be posted inline to the SQ, otherwise 0 */
.in -8
};
+.sp
+.nf
+enum ibv_qp_create_flags {
+ IBV_QP_CREATE_BLOCK_SELF_MCAST_LB = 1 << 1,
+ IBV_QP_CREATE_CROSS_CHANNEL = 1 << 2, /* Set QP to work in cross-channel mode */
+ IBV_QP_CREATE_MANAGED_SEND = 1 << 3, /* Send work request posted to this QP won't be executed immediately and requires enabling */
+ IBV_QP_CREATE_MANAGED_RECV = 1 << 4, /* Receive work request posted to this QP won't be executed immediately and requires enabling */
+ IBV_QP_CREATE_IGNORE_SQ_OVERFLOW = 1 << 5, /* Configure QP to discard overflow indications on send queue */
+ IBV_QP_CREATE_IGNORE_RQ_OVERFLOW = 1 << 6 /* Configure QP to discard overflow indications on receive queue */
+};
.fi
.PP
The function
@@ -80,4 +90,8 @@ fails if the QP is attached to a multicast group.
.BR ibv_query_qp (3)
.SH "AUTHORS"
.TP
-Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
+Yishai Hadas
+.RI < yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org >
+.TP
+Leon Romanovsky
+.RI < leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org >
diff --git a/src/cmd.c b/src/cmd.c
index 675777a8ee5a..ae33491befb4 100644
--- a/src/cmd.c
+++ b/src/cmd.c
@@ -815,7 +815,10 @@ static void create_qp_handle_resp_common(struct ibv_context *context,
}
enum {
- CREATE_QP_EX2_SUP_CREATE_FLAGS = IBV_QP_CREATE_BLOCK_SELF_MCAST_LB,
+ CREATE_QP_EX2_SUP_CREATE_FLAGS = IBV_QP_CREATE_BLOCK_SELF_MCAST_LB |
+ IBV_QP_CREATE_CROSS_CHANNEL |
+ IBV_QP_CREATE_MANAGED_SEND |
+ IBV_QP_CREATE_MANAGED_RECV
};
int ibv_cmd_create_qp_ex2(struct ibv_context *context,
--
1.7.12.4
* [PATCH libibverbs V1 3/5] Add cross-channel work requests primitives
[not found] ` <1452959624-29454-1-git-send-email-leon-2ukJVAZIZ/Y@public.gmane.org>
2016-01-16 15:53 ` [PATCH libibverbs V1 1/5] Add CQ ignore overrun flag Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 2/5] Add cross-channel QP flags Leon Romanovsky
@ 2016-01-16 15:53 ` Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 4/5] Export cross-channel capability flag Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 5/5] Add an example of cross-channel synchronization Leon Romanovsky
4 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2016-01-16 15:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Leon Romanovsky
From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
The cross-channel feature relies on special primitives to
send and receive work requests.
* CQE wait - This work request holds execution of subsequent
requests on that queue until the specified number of completions
on a CQ is reached.
* WQE enable - This work request specifies the number of work
requests to enable on the controlled send/receive queue. It enables
the execution of all WQEs up to the work request which is
marked by IBV_SEND_WAIT_EN_LAST.
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
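A toy model of the two primitives described in the commit message, intended only to make the ordering rules concrete. It does not touch hardware or libibverbs; every name here (toy_managed_queue, toy_post, toy_wqe_enable, toy_run, toy_cqe_wait_done) is hypothetical, and clamping the enable count to the number of posted requests is a simplifying assumption rather than documented device behaviour.

```c
#include <stdbool.h>

/* Toy stand-in for a managed (slave) send or receive queue. */
struct toy_managed_queue {
	int posted;   /* work requests posted by software */
	int enabled;  /* work requests released by "WQE enable" */
	int executed; /* work requests that have actually run */
};

/* Posting to a managed queue only queues the request. */
static void toy_post(struct toy_managed_queue *q)
{
	q->posted++;
}

/* "WQE enable": release wqe_count further requests for execution.
 * Assumption: requests that were never posted cannot be released. */
static void toy_wqe_enable(struct toy_managed_queue *q, int wqe_count)
{
	q->enabled += wqe_count;
	if (q->enabled > q->posted)
		q->enabled = q->posted;
}

/* Execute everything released so far; returns how many ran now. */
static int toy_run(struct toy_managed_queue *q)
{
	int n = q->enabled - q->executed;

	q->executed = q->enabled;
	return n;
}

/* "CQE wait": later requests on the waiting queue may proceed only
 * once the CQ has produced at least cq_count completions. */
static bool toy_cqe_wait_done(int cq_completions, int cq_count)
{
	return cq_completions >= cq_count;
}
```

In this model, three posted requests stay idle until toy_wqe_enable() releases some of them, which corresponds to a master queue posting a WQE enable after its CQE wait condition has been satisfied.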
include/infiniband/verbs.h | 16 ++++++++++++++--
man/ibv_post_send.3 | 25 +++++++++++++++++++++++--
2 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index 7e85de08bfc9..d91dd8a1376e 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -884,7 +884,10 @@ enum ibv_wr_opcode {
IBV_WR_SEND_WITH_IMM,
IBV_WR_RDMA_READ,
IBV_WR_ATOMIC_CMP_AND_SWP,
- IBV_WR_ATOMIC_FETCH_AND_ADD
+ IBV_WR_ATOMIC_FETCH_AND_ADD,
+ IBV_WR_SEND_ENABLE,
+ IBV_WR_RECV_ENABLE,
+ IBV_WR_CQE_WAIT
};
enum ibv_send_flags {
@@ -892,7 +895,8 @@ enum ibv_send_flags {
IBV_SEND_SIGNALED = 1 << 1,
IBV_SEND_SOLICITED = 1 << 2,
IBV_SEND_INLINE = 1 << 3,
- IBV_SEND_IP_CSUM = 1 << 4
+ IBV_SEND_IP_CSUM = 1 << 4,
+ IBV_SEND_WAIT_EN_LAST = 1 << 5
};
struct ibv_sge {
@@ -925,6 +929,14 @@ struct ibv_send_wr {
uint32_t remote_qpn;
uint32_t remote_qkey;
} ud;
+ struct {
+ struct ibv_cq *cq;
+ int32_t cq_count;
+ } cqe_wait;
+ struct {
+ struct ibv_qp *qp;
+ int32_t wqe_count;
+ } wqe_enable;
} wr;
union {
struct {
diff --git a/man/ibv_post_send.3 b/man/ibv_post_send.3
index eeea0787fa80..28dd43bfa3d9 100644
--- a/man/ibv_post_send.3
+++ b/man/ibv_post_send.3
@@ -1,6 +1,6 @@
.\" -*- nroff -*-
.\"
-.TH IBV_POST_SEND 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
+.TH IBV_POST_SEND 3 2015-12-26 libibverbs "Libibverbs Programmer's Manual"
.SH "NAME"
ibv_post_send \- post a list of work requests (WRs) to a send queue
.SH "SYNOPSIS"
@@ -58,6 +58,18 @@ uint32_t remote_qpn; /* QP number of the destination QP */
uint32_t remote_qkey; /* Q_Key number of the destination QP */
.in -8
} ud;
+struct {
+.in +8
+struct ibv_cq *cq; /* CQ to wait on */
+int32_t cq_count; /* Number of CQ completions to wait for */
+.in -8
+} cqe_wait;
+struct {
+.in +8
+struct ibv_qp *qp; /* QP whose work requests are released */
+int32_t wqe_count; /* Number of work requests to release */
+.in -8
+} wqe_enable;
.in -8
} wr;
.in -8
@@ -85,6 +97,9 @@ IBV_WR_RDMA_WRITE_WITH_IMM | | X | X
IBV_WR_RDMA_READ | | | X
IBV_WR_ATOMIC_CMP_AND_SWP | | | X
IBV_WR_ATOMIC_FETCH_AND_ADD | | | X
+IBV_WR_SEND_ENABLE | | | X
+IBV_WR_RECV_ENABLE | | | X
+IBV_WR_CQE_WAIT | | | X
.fi
.PP
The attribute send_flags describes the properties of the \s-1WR\s0. It is either 0 or the bitwise \s-1OR\s0 of one or more of the following flags:
@@ -102,6 +117,8 @@ in a send WQE. Valid only for Send and RDMA Write. The L_Key will not be check
.B IBV_SEND_IP_CSUM \fR Offload the IPv4 and TCP/UDP checksum calculation.
Valid only when \fBdevice_cap_flags\fR in device_attr indicates current QP is
supported by checksum offload.
+.TP
+.B IBV_SEND_WAIT_EN_LAST \fR Mark this work request as the last one in the cross-channel offloaded sequence.
.SH "RETURN VALUE"
.B ibv_post_send()
returns 0 on success, or the value of errno on failure (which indicates the failure reason).
@@ -124,4 +141,8 @@ after the call returns.
.BR ibv_poll_cq (3)
.SH "AUTHORS"
.TP
-Dotan Barak <dotanba-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
+Dotan Barak
+.RI < dotanba-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org >
+.TP
+Leon Romanovsky
+.RI < leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org >
--
1.7.12.4
* [PATCH libibverbs V1 4/5] Export cross-channel capability flag
[not found] ` <1452959624-29454-1-git-send-email-leon-2ukJVAZIZ/Y@public.gmane.org>
` (2 preceding siblings ...)
2016-01-16 15:53 ` [PATCH libibverbs V1 3/5] Add cross-channel work requests primitives Leon Romanovsky
@ 2016-01-16 15:53 ` Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 5/5] Add an example of cross-channel synchronization Leon Romanovsky
4 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2016-01-16 15:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Leon Romanovsky
From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Export the device capability flag IBV_DEVICE_CROSS_CHANNEL
for devices that can perform cross-channel operations.
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
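A short sketch of how an application might test for the capability before creating cross-channel QPs. The constant mirrors the bit this patch adds to enum ibv_device_cap_flags; the helper name cc_device_supported() is hypothetical, and the flags value is assumed to be the device_cap_flags field filled in by ibv_query_device().

```c
#include <stdbool.h>

/* Local copy of the bit this patch adds as IBV_DEVICE_CROSS_CHANNEL. */
#define CC_DEVICE_CROSS_CHANNEL (1 << 27)

/* Hypothetical helper: returns true when the capability bit is set
 * in the device_cap_flags value reported for the device. */
static bool cc_device_supported(int device_cap_flags)
{
	return (device_cap_flags & CC_DEVICE_CROSS_CHANNEL) != 0;
}
```

An application could call this on the queried attributes and fall back to ordinary host-driven synchronization when it returns false.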
include/infiniband/verbs.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index d91dd8a1376e..0ad9be3c323f 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -121,6 +121,7 @@ enum ibv_device_cap_flags {
IBV_DEVICE_XRC = 1 << 20,
IBV_DEVICE_RC_IP_CSUM = 1 << 25,
IBV_DEVICE_RAW_IP_CSUM = 1 << 26,
+ IBV_DEVICE_CROSS_CHANNEL = 1 << 27,
IBV_DEVICE_MANAGED_FLOW_STEERING = 1 << 29
};
--
1.7.12.4
* [PATCH libibverbs V1 5/5] Add an example of cross-channel synchronization
[not found] ` <1452959624-29454-1-git-send-email-leon-2ukJVAZIZ/Y@public.gmane.org>
` (3 preceding siblings ...)
2016-01-16 15:53 ` [PATCH libibverbs V1 4/5] Export cross-channel capability flag Leon Romanovsky
@ 2016-01-16 15:53 ` Leon Romanovsky
4 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2016-01-16 15:53 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Leon Romanovsky
From: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Add the ibv_cc_pingpong application as an example of using the
synchronization primitives to perform conditional flows.
Signed-off-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
Makefile.am | 6 +-
examples/.gitignore | 1 +
examples/cc_pingpong.c | 991 +++++++++++++++++++++++++++++++++++++++++++++++++
man/ibv_cc_pingpong.1 | 66 ++++
4 files changed, 1063 insertions(+), 1 deletion(-)
create mode 100644 examples/cc_pingpong.c
create mode 100644 man/ibv_cc_pingpong.1
diff --git a/Makefile.am b/Makefile.am
index b6399d6eada4..dfab9225f9ae 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -22,7 +22,8 @@ src_libibverbs_la_DEPENDENCIES = $(srcdir)/src/libibverbs.map
bin_PROGRAMS = examples/ibv_devices examples/ibv_devinfo \
examples/ibv_asyncwatch examples/ibv_rc_pingpong examples/ibv_uc_pingpong \
- examples/ibv_ud_pingpong examples/ibv_srq_pingpong examples/ibv_xsrq_pingpong
+ examples/ibv_ud_pingpong examples/ibv_srq_pingpong examples/ibv_xsrq_pingpong \
+ examples/ibv_cc_pingpong
examples_ibv_devices_SOURCES = examples/device_list.c
examples_ibv_devices_LDADD = $(top_builddir)/src/libibverbs.la $(LIBNL_LIBS)
examples_ibv_devinfo_SOURCES = examples/devinfo.c
@@ -37,6 +38,8 @@ examples_ibv_srq_pingpong_SOURCES = examples/srq_pingpong.c examples/pingpong.c
examples_ibv_srq_pingpong_LDADD = $(top_builddir)/src/libibverbs.la $(LIBNL_LIBS)
examples_ibv_xsrq_pingpong_SOURCES = examples/xsrq_pingpong.c examples/pingpong.c
examples_ibv_xsrq_pingpong_LDADD = $(top_builddir)/src/libibverbs.la $(LIBNL_LIBS)
+examples_ibv_cc_pingpong_SOURCES = examples/cc_pingpong.c examples/pingpong.c
+examples_ibv_cc_pingpong_LDADD = $(top_builddir)/src/libibverbs.la $(LIBNL_LIBS)
examples_ibv_asyncwatch_SOURCES = examples/asyncwatch.c
examples_ibv_asyncwatch_LDADD = $(top_builddir)/src/libibverbs.la $(LIBNL_LIBS)
@@ -50,6 +53,7 @@ libibverbsinclude_HEADERS = include/infiniband/arch.h include/infiniband/driver.
man_MANS = man/ibv_asyncwatch.1 man/ibv_devices.1 man/ibv_devinfo.1 \
man/ibv_rc_pingpong.1 man/ibv_uc_pingpong.1 man/ibv_ud_pingpong.1 \
man/ibv_srq_pingpong.1 man/ibv_alloc_pd.3 man/ibv_attach_mcast.3 \
+ man/ibv_cc_pingpong.1 \
man/ibv_create_ah.3 man/ibv_create_ah_from_wc.3 \
man/ibv_create_comp_channel.3 man/ibv_create_cq.3 \
man/ibv_create_qp.3 man/ibv_create_srq.3 man/ibv_event_type_str.3 \
diff --git a/examples/.gitignore b/examples/.gitignore
index ebecbdc0cf56..1fe30466c884 100644
--- a/examples/.gitignore
+++ b/examples/.gitignore
@@ -7,4 +7,5 @@ ibv_srq_pingpong
ibv_uc_pingpong
ibv_ud_pingpong
ibv_xsrq_pingpong
+ibv_cc_pingpong
.libs
diff --git a/examples/cc_pingpong.c b/examples/cc_pingpong.c
new file mode 100644
index 000000000000..db7f3985f52e
--- /dev/null
+++ b/examples/cc_pingpong.c
@@ -0,0 +1,991 @@
+/* Copyright (c) 2005 Topspin Communications. All rights reserved.
+ * Copyright (c) 2009-2016 Mellanox Technologies. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the copyright holder nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
+ * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+ * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#if HAVE_CONFIG_H
+ #include <config.h>
+#endif /* HAVE_CONFIG_H */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/time.h>
+#include <netdb.h>
+#include <malloc.h>
+#include <getopt.h>
+#include <arpa/inet.h>
+#include <time.h>
+
+#include "config.h"
+#include "pingpong.h"
+
+enum {
+ PP_RECV_WRID = 1,
+ PP_SEND_WRID = 2,
+ PP_CQE_WAIT = 3,
+};
+
+static struct test_params app_params; // make command line args global
+
+struct pingpong_context {
+ struct ibv_context *context;
+ struct ibv_pd *pd;
+ struct ibv_mr *mr;
+ struct ibv_cq *scq;
+ struct ibv_cq *rcq;
+ struct ibv_qp *qp;
+
+ struct ibv_qp *mqp;
+ struct ibv_cq *mcq;
+
+ void *buf;
+ int size;
+ int rx_depth;
+
+ int scnt;
+ int rcnt;
+};
+
+struct pingpong_dest {
+ int lid;
+ int qpn;
+ int psn;
+};
+
+struct test_params {
+ int port;
+ int ib_port;
+ int size;
+ enum ibv_mtu mtu;
+ int rx_depth;
+ int iters;
+ int sl;
+ char ib_devname[128];
+ char servername[128];
+};
+
+void set_default_test_params(struct test_params *v)
+{
+ memset(v, 0, sizeof(struct test_params));
+ v->port = 18515;
+ v->ib_port = 1;
+ v->size = 4096;
+ v->mtu = IBV_MTU_1024;
+ v->rx_depth = 500;
+ v->iters = 1000;
+ v->sl = 0;
+}
+
+static int pp_connect_ctx(struct pingpong_context *ctx,
+ struct ibv_qp *qp,
+ int port,
+ int my_psn,
+ enum ibv_mtu mtu,
+ int sl,
+ struct pingpong_dest *dest)
+{
+ struct ibv_qp_attr attr = { 0 };
+ attr.qp_state = IBV_QPS_RTR;
+ attr.path_mtu = mtu;
+ attr.dest_qp_num = dest->qpn;
+ attr.rq_psn = dest->psn;
+ attr.max_dest_rd_atomic = 1;
+ attr.min_rnr_timer = 12;
+ attr.ah_attr.dlid = dest->lid;
+ attr.ah_attr.sl = sl;
+ attr.ah_attr.port_num = port;
+
+ if (ibv_modify_qp(qp, &attr,
+ IBV_QP_STATE |
+ IBV_QP_AV |
+ IBV_QP_PATH_MTU |
+ IBV_QP_DEST_QPN |
+ IBV_QP_RQ_PSN |
+ IBV_QP_MAX_DEST_RD_ATOMIC |
+ IBV_QP_MIN_RNR_TIMER)) {
+ fprintf(stderr, "Failed to modify QP to RTR\n");
+ return 1;
+ }
+
+ attr.qp_state = IBV_QPS_RTS;
+ attr.timeout = 14;
+ attr.retry_cnt = 7;
+ attr.rnr_retry = 7;
+ attr.sq_psn = my_psn;
+ attr.max_rd_atomic = 1;
+ if (ibv_modify_qp(qp, &attr,
+ IBV_QP_STATE |
+ IBV_QP_TIMEOUT |
+ IBV_QP_RETRY_CNT |
+ IBV_QP_RNR_RETRY |
+ IBV_QP_SQ_PSN |
+ IBV_QP_MAX_QP_RD_ATOMIC)) {
+ fprintf(stderr, "Failed to modify QP to RTS\n");
+ return 1;
+ }
+
+ return 0;
+}
+
+static struct pingpong_dest *pp_client_exch_dest(const char *servername,
+ int port,
+ const struct pingpong_dest *my_dest)
+{
+ struct addrinfo *res, *t;
+ struct addrinfo hints = {
+ .ai_family = AF_UNSPEC,
+ .ai_socktype = SOCK_STREAM
+ };
+ char *service;
+ char msg[sizeof "0000:000000:000000"];
+ int n;
+ int sockfd = -1;
+ struct pingpong_dest *rem_dest = NULL;
+
+ if (asprintf(&service, "%d", port) < 0)
+ return NULL;
+
+ n = getaddrinfo(servername, service, &hints, &res);
+ if (n != 0) {
+ fprintf(stderr, "%s for %s:%d\n", gai_strerror(n), servername, port);
+ free(service);
+ return NULL;
+ }
+
+ for (t = res; t; t = t->ai_next) {
+ sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol);
+ if (sockfd >= 0) {
+ if (!connect(sockfd, t->ai_addr, t->ai_addrlen))
+ break;
+ close(sockfd);
+ sockfd = -1;
+ }
+ }
+
+ freeaddrinfo(res);
+ free(service);
+
+ if (sockfd < 0) {
+ fprintf(stderr, "Couldn't connect to %s:%d\n", servername, port);
+ return NULL;
+ }
+
+ sprintf(msg, "%04x:%06x:%06x", my_dest->lid, my_dest->qpn, my_dest->psn);
+ if (write(sockfd, msg, sizeof msg) != sizeof msg) {
+ fprintf(stderr, "Couldn't send local address\n");
+ goto out;
+ }
+
+ if (read(sockfd, msg, sizeof msg) != sizeof msg) {
+ perror("client read");
+ fprintf(stderr, "Couldn't read remote address\n");
+ goto out;
+ }
+
+ if (write(sockfd, "done", sizeof "done") != sizeof("done")) {
+ fprintf(stderr, "Couldn't send \"done\" msg\n");
+ goto out;
+ }
+
+ rem_dest = malloc(sizeof *rem_dest);
+ if (!rem_dest)
+ goto out;
+
+ sscanf(msg, "%x:%x:%x", &rem_dest->lid, &rem_dest->qpn, &rem_dest->psn);
+
+out:
+ close(sockfd);
+ return rem_dest;
+}
+
+static struct pingpong_dest *pp_server_exch_dest(struct pingpong_context *ctx,
+ int ib_port,
+ enum ibv_mtu mtu,
+ int port,
+ int sl,
+ const struct pingpong_dest *my_dest)
+{
+ struct addrinfo *res, *t;
+ struct addrinfo hints = {
+ .ai_flags = AI_PASSIVE,
+ .ai_family = AF_UNSPEC,
+ .ai_socktype = SOCK_STREAM
+ };
+ char *service;
+ char msg[sizeof "0000:000000:000000"];
+ int n;
+ int sockfd = -1, connfd;
+ struct pingpong_dest *rem_dest = NULL;
+
+ if (asprintf(&service, "%d", port) < 0) {
+ return NULL;
+ }
+
+ n = getaddrinfo(NULL, service, &hints, &res);
+
+ if (n != 0) {
+ fprintf(stderr, "%s for port %d\n", gai_strerror(n), port);
+ free(service);
+ return NULL;
+ }
+
+ for (t = res; t; t = t->ai_next) {
+ sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol);
+ if (sockfd >= 0) {
+ n = 1;
+
+ setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &n, sizeof n);
+
+ if (!bind(sockfd, t->ai_addr, t->ai_addrlen))
+ break;
+
+ close(sockfd);
+ sockfd = -1;
+ }
+ }
+
+ freeaddrinfo(res);
+ free(service);
+
+ if (sockfd < 0) {
+ fprintf(stderr, "Couldn't listen to port %d\n", port);
+ return NULL;
+ }
+
+ listen(sockfd, 1);
+ connfd = accept(sockfd, NULL, 0);
+ close(sockfd);
+
+ if (connfd < 0) {
+ fprintf(stderr, "accept() failed\n");
+ return NULL;
+ }
+
+ n = read(connfd, msg, sizeof msg);
+ if (n != sizeof msg) {
+ perror("server read");
+ fprintf(stderr, "%d/%d: Couldn't read remote address\n", n, (int) sizeof msg);
+ goto out;
+ }
+
+ rem_dest = malloc(sizeof *rem_dest);
+ if (!rem_dest)
+ goto out;
+
+ sscanf(msg, "%x:%x:%x", &rem_dest->lid, &rem_dest->qpn, &rem_dest->psn);
+
+ if (pp_connect_ctx(ctx, ctx->qp, ib_port, my_dest->psn, mtu, sl, rem_dest)) {
+ fprintf(stderr, "Couldn't connect to remote QP\n");
+ free(rem_dest);
+ rem_dest = NULL;
+ goto out;
+ }
+
+ sprintf(msg, "%04x:%06x:%06x", my_dest->lid, my_dest->qpn, my_dest->psn);
+ if (write(connfd, msg, sizeof msg) != sizeof msg) {
+ fprintf(stderr, "Couldn't send local address\n");
+ free(rem_dest);
+ rem_dest = NULL;
+ goto out;
+ }
+
+ /* expecting msg "done" */
+ if (read(connfd, msg, sizeof(msg)) <= 0) {
+ fprintf(stderr, "Couldn't read \"done\" msg\n");
+ free(rem_dest);
+ rem_dest = NULL;
+ goto out;
+ }
+
+out:
+ close(connfd);
+ return rem_dest;
+}
+
+struct pingpong_dest *get_remote_dest(struct pingpong_context *ctx, int is_client,
+ struct pingpong_dest *my_dest)
+{
+ struct pingpong_dest *rem_dest = NULL;
+
+ if (is_client)
+ rem_dest = pp_client_exch_dest(app_params.servername, app_params.port, my_dest);
+ else
+ rem_dest = pp_server_exch_dest(ctx, app_params.ib_port, app_params.mtu, app_params.port, app_params.sl, my_dest);
+ return rem_dest;
+}
+
+static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, int rx_depth, int port)
+{
+ struct pingpong_context *ctx;
+ long page_size;
+
+ ctx = calloc(1, sizeof(*ctx));
+ if (!ctx)
+ return NULL;
+
+ ctx->size = size;
+ ctx->rx_depth = rx_depth;
+
+ page_size = sysconf(_SC_PAGESIZE);
+ ctx->buf = memalign(page_size, size);
+ if (!ctx->buf) {
+ fprintf(stderr, "Couldn't allocate work buf.\n");
+ goto clean_ctx;
+ }
+
+ memset(ctx->buf, 0, size);
+
+ ctx->context = ibv_open_device(ib_dev);
+ if (!ctx->context) {
+ fprintf(stderr, "Couldn't get context for %s\n",
+ ibv_get_device_name(ib_dev));
+ goto clean_buffer;
+ }
+
+ ctx->pd = ibv_alloc_pd(ctx->context);
+ if (!ctx->pd) {
+ fprintf(stderr, "Couldn't allocate PD\n");
+ goto clean_device;
+ }
+
+ ctx->mr = ibv_reg_mr(ctx->pd, ctx->buf, size, IBV_ACCESS_LOCAL_WRITE);
+ if (!ctx->mr) {
+ fprintf(stderr, "Couldn't register MR\n");
+ goto clean_pd;
+ }
+
+ {
+ struct ibv_create_cq_attr_ex attr = { 0 };
+ attr.comp_mask = IBV_CREATE_CQ_ATTR_FLAGS;
+ attr.flags = IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN;
+ attr.cqe = rx_depth;
+
+ ctx->rcq = ibv_create_cq_ex(ctx->context, &attr);
+ if (!ctx->rcq) {
+ fprintf(stderr, "Couldn't create RCQ\n");
+ goto clean_mr;
+ }
+ }
+
+ {
+ struct ibv_create_cq_attr_ex attr = { 0 };
+ attr.comp_mask = IBV_CREATE_CQ_ATTR_FLAGS;
+ attr.flags = IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN;
+ attr.cqe = 0x10;
+
+ ctx->scq = ibv_create_cq_ex(ctx->context, &attr);
+ if (!ctx->scq) {
+ fprintf(stderr, "Couldn't create SCQ\n");
+ goto clean_rcq;
+ }
+ }
+
+ {
+ struct ibv_qp_init_attr_ex attr = { 0 };
+ attr.send_cq = ctx->scq;
+ attr.recv_cq = ctx->rcq;
+ attr.cap.max_send_wr = 16;
+ attr.cap.max_recv_wr = rx_depth;
+ attr.cap.max_send_sge = 16;
+ attr.cap.max_recv_sge = 16;
+ attr.qp_type = IBV_QPT_RC;
+ attr.pd = ctx->pd;
+ attr.comp_mask |= IBV_QP_INIT_ATTR_CREATE_FLAGS | IBV_QP_INIT_ATTR_PD;
+ attr.create_flags = IBV_QP_CREATE_CROSS_CHANNEL | IBV_QP_CREATE_MANAGED_SEND;
+
+ ctx->qp = ibv_create_qp_ex(ctx->context, &attr);
+ if (!ctx->qp) {
+ fprintf(stderr, "Couldn't create QP\n");
+ goto clean_scq;
+ }
+ }
+
+ {
+ struct ibv_qp_attr attr = { 0 };
+ attr.qp_state = IBV_QPS_INIT;
+ attr.port_num = port;
+
+ if (ibv_modify_qp(ctx->qp, &attr,
+ IBV_QP_STATE |
+ IBV_QP_PKEY_INDEX |
+ IBV_QP_PORT |
+ IBV_QP_ACCESS_FLAGS)) {
+ fprintf(stderr, "Failed to modify QP to INIT\n");
+ goto clean_qp;
+ }
+ }
+
+
+ /* Create the master queue (MQ): its CQ first, then its QP */
+ ctx->mcq = ibv_create_cq(ctx->context, 0x40, NULL, NULL, 0);
+ if (!ctx->mcq) {
+ fprintf(stderr, "Couldn't create MCQ\n");
+ goto clean_qp;
+ }
+
+ {
+ struct ibv_qp_init_attr_ex attr = { 0 };
+ attr.send_cq = ctx->mcq;
+ attr.recv_cq = ctx->mcq;
+ attr.cap.max_send_wr = 0x40;
+ attr.cap.max_send_sge = 1;
+ attr.cap.max_recv_sge = 1;
+ attr.qp_type = IBV_QPT_RC;
+ attr.pd = ctx->pd;
+ attr.comp_mask |= IBV_QP_INIT_ATTR_CREATE_FLAGS | IBV_QP_INIT_ATTR_PD;
+ attr.create_flags = IBV_QP_CREATE_CROSS_CHANNEL;
+ ctx->mqp = ibv_create_qp_ex(ctx->context, &attr);
+ if (!ctx->mqp) {
+ fprintf(stderr, "Couldn't create MQP\n");
+ goto clean_mcq;
+ }
+ }
+
+ {
+ struct ibv_qp_attr attr = { 0 };
+ attr.qp_state = IBV_QPS_INIT;
+ attr.port_num = port;
+
+ if (ibv_modify_qp(ctx->mqp, &attr,
+ IBV_QP_STATE |
+ IBV_QP_PKEY_INDEX |
+ IBV_QP_PORT |
+ IBV_QP_ACCESS_FLAGS)) {
+ fprintf(stderr, "Failed to modify MQP to INIT\n");
+ goto clean_mqp;
+ }
+ }
+
+ {
+ struct ibv_qp_attr qp_attr = { 0 };
+ qp_attr.qp_state = IBV_QPS_RTR;
+ qp_attr.path_mtu = 1;
+ qp_attr.dest_qp_num = ctx->mqp->qp_num;
+ qp_attr.max_dest_rd_atomic = 1;
+ qp_attr.min_rnr_timer = 12;
+ qp_attr.ah_attr.port_num = port;
+
+ if (ibv_modify_qp(ctx->mqp, &qp_attr,
+ IBV_QP_STATE |
+ IBV_QP_AV |
+ IBV_QP_PATH_MTU |
+ IBV_QP_DEST_QPN |
+ IBV_QP_RQ_PSN |
+ IBV_QP_MAX_DEST_RD_ATOMIC |
+ IBV_QP_MIN_RNR_TIMER)) {
+ fprintf(stderr, "Failed to modify MQP to RTR\n");
+ goto clean_mqp;
+ }
+
+ qp_attr.qp_state = IBV_QPS_RTS;
+ qp_attr.timeout = 14;
+ qp_attr.retry_cnt = 7;
+ qp_attr.rnr_retry = 7;
+ qp_attr.sq_psn = 0;
+ qp_attr.max_rd_atomic = 1;
+ if (ibv_modify_qp(ctx->mqp, &qp_attr,
+ IBV_QP_STATE |
+ IBV_QP_TIMEOUT |
+ IBV_QP_RETRY_CNT |
+ IBV_QP_RNR_RETRY |
+ IBV_QP_SQ_PSN |
+ IBV_QP_MAX_QP_RD_ATOMIC)) {
+ fprintf(stderr, "Failed to modify MQP to RTS\n");
+ goto clean_mqp;
+ }
+ }
+
+ return ctx;
+
+clean_mqp:
+ ibv_destroy_qp(ctx->mqp);
+
+clean_mcq:
+ ibv_destroy_cq(ctx->mcq);
+
+clean_qp:
+ ibv_destroy_qp(ctx->qp);
+
+clean_scq:
+ ibv_destroy_cq(ctx->scq);
+
+clean_rcq:
+ ibv_destroy_cq(ctx->rcq);
+
+clean_mr:
+ ibv_dereg_mr(ctx->mr);
+
+clean_pd:
+ ibv_dealloc_pd(ctx->pd);
+
+clean_device:
+ ibv_close_device(ctx->context);
+
+clean_buffer:
+ free(ctx->buf);
+
+clean_ctx:
+ free(ctx);
+
+ return NULL;
+}
+
+int pp_close_ctx(struct pingpong_context *ctx)
+{
+ if (ibv_destroy_qp(ctx->mqp)) {
+ fprintf(stderr, "Couldn't destroy mQP\n");
+ return 1;
+ }
+
+ if (ibv_destroy_cq(ctx->mcq)) {
+ fprintf(stderr, "Couldn't destroy mCQ\n");
+ return 1;
+ }
+
+ if (ibv_destroy_qp(ctx->qp)) {
+ fprintf(stderr, "Couldn't destroy QP\n");
+ return 1;
+ }
+
+ if (ibv_destroy_cq(ctx->rcq)) {
+ fprintf(stderr, "Couldn't destroy rCQ\n");
+ return 1;
+ }
+
+ if (ibv_destroy_cq(ctx->scq)) {
+ fprintf(stderr, "Couldn't destroy sCQ\n");
+ return 1;
+ }
+
+ if (ibv_dereg_mr(ctx->mr)) {
+ fprintf(stderr, "Couldn't deregister MR\n");
+ return 1;
+ }
+
+ if (ibv_dealloc_pd(ctx->pd)) {
+ fprintf(stderr, "Couldn't deallocate PD\n");
+ return 1;
+ }
+
+ if (ibv_close_device(ctx->context)) {
+ fprintf(stderr, "Couldn't release context\n");
+ return 1;
+ }
+
+ free(ctx->buf);
+ free(ctx);
+
+ return 0;
+}
+
+static int pp_post_recv(struct pingpong_context *ctx, int n)
+{
+ int rc;
+
+ struct ibv_sge list = {
+ .addr = (uintptr_t) ctx->buf,
+ .length = ctx->size,
+ .lkey = ctx->mr->lkey
+ };
+ struct ibv_recv_wr wr = {
+ .wr_id = PP_RECV_WRID,
+ .sg_list = &list,
+ .num_sge = 1,
+ };
+ struct ibv_recv_wr *bad_wr;
+ int i;
+
+ for (i = 0; i < n; ++i) {
+ rc = ibv_post_recv(ctx->qp, &wr, &bad_wr);
+ if (rc)
+ return rc;
+ }
+
+ return i;
+}
+
+static int pp_post_send(struct pingpong_context *ctx, int wait_recv)
+{
+ int rc;
+ struct ibv_wc mwc;
+ struct ibv_wc wc;
+ struct ibv_send_wr *bad_wr;
+ int ne;
+
+ struct ibv_sge list = {
+ .addr = (uintptr_t) ctx->buf,
+ .length = ctx->size,
+ .lkey = ctx->mr->lkey
+ };
+
+ struct ibv_send_wr wr = {
+ .wr_id = PP_SEND_WRID,
+ .sg_list = &list,
+ .num_sge = 1,
+ .opcode = IBV_WR_SEND,
+ .send_flags = IBV_SEND_SIGNALED,
+ };
+
+ struct ibv_send_wr wr_en = {
+ .wr_id = wr.wr_id,
+ .sg_list = NULL,
+ .num_sge = 0,
+ .opcode = IBV_WR_SEND_ENABLE,
+ .send_flags = (wait_recv ? 0 : IBV_SEND_SIGNALED),
+ };
+
+ struct ibv_send_wr wr_wait = {
+ .wr_id = ctx->scnt,
+ .sg_list = NULL,
+ .num_sge = 0,
+ .opcode = IBV_WR_CQE_WAIT,
+ .send_flags = IBV_SEND_SIGNALED,
+ };
+ rc = ibv_post_send(ctx->qp, &wr, &bad_wr);
+ if (rc)
+ return rc;
+
+ /* fill in send work enable request */
+ wr_en.wr.wqe_enable.qp = ctx->qp;
+ wr_en.wr.wqe_enable.wqe_count = 0;
+ wr_en.send_flags |= IBV_SEND_WAIT_EN_LAST;
+ rc = ibv_post_send(ctx->mqp, &wr_en, &bad_wr);
+ if (rc)
+ return rc;
+
+ /* fill in wait work enable request */
+ if (wait_recv) {
+ wr_wait.wr.cqe_wait.cq = ctx->rcq;
+ wr_wait.wr.cqe_wait.cq_count = 1;
+ wr_wait.send_flags |= IBV_SEND_WAIT_EN_LAST;
+ wr_wait.next = NULL;
+
+ rc = ibv_post_send(ctx->mqp, &wr_wait, &bad_wr);
+ if (rc)
+ return rc;
+ }
+
+ do {
+ rc = ibv_poll_cq(ctx->mcq, 1, &mwc);
+ if (rc < 0)
+ return -1;
+ } while (rc == 0);
+
+ if (mwc.status != IBV_WC_SUCCESS)
+ return -1;
+
+ do {
+ ne = ibv_poll_cq(ctx->scq, 1, &wc);
+ if (ne < 0) {
+ fprintf(stderr, "poll CQ failed %d\n", ne);
+ return 1;
+ }
+ } while (!ne);
+
+ if (wc.status != IBV_WC_SUCCESS) {
+ fprintf(stderr, "cqe error status %s (%d v:%d) for count %d\n",
+ ibv_wc_status_str(wc.status),
+ wc.status, wc.vendor_err,
+ ctx->rcnt);
+ return 1;
+ }
+
+ return 0;
+}
+
+static void usage(const char *argv0)
+{
+ printf("Usage:\n");
+ printf(" %s start a server and wait for connection\n", argv0);
+ printf(" %s <host> connect to server at <host>\n", argv0);
+ printf("\n");
+ printf("Options:\n");
+ printf(" -p, --port=<port> listen on/connect to port <port> (default 18515)\n");
+ printf(" -d, --ib-dev=<dev> use IB device <dev> (default first device found)\n");
+ printf(" -i, --ib-port=<port> use port <port> of IB device (default 1)\n");
+ printf(" -s, --size=<size> size of message to exchange (default 4096 minimum 16)\n");
+ printf(" -m, --mtu=<size> path MTU (default 1024)\n");
+ printf(" -r, --rx-depth=<dep> number of receives to post at a time (default 500)\n");
+ printf(" -n, --iters=<iters> number of exchanges (default 1000)\n");
+ printf(" -l, --sl=<sl> service level value\n");
+}
+
+int parse_command_line_args(int argc, char*argv[], struct test_params * app_params)
+{
+ set_default_test_params(app_params);
+
+ while (1) {
+ int c;
+
+ static struct option long_options[] = {
+ { .name = "port", .has_arg = 1, .val = 'p' },
+ { .name = "ib-dev", .has_arg = 1, .val = 'd' },
+ { .name = "ib-port", .has_arg = 1, .val = 'i' },
+ { .name = "size", .has_arg = 1, .val = 's' },
+ { .name = "mtu", .has_arg = 1, .val = 'm' },
+ { .name = "rx-depth", .has_arg = 1, .val = 'r' },
+ { .name = "iters", .has_arg = 1, .val = 'n' },
+ { .name = "sl", .has_arg = 1, .val = 'l' },
+ { 0 }
+ };
+
+ c = getopt_long(argc, argv, "p:d:i:s:m:r:n:l:",
+ long_options, NULL);
+ if (c == -1)
+ break;
+
+ switch (c) {
+ case 'p':
+ app_params->port = strtol(optarg, NULL, 0);
+ if (app_params->port < 0 || app_params->port > 65535) {
+ usage(argv[0]);
+ return 1;
+ }
+ break;
+
+ case 'd':
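+ /* strncpy() does not terminate on truncation, so do it explicitly */
+ strncpy(app_params->ib_devname, optarg, sizeof(app_params->ib_devname) - 1);
+ app_params->ib_devname[sizeof(app_params->ib_devname) - 1] = '\0';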
+ break;
+
+ case 'i':
+ app_params->ib_port = strtol(optarg, NULL, 0);
+ if (app_params->ib_port < 0) {
+ usage(argv[0]);
+ return 1;
+ }
+ break;
+
+ case 's':
+ app_params->size = strtol(optarg, NULL, 0);
+ if (app_params->size < 16) {
+ usage(argv[0]);
+ return 1;
+ }
+ break;
+
+ case 'm':
+ app_params->mtu = pp_mtu_to_enum(strtol(optarg, NULL, 0));
+ if (app_params->mtu < 0) {
+ usage(argv[0]);
+ return 1;
+ }
+ break;
+
+ case 'r':
+ app_params->rx_depth = strtol(optarg, NULL, 0);
+ break;
+
+ case 'n':
+ app_params->iters = strtol(optarg, NULL, 0);
+ break;
+
+ case 'l':
+ app_params->sl = strtol(optarg, NULL, 0);
+ break;
+
+ default:
+ usage(argv[0]);
+ return 1;
+ }
+ }
+
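+ if (optind == argc - 1) {
+ /* strncpy() does not terminate on truncation, so do it explicitly */
+ strncpy(app_params->servername, argv[optind], sizeof(app_params->servername) - 1);
+ app_params->servername[sizeof(app_params->servername) - 1] = '\0';
+ } else if (optind < argc) {
+ usage(argv[0]);
+ return 1;
+ }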
+
+ return 0;
+}
+
+void dump_results(struct test_params * app_params, struct timeval *start, struct timeval *end)
+{
+ float usec = (end->tv_sec - start->tv_sec) * 1000000 + (end->tv_usec - start->tv_usec);
+ long long bytes = (long long) app_params->size * app_params->iters * 2;
+
+ printf("%lld bytes in %.2f seconds = %.2f Mbit/sec\n", bytes, usec / 1000000., bytes * 8. / usec);
+}
+
+int run_task_pingpong_app(int is_client)
+{
+ struct ibv_device **dev_list;
+ struct ibv_device *ib_dev = NULL;
+ struct pingpong_context *ctx;
+ struct pingpong_dest my_dest;
+ struct pingpong_dest *rem_dest;
+ struct timeval start, end;
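+ /* Honour -d/--ib-dev; fall back to the first device when it was not given */
+ char *ib_devname = app_params.ib_devname[0] ? app_params.ib_devname : NULL;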
+ int ret = 0;
+ int routs;
+ int num_cq_events = 0;
+
+ srand48(getpid() * time(NULL));
+
+ dev_list = ibv_get_device_list(NULL);
+ if (!dev_list) {
+ perror("Failed to get IB devices list");
+ return 1;
+ }
+
+ if (!ib_devname) {
+ ib_dev = *dev_list;
+ if (!ib_dev) {
+ fprintf(stderr, "No IB devices found\n");
+ return 1;
+ }
+ } else {
+ int i;
+ for (i = 0; dev_list[i]; ++i)
+ if (!strcmp(ibv_get_device_name(dev_list[i]),
+ ib_devname))
+ break;
+ ib_dev = dev_list[i];
+ if (!ib_dev) {
+ fprintf(stderr, "IB device %s not found\n", ib_devname);
+ return 1;
+ }
+ }
+
+ ctx = pp_init_ctx(ib_dev, app_params.size, app_params.rx_depth, app_params.ib_port);
+ if (!ctx)
+ return 1;
+
+ routs = pp_post_recv(ctx, ctx->rx_depth);
+ if (routs < ctx->rx_depth) {
+ fprintf(stderr, "Couldn't post receive (%d)\n", routs);
+ return 1;
+ }
+
+ my_dest.lid = pp_get_local_lid(ctx->context, app_params.ib_port);
+ my_dest.qpn = ctx->qp->qp_num;
+ my_dest.psn = lrand48() & 0xffffff;
+ if (!my_dest.lid) {
+ fprintf(stderr, "Couldn't get local LID\n");
+ return 1;
+ }
+
+ printf(" local address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x\n", my_dest.lid, my_dest.qpn, my_dest.psn);
+
+ rem_dest = (struct pingpong_dest *) get_remote_dest(ctx, is_client, &my_dest);
+ if (rem_dest == NULL) {
+ fprintf(stderr, "Failed to exchange data with remote destination\n");
+ return 1;
+ }
+
+ printf(" remote address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x\n", rem_dest->lid, rem_dest->qpn, rem_dest->psn);
+
+ if (is_client) {
+ if (pp_connect_ctx(ctx, ctx->qp, app_params.ib_port, my_dest.psn, app_params.mtu, app_params.sl, rem_dest))
+ return 1;
+ if (pp_post_send(ctx, 0)) {
+ fprintf(stderr, "Couldn't post send\n");
+ return 1;
+ }
+ }
+
+ if (gettimeofday(&start, NULL)) {
+ perror("gettimeofday");
+ return 1;
+ }
+
+ ctx->scnt = ctx->rcnt = 0;
+ while (ctx->rcnt < app_params.iters && ctx->scnt < app_params.iters) {
+ struct ibv_wc wc;
+ int ne;
+
+ do {
+ ne = ibv_poll_cq(ctx->rcq, 1, &wc);
+ if (ne < 0) {
+ fprintf(stderr, "poll CQ failed %d\n", ne);
+ return 1;
+ }
+ } while (ne < 1);
+
+ if (wc.status != IBV_WC_SUCCESS) {
+ fprintf(stderr, "cqe error status %s (%d v:%d)"
+ " for count %d\n",
+ ibv_wc_status_str(wc.status),
+ wc.status, wc.vendor_err,
+ ctx->rcnt);
+ return 1;
+ }
+
+ ctx->rcnt++;
+
+ if (pp_post_recv(ctx, 1) < 0) {
+ fprintf(stderr, "Couldn't post receive\n");
+ return 1;
+ }
+
+ if (pp_post_send(ctx, 1)) {
+ fprintf(stderr, "Couldn't post send\n");
+ return 1;
+ }
+ }
+
+ if (gettimeofday(&end, NULL)) {
+ perror("gettimeofday");
+ return 1;
+ }
+
+ dump_results(&app_params, &start, &end);
+
+ ibv_ack_cq_events(ctx->rcq, num_cq_events);
+
+ if (pp_close_ctx(ctx))
+ return 1;
+
+ ibv_free_device_list(dev_list);
+
+ free(rem_dest);
+
+ return ret;
+}
+
+int main(int argc, char **argv)
+{
+ int ret;
+ int is_client;
+
+ ret = parse_command_line_args(argc, argv, &app_params);
+ if (ret != 0) {
+ fprintf(stderr, "Error parsing command line arguments\n");
+ return 1;
+ }
+
+ is_client = (app_params.servername[0] != 0);
+ return run_task_pingpong_app(is_client);
+}
diff --git a/man/ibv_cc_pingpong.1 b/man/ibv_cc_pingpong.1
new file mode 100644
index 000000000000..89f887b99d25
--- /dev/null
+++ b/man/ibv_cc_pingpong.1
@@ -0,0 +1,66 @@
+.TH IBV_CC_PINGPONG 1 2015-12-20 "libibverbs" "USER COMMANDS"
+
+.SH NAME
+ibv_cc_pingpong \- Simple InfiniBand cross-channel synchronization test
+
+.SH SYNOPSIS
+.B ibv_cc_pingpong
+[\-p port] [\-d device] [\-i ib port] [\-s size] [\-r rx depth]
+[\-n iters] [\-l sl] [\-m mtu]
+\fBHOSTNAME\fR
+
+.B ibv_cc_pingpong
+[\-p port] [\-d device] [\-i ib port] [\-s size] [\-r rx depth]
+[\-n iters] [\-l sl] [\-m mtu]
+
+.SH DESCRIPTION
+.PP
+Run a simple ping-pong test over InfiniBand via the reliable
+connected (RC) transport based on cross-channel synchronization
+work requests.
+
+This application demonstrates usage of the CQE wait and SEND enable
+primitives. The server posts work requests and the client waits
+for one completion.
+
+.SH OPTIONS
+
+.PP
+.TP
+\fB\-p\fR, \fB\-\-port\fR=\fIPORT\fR
+use TCP port \fIPORT\fR for initial synchronization (default 18515)
+.TP
+\fB\-d\fR, \fB\-\-ib\-dev\fR=\fIDEVICE\fR
+use IB device \fIDEVICE\fR (default first device found)
+.TP
+\fB\-i\fR, \fB\-\-ib\-port\fR=\fIPORT\fR
+use IB port \fIPORT\fR (default port 1)
+.TP
+\fB\-s\fR, \fB\-\-size\fR=\fISIZE\fR
+ping-pong messages of size \fISIZE\fR (default 4096)
+.TP
+\fB\-r\fR, \fB\-\-rx\-depth\fR=\fIDEPTH\fR
+post \fIDEPTH\fR receives at a time (default 1000)
+.TP
+\fB\-n\fR, \fB\-\-iters\fR=\fIITERS\fR
+perform \fIITERS\fR message exchanges (default 1000)
+.TP
+\fB\-l\fR, \fB\-\-sl\fR=\fISL\fR
+use \fISL\fR as the service level value of the QP (default 0)
+.TP
+\fB\-m\fR, \fB\-\-mtu\fR=\fISIZE\fR
+path MTU (default 4096)
+
+.SH SEE ALSO
+.BR ibv_rc_pingpong (1),
+.BR ibv_uc_pingpong (1),
+.BR ibv_ud_pingpong (1),
+.BR ibv_srq_pingpong (1)
+
+.SH AUTHORS
+.TP
+Leon Romanovsky
+.RI < leon.romanovsky-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org >
+.TP
+Roland Dreier
+.RI < rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org >
--
1.7.12.4
Thread overview: 6+ messages
2016-01-16 15:53 [PATCH V1 libibverbs 0/5] Add cross-channel support Leon Romanovsky
[not found] ` <1452959624-29454-1-git-send-email-leon-2ukJVAZIZ/Y@public.gmane.org>
2016-01-16 15:53 ` [PATCH libibverbs V1 1/5] Add CQ ignore overrun flag Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 2/5] Add cross-channel QP flags Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 3/5] Add cross-channel work requests primitives Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 4/5] Export cross-channel capability flag Leon Romanovsky
2016-01-16 15:53 ` [PATCH libibverbs V1 5/5] Add an example of cross-channel synchronization Leon Romanovsky