* RE: [PATCH v4 4/8] nvme-rdma: use rdma connection reject helper functions
From: Steve Wise @ 2016-10-26 19:39 UTC (permalink / raw)
To: 'Bart Van Assche', dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <62db5521-9c15-146b-9057-2a22658fa210-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
>
> If you have to resend this patch series please address these comments.
>
> Thanks,
>
> Bart.
Hey Bart, shall I add a reviewed-by tag from you for this patch?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* [PATCH v5 6/8] rds_rdma: log the connection reject message
From: Steve Wise @ 2016-10-26 19:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <cover.1477510827.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
net/rds/rdma_transport.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/rds/rdma_transport.c b/net/rds/rdma_transport.c
index 345f090..d5f3117 100644
--- a/net/rds/rdma_transport.c
+++ b/net/rds/rdma_transport.c
@@ -100,11 +100,14 @@ int rds_rdma_cm_event_handler(struct rdma_cm_id *cm_id,
trans->cm_connect_complete(conn, event);
break;
+ case RDMA_CM_EVENT_REJECTED:
+ rdsdebug("Connection rejected: %s\n",
+ rdma_reject_msg(cm_id, event->status));
+ /* FALLTHROUGH */
case RDMA_CM_EVENT_ADDR_ERROR:
case RDMA_CM_EVENT_ROUTE_ERROR:
case RDMA_CM_EVENT_CONNECT_ERROR:
case RDMA_CM_EVENT_UNREACHABLE:
- case RDMA_CM_EVENT_REJECTED:
case RDMA_CM_EVENT_DEVICE_REMOVAL:
case RDMA_CM_EVENT_ADDR_CHANGE:
if (conn)
--
2.7.0
* [PATCH v5 8/8] nvmet_rdma: log the connection reject message
From: Steve Wise @ 2016-10-26 19:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <cover.1477510827.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Acked-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/nvme/target/rdma.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 1cbe6e0..8315224 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -1358,6 +1358,9 @@ static int nvmet_rdma_cm_handler(struct rdma_cm_id *cm_id,
ret = nvmet_rdma_device_removal(cm_id, queue);
break;
case RDMA_CM_EVENT_REJECTED:
+ pr_debug("Connection rejected: %s\n",
+ rdma_reject_msg(cm_id, event->status));
+ /* FALLTHROUGH */
case RDMA_CM_EVENT_UNREACHABLE:
case RDMA_CM_EVENT_CONNECT_ERROR:
nvmet_rdma_queue_connect_fail(cm_id, queue);
--
2.7.0
* [PATCH v5 7/8] ib_isert: log the connection reject message
From: Steve Wise @ 2016-10-26 19:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <cover.1477510827.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Acked-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/infiniband/ulp/isert/ib_isert.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index cae9bbc..5331272 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -795,6 +795,8 @@ isert_cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *event)
*/
return 1;
case RDMA_CM_EVENT_REJECTED: /* FALLTHRU */
+ isert_info("Connection rejected: %s\n",
+ rdma_reject_msg(cma_id, event->status));
case RDMA_CM_EVENT_UNREACHABLE: /* FALLTHRU */
case RDMA_CM_EVENT_CONNECT_ERROR:
ret = isert_connect_error(cma_id);
--
2.7.0
* [PATCH v5 5/8] ib_iser: log the connection reject message
From: Steve Wise @ 2016-10-26 19:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <cover.1477510827.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Acked-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/infiniband/ulp/iser/iser_verbs.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 1b49453..6eeffb2 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -906,11 +906,14 @@ static int iser_cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *eve
case RDMA_CM_EVENT_ESTABLISHED:
iser_connected_handler(cma_id, event->param.conn.private_data);
break;
+ case RDMA_CM_EVENT_REJECTED:
+ iser_info("Connection rejected: %s\n",
+ rdma_reject_msg(cma_id, event->status));
+ /* FALLTHROUGH */
case RDMA_CM_EVENT_ADDR_ERROR:
case RDMA_CM_EVENT_ROUTE_ERROR:
case RDMA_CM_EVENT_CONNECT_ERROR:
case RDMA_CM_EVENT_UNREACHABLE:
- case RDMA_CM_EVENT_REJECTED:
iser_connect_error(cma_id);
break;
case RDMA_CM_EVENT_DISCONNECTED:
--
2.7.0
* [PATCH v5 4/8] nvme-rdma: use rdma connection reject helper functions
From: Steve Wise @ 2016-10-26 19:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <cover.1477510827.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Also add nvme cm status strings and use them.
Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/nvme/host/rdma.c | 42 ++++++++++++++++++++++++++++++++++++------
1 file changed, 36 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index fbdb226..fcf2079 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -43,6 +43,28 @@
#define NVME_RDMA_MAX_INLINE_SEGMENTS 1
+static const char *const nvme_rdma_cm_status_strs[] = {
+ [NVME_RDMA_CM_INVALID_LEN] = "invalid length",
+ [NVME_RDMA_CM_INVALID_RECFMT] = "invalid record format",
+ [NVME_RDMA_CM_INVALID_QID] = "invalid queue ID",
+ [NVME_RDMA_CM_INVALID_HSQSIZE] = "invalid host SQ size",
+ [NVME_RDMA_CM_INVALID_HRQSIZE] = "invalid host RQ size",
+ [NVME_RDMA_CM_NO_RSC] = "resource not found",
+ [NVME_RDMA_CM_INVALID_IRD] = "invalid IRD",
+ [NVME_RDMA_CM_INVALID_ORD] = "invalid ORD",
+};
+
+static const char *nvme_rdma_cm_msg(enum nvme_rdma_cm_status status)
+{
+ size_t index = status;
+
+ if (index < ARRAY_SIZE(nvme_rdma_cm_status_strs) &&
+ nvme_rdma_cm_status_strs[index])
+ return nvme_rdma_cm_status_strs[index];
+ else
+ return "unrecognized reason";
+}
+
/*
* We handle AEN commands ourselves and don't even let the
* block layer know about them.
@@ -1222,16 +1244,24 @@ out_destroy_queue_ib:
static int nvme_rdma_conn_rejected(struct nvme_rdma_queue *queue,
struct rdma_cm_event *ev)
{
- if (ev->param.conn.private_data_len) {
- struct nvme_rdma_cm_rej *rej =
- (struct nvme_rdma_cm_rej *)ev->param.conn.private_data;
+ struct rdma_cm_id *cm_id = queue->cm_id;
+ int status = ev->status;
+ const char *rej_msg;
+ const struct nvme_rdma_cm_rej *rej_data;
+ u8 rej_data_len;
+
+ rej_msg = rdma_reject_msg(cm_id, status);
+ rej_data = rdma_consumer_reject_data(cm_id, ev, &rej_data_len);
+
+ if (rej_data && rej_data_len >= sizeof(u16)) {
+ u16 sts = le16_to_cpu(rej_data->sts);
dev_err(queue->ctrl->ctrl.device,
- "Connect rejected, status %d.", le16_to_cpu(rej->sts));
- /* XXX: Think of something clever to do here... */
+ "Connect rejected: status %d (%s) nvme status %d (%s).\n",
+ status, rej_msg, sts, nvme_rdma_cm_msg(sts));
} else {
dev_err(queue->ctrl->ctrl.device,
- "Connect rejected, no private data.\n");
+ "Connect rejected: status %d (%s).\n", status, rej_msg);
}
return -ECONNRESET;
--
2.7.0
* [PATCH v5 3/8] rdma_cm: add rdma_consumer_reject_data helper function
From: Steve Wise @ 2016-10-26 19:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <cover.1477510827.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
rdma_consumer_reject_data() will return the private data pointer
and length if any is available.
Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/infiniband/core/cma.c | 16 ++++++++++++++++
include/rdma/rdma_cm.h | 10 ++++++++++
2 files changed, 26 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 71d2a06..8399149 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -128,6 +128,22 @@ bool rdma_is_consumer_reject(struct rdma_cm_id *id, int reason)
}
EXPORT_SYMBOL(rdma_is_consumer_reject);
+const void *rdma_consumer_reject_data(struct rdma_cm_id *id,
+ struct rdma_cm_event *ev, u8 *data_len)
+{
+ const void *p;
+
+ if (rdma_is_consumer_reject(id, ev->status)) {
+ *data_len = ev->param.conn.private_data_len;
+ p = ev->param.conn.private_data;
+ } else {
+ *data_len = 0;
+ p = NULL;
+ }
+ return p;
+}
+EXPORT_SYMBOL(rdma_consumer_reject_data);
+
static void cma_add_one(struct ib_device *device);
static void cma_remove_one(struct ib_device *device, void *client_data);
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 62039c2..d3968b5 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -403,4 +403,14 @@ const char *__attribute_const__ rdma_reject_msg(struct rdma_cm_id *id,
*/
bool rdma_is_consumer_reject(struct rdma_cm_id *id, int reason);
+/**
+ * rdma_consumer_reject_data - return the consumer reject private data and
+ * length, if any.
+ * @id: Communication identifier that received the REJECT event.
+ * @ev: RDMA CM reject event.
+ * @data_len: Pointer to the resulting length of the consumer data.
+ */
+const void *rdma_consumer_reject_data(struct rdma_cm_id *id,
+ struct rdma_cm_event *ev, u8 *data_len);
+
#endif /* RDMA_CM_H */
--
2.7.0
* [PATCH v5 2/8] rdma_cm: add rdma_is_consumer_reject() helper function
From: Steve Wise @ 2016-10-26 19:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <cover.1477510827.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Return true if the peer consumer application rejected the
connection attempt.
Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/infiniband/core/cma.c | 13 +++++++++++++
include/rdma/rdma_cm.h | 7 +++++++
2 files changed, 20 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 427f74e..71d2a06 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -115,6 +115,19 @@ const char *__attribute_const__ rdma_reject_msg(struct rdma_cm_id *id,
}
EXPORT_SYMBOL(rdma_reject_msg);
+bool rdma_is_consumer_reject(struct rdma_cm_id *id, int reason)
+{
+ if (rdma_ib_or_roce(id->device, id->port_num))
+ return reason == IB_CM_REJ_CONSUMER_DEFINED;
+
+ if (rdma_protocol_iwarp(id->device, id->port_num))
+ return reason == -ECONNREFUSED;
+
+ WARN_ON_ONCE(1);
+ return false;
+}
+EXPORT_SYMBOL(rdma_is_consumer_reject);
+
static void cma_add_one(struct ib_device *device);
static void cma_remove_one(struct ib_device *device, void *client_data);
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index f11a768..62039c2 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -395,5 +395,12 @@ __be64 rdma_get_service_id(struct rdma_cm_id *id, struct sockaddr *addr);
*/
const char *__attribute_const__ rdma_reject_msg(struct rdma_cm_id *id,
int reason);
+/**
+ * rdma_is_consumer_reject - return true if the consumer rejected the connect
+ * request.
+ * @id: Communication identifier that received the REJECT event.
+ * @reason: Value returned in the REJECT event status field.
+ */
+bool rdma_is_consumer_reject(struct rdma_cm_id *id, int reason);
#endif /* RDMA_CM_H */
--
2.7.0
* [PATCH v5 1/8] rdma_cm: add rdma_reject_msg() helper function
From: Steve Wise @ 2016-10-26 19:36 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <cover.1477510827.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
rdma_reject_msg() returns a pointer to a string message associated with
the transport reject reason codes.
Reviewed-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/infiniband/core/cm.c | 48 ++++++++++++++++++++++++++++++++++++++++++
drivers/infiniband/core/cma.c | 14 ++++++++++++
drivers/infiniband/core/iwcm.c | 21 ++++++++++++++++++
include/rdma/ib_cm.h | 6 ++++++
include/rdma/iw_cm.h | 6 ++++++
include/rdma/rdma_cm.h | 8 +++++++
6 files changed, 103 insertions(+)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index c995255..6c64d0c 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -57,6 +57,54 @@ MODULE_AUTHOR("Sean Hefty");
MODULE_DESCRIPTION("InfiniBand CM");
MODULE_LICENSE("Dual BSD/GPL");
+static const char * const ibcm_rej_reason_strs[] = {
+ [IB_CM_REJ_NO_QP] = "no QP",
+ [IB_CM_REJ_NO_EEC] = "no EEC",
+ [IB_CM_REJ_NO_RESOURCES] = "no resources",
+ [IB_CM_REJ_TIMEOUT] = "timeout",
+ [IB_CM_REJ_UNSUPPORTED] = "unsupported",
+ [IB_CM_REJ_INVALID_COMM_ID] = "invalid comm ID",
+ [IB_CM_REJ_INVALID_COMM_INSTANCE] = "invalid comm instance",
+ [IB_CM_REJ_INVALID_SERVICE_ID] = "invalid service ID",
+ [IB_CM_REJ_INVALID_TRANSPORT_TYPE] = "invalid transport type",
+ [IB_CM_REJ_STALE_CONN] = "stale conn",
+ [IB_CM_REJ_RDC_NOT_EXIST] = "RDC not exist",
+ [IB_CM_REJ_INVALID_GID] = "invalid GID",
+ [IB_CM_REJ_INVALID_LID] = "invalid LID",
+ [IB_CM_REJ_INVALID_SL] = "invalid SL",
+ [IB_CM_REJ_INVALID_TRAFFIC_CLASS] = "invalid traffic class",
+ [IB_CM_REJ_INVALID_HOP_LIMIT] = "invalid hop limit",
+ [IB_CM_REJ_INVALID_PACKET_RATE] = "invalid packet rate",
+ [IB_CM_REJ_INVALID_ALT_GID] = "invalid alt GID",
+ [IB_CM_REJ_INVALID_ALT_LID] = "invalid alt LID",
+ [IB_CM_REJ_INVALID_ALT_SL] = "invalid alt SL",
+ [IB_CM_REJ_INVALID_ALT_TRAFFIC_CLASS] = "invalid alt traffic class",
+ [IB_CM_REJ_INVALID_ALT_HOP_LIMIT] = "invalid alt hop limit",
+ [IB_CM_REJ_INVALID_ALT_PACKET_RATE] = "invalid alt packet rate",
+ [IB_CM_REJ_PORT_CM_REDIRECT] = "port CM redirect",
+ [IB_CM_REJ_PORT_REDIRECT] = "port redirect",
+ [IB_CM_REJ_INVALID_MTU] = "invalid MTU",
+ [IB_CM_REJ_INSUFFICIENT_RESP_RESOURCES] = "insufficient resp resources",
+ [IB_CM_REJ_CONSUMER_DEFINED] = "consumer defined",
+ [IB_CM_REJ_INVALID_RNR_RETRY] = "invalid RNR retry",
+ [IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID] = "duplicate local comm ID",
+ [IB_CM_REJ_INVALID_CLASS_VERSION] = "invalid class version",
+ [IB_CM_REJ_INVALID_FLOW_LABEL] = "invalid flow label",
+ [IB_CM_REJ_INVALID_ALT_FLOW_LABEL] = "invalid alt flow label",
+};
+
+const char *__attribute_const__ ibcm_reject_msg(int reason)
+{
+ size_t index = reason;
+
+ if (index < ARRAY_SIZE(ibcm_rej_reason_strs) &&
+ ibcm_rej_reason_strs[index])
+ return ibcm_rej_reason_strs[index];
+ else
+ return "unrecognized reason";
+}
+EXPORT_SYMBOL(ibcm_reject_msg);
+
static void cm_add_one(struct ib_device *device);
static void cm_remove_one(struct ib_device *device, void *client_data);
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 5f65a78..427f74e 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -101,6 +101,20 @@ const char *__attribute_const__ rdma_event_msg(enum rdma_cm_event_type event)
}
EXPORT_SYMBOL(rdma_event_msg);
+const char *__attribute_const__ rdma_reject_msg(struct rdma_cm_id *id,
+ int reason)
+{
+ if (rdma_ib_or_roce(id->device, id->port_num))
+ return ibcm_reject_msg(reason);
+
+ if (rdma_protocol_iwarp(id->device, id->port_num))
+ return iwcm_reject_msg(reason);
+
+ WARN_ON_ONCE(1);
+ return "unrecognized transport";
+}
+EXPORT_SYMBOL(rdma_reject_msg);
+
static void cma_add_one(struct ib_device *device);
static void cma_remove_one(struct ib_device *device, void *client_data);
diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c
index 357624f..c8721c54 100644
--- a/drivers/infiniband/core/iwcm.c
+++ b/drivers/infiniband/core/iwcm.c
@@ -59,6 +59,27 @@ MODULE_AUTHOR("Tom Tucker");
MODULE_DESCRIPTION("iWARP CM");
MODULE_LICENSE("Dual BSD/GPL");
+static const char * const iwcm_rej_reason_strs[] = {
+ [ECONNRESET] = "reset by remote host",
+ [ECONNREFUSED] = "refused by remote application",
+ [ETIMEDOUT] = "setup timeout",
+};
+
+const char *__attribute_const__ iwcm_reject_msg(int reason)
+{
+ size_t index;
+
+ /* iWARP uses negative errnos */
+ index = -reason;
+
+ if (index < ARRAY_SIZE(iwcm_rej_reason_strs) &&
+ iwcm_rej_reason_strs[index])
+ return iwcm_rej_reason_strs[index];
+ else
+ return "unrecognized reason";
+}
+EXPORT_SYMBOL(iwcm_reject_msg);
+
static struct ibnl_client_cbs iwcm_nl_cb_table[] = {
[RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb},
[RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb},
diff --git a/include/rdma/ib_cm.h b/include/rdma/ib_cm.h
index 92a7d85..b49258b 100644
--- a/include/rdma/ib_cm.h
+++ b/include/rdma/ib_cm.h
@@ -603,4 +603,10 @@ struct ib_cm_sidr_rep_param {
int ib_send_cm_sidr_rep(struct ib_cm_id *cm_id,
struct ib_cm_sidr_rep_param *param);
+/**
+ * ibcm_reject_msg - return a pointer to a reject message string.
+ * @reason: Value returned in the REJECT event status field.
+ */
+const char *__attribute_const__ ibcm_reject_msg(int reason);
+
#endif /* IB_CM_H */
diff --git a/include/rdma/iw_cm.h b/include/rdma/iw_cm.h
index 6d0065c..5cd7701 100644
--- a/include/rdma/iw_cm.h
+++ b/include/rdma/iw_cm.h
@@ -253,4 +253,10 @@ int iw_cm_disconnect(struct iw_cm_id *cm_id, int abrupt);
int iw_cm_init_qp_attr(struct iw_cm_id *cm_id, struct ib_qp_attr *qp_attr,
int *qp_attr_mask);
+/**
+ * iwcm_reject_msg - return a pointer to a reject message string.
+ * @reason: Value returned in the REJECT event status field.
+ */
+const char *__attribute_const__ iwcm_reject_msg(int reason);
+
#endif /* IW_CM_H */
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 81fb1d1..f11a768 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -388,4 +388,12 @@ int rdma_set_afonly(struct rdma_cm_id *id, int afonly);
*/
__be64 rdma_get_service_id(struct rdma_cm_id *id, struct sockaddr *addr);
+/**
+ * rdma_reject_msg - return a pointer to a reject message string.
+ * @id: Communication identifier that received the REJECT event.
+ * @reason: Value returned in the REJECT event status field.
+ */
+const char *__attribute_const__ rdma_reject_msg(struct rdma_cm_id *id,
+ int reason);
+
#endif /* RDMA_CM_H */
--
2.7.0
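The lookup in iwcm_reject_msg() above can be tried out in userspace. What follows is a minimal sketch, assuming a Linux-style <errno.h>; the names rej_reason_strs and reject_msg are hypothetical stand-ins and this is not the kernel code. It shows the same three steps: negate the errno to get an index, bounds-check against the table size, and treat an empty slot as unknown.

```c
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Sparse table: any slot without a designated initializer is NULL. */
static const char *const rej_reason_strs[] = {
	[ECONNRESET]   = "reset by remote host",
	[ECONNREFUSED] = "refused by remote application",
	[ETIMEDOUT]    = "setup timeout",
};

/* Hypothetical userspace stand-in for iwcm_reject_msg(). */
static const char *reject_msg(int reason)
{
	/*
	 * Reasons arrive as negative errnos, so negate to get the index.
	 * Widening first keeps the negation defined for INT_MIN; a positive
	 * reason wraps to a huge size_t and fails the bounds check below.
	 */
	size_t index = (size_t)(-(long long)reason);

	if (index < ARRAY_SIZE(rej_reason_strs) && rej_reason_strs[index])
		return rej_reason_strs[index];

	return "unrecognized reason";
}
```

Both checks matter: the bounds check rejects values past the largest errno in the table, and the NULL check rejects the holes that designated initializers leave between entries.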
* Re: [PATCH rdma-core 7/7] libhns: Add consolidated repo for userspace library of hns
From: Jason Gunthorpe @ 2016-10-26 16:33 UTC (permalink / raw)
To: Lijun Ou
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1477487048-62256-8-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
On Wed, Oct 26, 2016 at 09:04:08PM +0800, Lijun Ou wrote:
> +CHECK_C_SOURCE_COMPILES("
> +#ifndef __ARM64__
> +#error Failed
> +#endif
> + int main(int argc,const char *argv[]) { return 1; }"
> + HAVE_ARCH_ARM64)
> +
> +if (HAVE_ARCH_ARM64)
I don't see a compilation problem on x86, so please do not do
this. For maintainability we need all providers to compile on x86.
For now just drop in a '# FIXME: Kernel driver only builds on ARM64'
and maybe we can optimize things someday to always build but not
install the .so
> +rdma_provider(hns
> + hns_roce_u_buf.c
> + hns_roce_u.c
> + hns_roce_u_hw_v1.c
> + hns_roce_u_verbs.c
List should be sorted
Jason
* Re: [PATCH rdma-core 5/7] libhns: Add verbs of qp support
From: Jason Gunthorpe @ 2016-10-26 16:23 UTC (permalink / raw)
To: Lijun Ou
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1477487048-62256-6-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
On Wed, Oct 26, 2016 at 09:04:06PM +0800, Lijun Ou wrote:
> +#ifndef min
> +#define min(a, b) \
> + ({ typeof (a) _a = (a); \
> + typeof (b) _b = (b); \
> + _a < _b ? _a : _b; })
> +#endif
Nope, use the ccan header
Jason
* Re: [PATCH rdma-core 1/7] libhns: Add initial main frame
From: Jason Gunthorpe @ 2016-10-26 16:20 UTC (permalink / raw)
To: Lijun Ou
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1477487048-62256-2-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
On Wed, Oct 26, 2016 at 09:04:02PM +0800, Lijun Ou wrote:
> +static struct ibv_device *hns_roce_driver_init(const char *uverbs_sys_path,
> + int abi_version)
> +{
> + struct hns_roce_device *dev;
> + char value[128];
> + int i;
> +
> + if (ibv_read_sysfs_file(uverbs_sys_path, "device/modalias",
> + value, sizeof(value)) > 0)
> + for (i = 0; i < sizeof(acpi_table) / sizeof(acpi_table[0]); ++i)
> + if (!strcmp(value, acpi_table[i].hid))
> + goto found;
You shouldn't need to do both modalias and compatible, there should be
an acceptable modalias for the DT version too.
But I wonder if this isn't generically better to be
last_dir(readlink("device/driver")) == "hns"
instead?
Jason
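The last_dir(readlink("device/driver")) idea could look roughly like the sketch below. It assumes POSIX readlink(2) and a sysfs-style layout in which <dev_path>/device/driver is a symlink to the bound driver's directory; the helper name driver_link_matches is hypothetical and is not an existing libibverbs function.

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: return true if the final component of the symlink
 * target at <dev_path>/device/driver equals `name`.
 */
static bool driver_link_matches(const char *dev_path, const char *name)
{
	char link[PATH_MAX];
	char target[PATH_MAX];
	const char *last;
	ssize_t len;
	int n;

	n = snprintf(link, sizeof(link), "%s/device/driver", dev_path);
	if (n < 0 || (size_t)n >= sizeof(link))
		return false;

	/* readlink() does not NUL-terminate, so leave room and do it here. */
	len = readlink(link, target, sizeof(target) - 1);
	if (len < 0)
		return false;
	target[len] = '\0';

	/* Last path component: text after the final '/', or the whole string. */
	last = strrchr(target, '/');
	last = last ? last + 1 : target;

	return strcmp(last, name) == 0;
}
```

A provider's init function would then call something like driver_link_matches(uverbs_sys_path, "hns") instead of walking a table of modalias strings. Whether "hns" is the right driver directory name is left as the open question it is in the review.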
* RE: [PATCH v4 4/8] nvme-rdma: use rdma connection reject helper functions
From: Steve Wise @ 2016-10-26 16:17 UTC (permalink / raw)
To: 'Bart Van Assche', dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <62db5521-9c15-146b-9057-2a22658fa210-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
>
> On 10/25/2016 12:34 PM, Steve Wise wrote:
> > + rej_data = (struct nvme_rdma_cm_rej *)
> > + rdma_consumer_reject_data(cm_id, ev, &rej_data_len);
>
> This cast casts away constness; that's ugly. If the data type of
> rej_data would be changed into const ... * then no cast would have been
> necessary.
>
Ok.
> > + if (rej_data && rej_data_len >= sizeof(u16)) {
> > + u16 sts = le16_to_cpu(rej_data->sts);
> >
> > dev_err(queue->ctrl->ctrl.device,
> > - "Connect rejected, status %d.", le16_to_cpu(rej->sts));
> > - /* XXX: Think of something clever to do here... */
> > - } else {
> > + "Connect rejected: status %d (%s) nvme status %d (%s).\n",
> > + status, rej_msg, sts, nvme_rdma_cm_msg(sts));
> > + } else
> > dev_err(queue->ctrl->ctrl.device,
> > - "Connect rejected, no private data.\n");
> > - }
> > + "Connect rejected: status %d (%s).\n", status, rej_msg);
>
> Braces are not balanced for this if-then-else statement :-(
>
> If you have to resend this patch series please address these comments.
>
Sure, I'll spin another version with these changes. This is destined for
4.10 so we have time. :)
Thanks for reviewing!
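Bart's point about the cast can be shown in a small userspace sketch. The types and helpers here (struct rej_payload, get_reject_data, reject_status) are hypothetical and only loosely modelled on the patch; byte-order conversion is left out. Once the local pointer is declared const, the const void * return value converts implicitly and no cast is needed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reject payload; not the real struct nvme_rdma_cm_rej. */
struct rej_payload {
	uint16_t recfmt;
	uint16_t sts;
};

/* Hypothetical stand-in for a helper that returns read-only private data. */
static const void *get_reject_data(const struct rej_payload *src, uint8_t *len)
{
	*len = src ? sizeof(*src) : 0;
	return src;
}

/* Returns the status field, or -1 if the payload is too short to hold it. */
static int reject_status(const struct rej_payload *src)
{
	const struct rej_payload *rej;	/* const-qualified, so no cast below */
	uint8_t len;

	rej = get_reject_data(src, &len);	/* const void * converts implicitly */
	if (!rej || len < offsetof(struct rej_payload, sts) + sizeof(rej->sts))
		return -1;

	return rej->sts;
}
```

The length check in this sketch is written against the offset of the field that is actually read, which is one way to keep the check correct if the payload layout changes.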
* Re: [For help] rdma-roce build quesiton
From: Jason Gunthorpe @ 2016-10-26 16:09 UTC (permalink / raw)
To: oulijun; +Cc: Linuxarm, linux-rdma
In-Reply-To: <581059FA.6070507-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
On Wed, Oct 26, 2016 at 03:23:38PM +0800, oulijun wrote:
> the build fails and the log prints the following:
>
> error: size of unnamed array is negative
> attr->cap.max_recv_wr = min(context->max_qp_wr, attr->cap.max_recv_wr);
It is telling you the types are not the same, and this is a source of bugs
as C has some counterintuitive rules regarding type promotion.
1) Audit max_qp_wr and max_recv_wr to see if they really should be
different types, if not fix context->max_qp_wr to match
2) If they are legitimately different then use
min_t(<desired type>, context->max_qp_wr, attr->cap.max_recv_wr);
Think carefully about what common type is used because both arguments
will be cast, and the goal is to avoid a loss of precision or
signedness in the cast.
Jason
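A small sketch of why the choice of common type matters. It assumes a compiler with GNU statement expressions (gcc or clang), as the min() macro in the libhns patch already does. The min_t below is a simplified illustration and not the ccan or kernel definition, and the function names are hypothetical.

```c
#include <stdint.h>

/* Simplified min_t: cast both operands to one named type, then compare. */
#define min_t(type, a, b) \
	({ type _a = (type)(a); type _b = (type)(b); _a < _b ? _a : _b; })

/*
 * What "a < b" does implicitly for int vs. unsigned int: the int operand
 * is converted to unsigned first, so -1 becomes UINT_MAX.
 */
static int implicit_less(int a, unsigned int b)
{
	return (unsigned int)a < b;
}

/* Common type wide enough for both ranges: no value is lost. */
static int64_t min_wide(int a, uint32_t b)
{
	return min_t(int64_t, a, b);
}

/* Common type too narrow for negative values: the sign is lost. */
static uint32_t min_narrow(int a, uint32_t b)
{
	return min_t(uint32_t, a, b);
}
```

With a = -1 and b = 5, min_wide() yields -1 while min_narrow() yields 5, because the cast to uint32_t turns -1 into 4294967295 before the comparison. If both fields can never be negative, fixing the declarations so the types match (option 1) avoids the question entirely.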
* Re: [PATCH] Avoid possible hang on device removal
From: Jason Gunthorpe @ 2016-10-26 16:03 UTC (permalink / raw)
To: Steve Wise
Cc: 'Mustafa Ismail', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb,
dledford-H+wXaHxf7aLQT0dZR+AlfA, leon-DgEjT+Ai2ygdnm+yROfE0A
In-Reply-To: <064101d22f8f$3b6e7850$b24b68f0$@opengridcomputing.com>
On Wed, Oct 26, 2016 at 08:45:33AM -0500, Steve Wise wrote:
> > > Reviewed-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> > >
> > > Also, this fixes a previous commit, so you could add this tag:
> > >
> > > Fixes: 612eae1f6fe3 ("rping: ignore flushed completions")
> > >
> > > However that commit id is from the previous librdmacm repo...not
> > > sure how useful it is?
> >
> > It still exists in the new repo, I'd include the fixes line.
> >
> > Jason
>
> Hey Jason, how do I see the history?
>
> [root@stevo1 rdma-core]# git log --pretty=oneline --abbrev-commit librdmacm/examples/rping.c
> 9024876 rdmacm: Use correct format specifier for size_t
> 400122e Be explicit about _GNU_SOURCE
> 663098b Rename librdmacm
Add --follow
$ git log --follow --pretty=oneline --abbrev-commit librdmacm/examples/rping.c
44579aeb63ee Enable -Wwrite-strings
90248767130f rdmacm: Use correct format specifier for size_t
400122ef15bd Be explicit about _GNU_SOURCE
663098bfc3ac Rename librdmacm
171176797516 [v1, 1/1, librdmacm] examples/rping.c: fix unwanted abort during qp creation
c9ac6566b26c [librdmacm] examples: Use gai_strerror rather than perror for [rdma_]getaddrinfo failures
5c5bd081e37a rping: create persistent server threads in DETACHED state
612eae1f6fe3 rping: ignore flushed completions
5ae36aba6f95 rping: Fixes race, where ibv context was getting freed before memory was deregistered
b70a390d8bd8 rping: Fix server reporting error on exit
e57196c71ddd [5/5,librdmacm] rping: added checks to the return values functions
860b1a8784f1 rping: Reduce retry_count to fit in 3-bits
5658ff385e04 rping: Replace sprintf with snprintf to protect from buffer overflow
93635fa33b41 librdmacm/rping: Make sure CQ event thread exits before destroying the CQ
8c6aeb3e70bb RPING: Remove printf for FLUSH completion.
4e33a4109a62 librdmacm: returns errors from the library consistently
267c28a2f03b rping: add ipv6 support
ea9b03238b13 librdmacm/rping: allow specifying hostnames in place of IP addresses
163f48d410fa librdmacm/rping: fix duplicate usage message
308b811d8d6b librdmacm: implement address change event
1beed21b111e librdmacm/examples
[..]
Jason
^ permalink raw reply
* Re: mlx4 RoCE mode without OFED?
From: Robert LeBlanc @ 2016-10-26 15:53 UTC (permalink / raw)
To: Matan Barak; +Cc: linux-rdma
In-Reply-To: <CAAKD3BCA=dOKxb-h5+VRp0DG=CN0e4Xj91hLgXSHtqAt=apMkQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
That commit is from linux-stable [0] introduced in 4.5-rc1 and may be
more of an internal thing. It looks like RDMA is able to work on the
4.8.4 kernel, but not on the 4.4.24 kernel (what we are using in
production currently). We are interested in RoCE v2. Does this mean
that the non-Pro card can't do RoCE v2 even with OFED? We have some
ConnectX-4 Lx cards that we are testing specifically for RoCE v2, but
if we move to RoCE it would be nice to re-purpose the IB cards.
Thanks,
Robert LeBlanc
[0] http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed, Oct 26, 2016 at 5:53 AM, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On Tue, Oct 25, 2016 at 11:28 PM, Robert LeBlanc <robert-4JaGZRWAfWbajFs6igw21g@public.gmane.org> wrote:
>> I've been trying to get our ConnectX-3 card to run RoCE (Ethernet mode
>> is working) [0], but I can't pass the roce_mode to the mlx4_core
>> module even with 4.8.4. Do you have to use OFED to use RoCE with
>> ConnectX-3? According to the spec sheet, the card should support RoCE
>
> roce_mode parameter is OFED only. When you use upstream, ConnectX-3 supports
> RoCE v1 automatically while ConnectX-3 pro supports both RoCE v1 and RoCE v2.
> You don't have to specify any module parameter for that.
>
>> [1]. We generally run 4.4.x kernel and so getting OFED to compile is
>> quite the challenge, so we are looking for an upstream solution. It
>> seems that commit 3afd8362fabd167bb04f79501f21dd67aa9cb99f added some
>> bits to add roce_mode to the module, but I don't see it in the module.
>>
>
> I couldn't find this commit. Is it OFED?
>
>> Linux localhost 4.8.4 #3 SMP Tue Oct 25 13:28:15 MDT 2016 x86_64
>> x86_64 x86_64 GNU/Linux
>>
>> # mstflint -d 02:00.0 q
>> Image type: FS2
>> FW Version: 2.35.5100
>> Rom Info: type=PXE version=3.4.648 devid=4099
>> Device ID: 4099
>> Description: Node Port1 Port2 Sys image
>> GUIDs: 0cc47affff4fe9fc 0cc47affff4fe9fd 0cc47affff4fe9fe
>> 0cc47affff4fe9ff
>> MACs: 0cc47a4fe9fd 0cc47a4fe9fe
>> VSD: n/a
>> PSID: SM_2221000001000
>>
>> [Tue Oct 25 13:40:21 2016] mlx4_core: unknown parameter 'roce_mode' ignored
>> [Tue Oct 25 13:40:21 2016] mlx4_core: unknown parameter 'roce_mode' ignored
>>
>> # modinfo mlx4_core
>> filename:
>> /lib/modules/4.8.4/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko
>> version: 2.2-1
>> license: Dual BSD/GPL
>> description: Mellanox ConnectX HCA low-level driver
>> author: Roland Dreier
>> srcversion: BB58E84E637E4E5EC69D04E
>> alias: pci:v000015B3d00001010sv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000100Fsv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000100Esv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000100Dsv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000100Csv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000100Bsv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000100Asv*sd*bc*sc*i*
>> alias: pci:v000015B3d00001009sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00001008sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00001007sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00001006sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00001005sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00001004sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00001003sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00001002sv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000676Esv*sd*bc*sc*i*
>> alias: pci:v000015B3d00006746sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00006764sv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000675Asv*sd*bc*sc*i*
>> alias: pci:v000015B3d00006372sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00006750sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00006368sv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000673Csv*sd*bc*sc*i*
>> alias: pci:v000015B3d00006732sv*sd*bc*sc*i*
>> alias: pci:v000015B3d00006354sv*sd*bc*sc*i*
>> alias: pci:v000015B3d0000634Asv*sd*bc*sc*i*
>> alias: pci:v000015B3d00006340sv*sd*bc*sc*i*
>> depends:
>> intree: Y
>> vermagic: 4.8.4 SMP mod_unload
>> parm: debug_level:Enable debug tracing if > 0 (int)
>> parm: msi_x:attempt to use MSI-X if nonzero (int)
>> parm: num_vfs:enable #num_vfs functions if num_vfs > 0
>> num_vfs=port1,port2,port1+2 (array of byte)
>> parm: probe_vf:number of vfs to probe by pf driver (num_vfs > 0)
>> probe_vf=port1,port2,port1+2 (array of byte)
>> parm: log_num_mgm_entry_size:log mgm size, that defines the
>> num of qp per mcg, for example: 10 gives 248.range: 7 <=
>> log_num_mgm_entry_size <= 12. To activate device managed flow steering
>> when available, set to
>> -1 (int)
>> parm: enable_64b_cqe_eqe:Enable 64 byte CQEs/EQEs when the
>> FW supports this (default: True) (bool)
>> parm: enable_4k_uar:Enable using 4K UAR. Should not be
>> enabled if have VFs which do not support 4K UARs (default: false)
>> (bool)
>> parm: log_num_mac:Log2 max number of MACs per ETH port (1-7) (int)
>> parm: log_num_vlan:Log2 max number of VLANs per ETH port (0-7) (int)
>> parm: use_prio:Enable steering by VLAN priority on ETH ports
>> (deprecated) (bool)
>> parm: log_mtts_per_seg:Log2 number of MTT entries per
>> segment (1-7) (int)
>> parm: port_type_array:Array of port types: HW_DEFAULT (0) is
>> default 1 for IB, 2 for Ethernet (array of int)
>> parm: enable_qos:Enable Enhanced QoS support (default: on) (bool)
>> parm: internal_err_reset:Reset device on internal errors if
>> non-zero (default 1) (int)
>>
>> Thanks,
>> Robert LeBlanc
>>
>
> Regards,
> Matan
>
>> [0] https://community.mellanox.com/docs/DOC-1444
>> [1] http://www.mellanox.com/related-docs/prod_adapter_cards/PB_ConnectX3_VPI_Card.pdf
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
^ permalink raw reply
* Re: [PATCH v4 4/8] nvme-rdma: use rdma connection reject helper functions
From: Bart Van Assche @ 2016-10-26 15:26 UTC (permalink / raw)
To: Steve Wise, dledford-H+wXaHxf7aLQT0dZR+AlfA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
sagi-NQWnxTmZq1alnMjI0IkVqw, hch-jcswGhMUV9g, axboe-b10kYP2dOMg,
santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA
In-Reply-To: <35db2feb4dcac92924992d5655630aa70972ad00.1477426743.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
On 10/25/2016 12:34 PM, Steve Wise wrote:
> + rej_data = (struct nvme_rdma_cm_rej *)
> + rdma_consumer_reject_data(cm_id, ev, &rej_data_len);
This cast casts away constness; that's ugly. If the data type of
rej_data were changed to const ... * then no cast would be
necessary.
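To illustrate the constness point, here is a minimal userspace sketch. It assumes a helper that returns read-only data as const void *, which is what the quoted hunk suggests. The names fake_reject_data() and struct rej_msg are hypothetical stand-ins, not the kernel API. Once the local pointer is declared const, the implicit conversion from const void * is enough and the cast disappears.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a reject payload; not the real nvme struct. */
struct rej_msg {
	uint16_t recfmt;
	uint16_t sts;
};

static const struct rej_msg sample = { .recfmt = 0, .sts = 2 };

/* Hypothetical helper: hands back read-only private data and its length. */
static const void *fake_reject_data(uint8_t *len)
{
	*len = sizeof(sample);
	return &sample;
}

/*
 * The local pointer is const-qualified, so assigning the const void *
 * result needs no cast and constness is never dropped.
 */
static int read_status(void)
{
	const struct rej_msg *rej;
	uint8_t len = 0;

	rej = fake_reject_data(&len);
	if (rej && len >= sizeof(*rej))
		return rej->sts;
	return -1;
}

/* Small self-check showing the expected result for the sample payload. */
static void read_status_demo(void)
{
	assert(read_status() == 2);
}
```

Calling read_status_demo() exercises the path. The same shape should apply to any consumer that only reads the reject data.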
> + if (rej_data && rej_data_len >= sizeof(u16)) {
> + u16 sts = le16_to_cpu(rej_data->sts);
>
> dev_err(queue->ctrl->ctrl.device,
> - "Connect rejected, status %d.", le16_to_cpu(rej->sts));
> - /* XXX: Think of something clever to do here... */
> - } else {
> + "Connect rejected: status %d (%s) nvme status %d (%s).\n",
> + status, rej_msg, sts, nvme_rdma_cm_msg(sts));
> + } else
> dev_err(queue->ctrl->ctrl.device,
> - "Connect rejected, no private data.\n");
> - }
> + "Connect rejected: status %d (%s).\n", status, rej_msg);
Braces are not balanced for this if-then-else statement :-(
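For reference, the kernel coding style asks for braces on both branches when either branch needs them. A small userspace sketch of the balanced form follows; format_reject() and its arguments are made up for illustration and do not come from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter: when extra status data is present the "if"
 * branch needs braces (it holds a declaration plus a statement), so the
 * single-statement "else" branch gets braces too.
 */
static int format_reject(char *buf, size_t len, int status,
			 const unsigned short *nvme_sts)
{
	if (nvme_sts) {
		unsigned short sts = *nvme_sts;

		return snprintf(buf, len, "status %d nvme status %d",
				status, sts);
	} else {
		return snprintf(buf, len, "status %d", status);
	}
}

/* Small self-check covering both branches. */
static void format_reject_demo(void)
{
	char buf[64];
	unsigned short sts = 3;

	format_reject(buf, sizeof(buf), 8, &sts);
	assert(strcmp(buf, "status 8 nvme status 3") == 0);

	format_reject(buf, sizeof(buf), 8, NULL);
	assert(strcmp(buf, "status 8") == 0);
}
```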
If you have to resend this patch series please address these comments.
Thanks,
Bart.
^ permalink raw reply
* RE: [PATCH] Avoid possible hang on device removal
From: Steve Wise @ 2016-10-26 13:45 UTC (permalink / raw)
To: 'Jason Gunthorpe'
Cc: 'Mustafa Ismail', linux-rdma-u79uwXL29TY76Z2rM5mHXA,
sean.hefty-ral2JQCrhuEAvxtiuMwx3w,
hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb,
dledford-H+wXaHxf7aLQT0dZR+AlfA, leon-DgEjT+Ai2ygdnm+yROfE0A
In-Reply-To: <20161025224318.GA7816-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
> > Reviewed-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> >
> > Also, this fixes a previous commit, so you could add this tag:
> >
> > Fixes: 612eae1f6fe3 ("rping: ignore flushed completions")
> >
> > However that commit id is from the previous librdmacm repo...not
> > sure how useful it is?
>
> It still exists in the new repo, I'd include the fixes line.
>
> Jason
Hey Jason, how do I see the history?
[root@stevo1 rdma-core]# git log --pretty=oneline --abbrev-commit librdmacm/examples/rping.c
9024876 rdmacm: Use correct format specifier for size_t
400122e Be explicit about _GNU_SOURCE
663098b Rename librdmacm
^ permalink raw reply
* Re: [PATCH 0/3] iopmem : A block device for PCIe memory
From: Dan Williams @ 2016-10-26 13:39 UTC (permalink / raw)
To: Haggai Eran
Cc: Jason Gunthorpe, sbates-Rgftl6RXld5BDgjK7y7TUQ, Raj, Ashok,
linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, David Woodhouse,
Jonathan Corbet,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
jim.macdonald-FgSLVYC75IpWk0Htik3J/w, Stephen Bates,
linux-block-u79uwXL29TY76Z2rM5mHXA, Linux MM, Jens Axboe,
Christoph Hellwig
In-Reply-To: <a5418089-2615-8c04-aca8-50ceb43978f1-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
On Wed, Oct 26, 2016 at 1:24 AM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
[..]
>> I wonder if we could (ab)use a
>> software-defined 'pasid' as the requester id for a peer-to-peer
>> mapping that needs address translation.
> Why would you need that? Isn't it enough to map the peer-to-peer
> addresses correctly in the iommu driver?
>
You're right, we might already have enough...
We would just need to audit iommu drivers to undo any assumptions that
the page being mapped is always in host memory and apply any bus
address translations between source device and target device.
^ permalink raw reply
* [bug report] qedr: Add GSI support
From: Dan Carpenter @ 2016-10-26 13:25 UTC (permalink / raw)
To: Ram.Amrani-YGCgFSpz5w/QT0dZR+AlfA; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hello Ram Amrani,
This is a semi-automatic email about new static checker warnings.
The patch 048867793046: "qedr: Add GSI support" from Oct 10, 2016,
leads to the following Smatch complaint:
drivers/infiniband/hw/qedr/qedr_cm.c:284 qedr_gsi_build_header()
warn: variable dereferenced before check 'sgid_attr.ndev' (see line 281)
drivers/infiniband/hw/qedr/qedr_cm.c
280
281 vlan_id = rdma_vlan_dev_vlan_id(sgid_attr.ndev);
^^^^^^^^^^^^^^
Dereference inside function.
282 if (vlan_id < VLAN_CFI_MASK)
283 has_vlan = true;
284 if (sgid_attr.ndev)
^^^^^^^^^^^^^^
Check too late.
285 dev_put(sgid_attr.ndev);
286
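The pattern Smatch flags here can be shown with a small userspace sketch. All names below (struct fake_ndev, fake_vlan_id(), fake_dev_put(), FAKE_VLAN_LIMIT) are hypothetical stand-ins, and this shows only one possible shape of a fix. If the pointer can never be NULL at this point in the driver, dropping the late check would be the other option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_VLAN_LIMIT 0x1000	/* illustrative threshold only */

/* Hypothetical stand-in for a refcounted net device. */
struct fake_ndev {
	unsigned short vlan_id;
	int refcnt;
};

/* Dereferences its argument, so callers must pass a non-NULL pointer. */
static unsigned short fake_vlan_id(const struct fake_ndev *ndev)
{
	return ndev->vlan_id;
}

static void fake_dev_put(struct fake_ndev *ndev)
{
	ndev->refcnt--;
}

/*
 * The NULL check comes first and guards both the dereference and the
 * reference drop, so there is no "dereferenced before check" ordering.
 */
static bool has_vlan(struct fake_ndev *ndev)
{
	bool vlan = false;

	if (ndev) {
		vlan = fake_vlan_id(ndev) < FAKE_VLAN_LIMIT;
		fake_dev_put(ndev);
	}
	return vlan;
}

/* Small self-check for the NULL and non-NULL paths. */
static void has_vlan_demo(void)
{
	struct fake_ndev dev = { .vlan_id = 100, .refcnt = 1 };

	assert(!has_vlan(NULL));
	assert(has_vlan(&dev));
	assert(dev.refcnt == 0);
}
```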
regards,
dan carpenter
^ permalink raw reply
* [PATCH rdma-core 7/7] libhns: Add consolidated repo for userspace library of hns
From: Lijun Ou @ 2016-10-26 13:04 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1477487048-62256-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch configures the consolidated repo to build the userspace
library for hns (libhns).
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
CMakeLists.txt | 1 +
MAINTAINERS | 6 ++++++
README.md | 1 +
providers/hns/CMakeLists.txt | 15 +++++++++++++++
4 files changed, 23 insertions(+)
create mode 100644 providers/hns/CMakeLists.txt
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 230aab5..5ce8e15 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -328,6 +328,7 @@ add_subdirectory(libibcm)
add_subdirectory(providers/cxgb3)
add_subdirectory(providers/cxgb4)
add_subdirectory(providers/hfi1verbs)
+add_subdirectory(providers/hns)
add_subdirectory(providers/i40iw)
add_subdirectory(providers/ipathverbs)
add_subdirectory(providers/mlx4)
diff --git a/MAINTAINERS b/MAINTAINERS
index d83de10..bc6eb50 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -57,6 +57,12 @@ S: Supported
L: intel-opa-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org (moderated for non-subscribers)
F: providers/hfi1verbs/
+HNS USERSPACE PROVIDER (for hns-roce.ko)
+M: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
+M: Wei Hu(Xavier) <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
+S: Supported
+F: providers/hns/
+
I40IW USERSPACE PROVIDER (for i40iw.ko)
M: Tatyana Nikolova <Tatyana.E.Nikolova-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
S: Supported
diff --git a/README.md b/README.md
index 3a13042..e3bc33f 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,7 @@ is included:
- iw_cxgb3.ko
- iw_cxgb4.ko
- hfi1.ko
+ - hns-roce.ko
- i40iw.ko
- ib_qib.ko
- mlx4_ib.ko
diff --git a/providers/hns/CMakeLists.txt b/providers/hns/CMakeLists.txt
new file mode 100644
index 0000000..3a47f3d
--- /dev/null
+++ b/providers/hns/CMakeLists.txt
@@ -0,0 +1,15 @@
+CHECK_C_SOURCE_COMPILES("
+#ifndef __ARM64__
+#error Failed
+#endif
+ int main(int argc,const char *argv[]) { return 1; }"
+ HAVE_ARCH_ARM64)
+
+if (HAVE_ARCH_ARM64)
+rdma_provider(hns
+ hns_roce_u_buf.c
+ hns_roce_u.c
+ hns_roce_u_hw_v1.c
+ hns_roce_u_verbs.c
+)
+endif()
--
1.9.1
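The CMake probe in this patch depends on the compiler predefining an architecture macro. As a hedged sketch of the same idea in plain C: GCC and Clang predefine __aarch64__ when targeting 64-bit ARM. Whether a given toolchain also predefines __ARM64__ is something to verify, for example by dumping the predefined macros with the compiler's -dM -E options. The function name built_for_arm64() is made up for illustration.

```c
#include <assert.h>

/*
 * Compile-time architecture detection: the preprocessor selects one
 * branch, so the function body is a constant on any given target.
 */
static int built_for_arm64(void)
{
#if defined(__aarch64__)
	return 1;
#else
	return 0;
#endif
}

/* Small self-check: the result is always one of the two constants. */
static void built_for_arm64_demo(void)
{
	int r = built_for_arm64();

	assert(r == 0 || r == 1);
}
```

Because CHECK_C_SOURCE_COMPILES only compiles the probe and never runs it, the value returned from the probe's main() does not affect the result.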
^ permalink raw reply related
* [PATCH rdma-core 6/7] libhns: Add verbs of post_send and post_recv support
From: Lijun Ou @ 2016-10-26 13:04 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1477487048-62256-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch mainly introduces the verbs for posting send
and posting recv.
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
providers/hns/hns_roce_u.c | 2 +
providers/hns/hns_roce_u.h | 8 +
providers/hns/hns_roce_u_hw_v1.c | 314 +++++++++++++++++++++++++++++++++++++++
providers/hns/hns_roce_u_hw_v1.h | 79 ++++++++++
4 files changed, 403 insertions(+)
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index 30f8678..bceed84 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -131,6 +131,8 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
context->ibv_ctx.ops.query_qp = hns_roce_u_query_qp;
context->ibv_ctx.ops.modify_qp = hr_dev->u_hw->modify_qp;
context->ibv_ctx.ops.destroy_qp = hr_dev->u_hw->destroy_qp;
+ context->ibv_ctx.ops.post_send = hr_dev->u_hw->post_send;
+ context->ibv_ctx.ops.post_recv = hr_dev->u_hw->post_recv;
if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
goto tptr_free;
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index d412e87..c77fba0 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -51,6 +51,10 @@
#define PFX "hns: "
+#ifndef likely
+#define likely(x) __builtin_expect(!!(x), 1)
+#endif
+
#ifndef min
#define min(a, b) \
({ typeof (a) _a = (a); \
@@ -178,6 +182,10 @@ struct hns_roce_qp {
struct hns_roce_u_hw {
int (*poll_cq)(struct ibv_cq *ibvcq, int ne, struct ibv_wc *wc);
int (*arm_cq)(struct ibv_cq *ibvcq, int solicited);
+ int (*post_send)(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+ struct ibv_send_wr **bad_wr);
+ int (*post_recv)(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
+ struct ibv_recv_wr **bad_wr);
int (*modify_qp)(struct ibv_qp *qp, struct ibv_qp_attr *attr,
int attr_mask);
int (*destroy_qp)(struct ibv_qp *ibqp);
diff --git a/providers/hns/hns_roce_u_hw_v1.c b/providers/hns/hns_roce_u_hw_v1.c
index fb81634..a3aad1c 100644
--- a/providers/hns/hns_roce_u_hw_v1.c
+++ b/providers/hns/hns_roce_u_hw_v1.c
@@ -37,6 +37,59 @@
#include "hns_roce_u_hw_v1.h"
#include "hns_roce_u.h"
+static inline void set_raddr_seg(struct hns_roce_wqe_raddr_seg *rseg,
+ uint64_t remote_addr, uint32_t rkey)
+{
+ rseg->raddr = remote_addr;
+ rseg->rkey = rkey;
+ rseg->len = 0;
+}
+
+static void set_data_seg(struct hns_roce_wqe_data_seg *dseg, struct ibv_sge *sg)
+{
+
+ dseg->lkey = sg->lkey;
+ dseg->addr = sg->addr;
+ dseg->len = sg->length;
+}
+
+static void hns_roce_update_rq_head(struct hns_roce_context *ctx,
+ unsigned int qpn, unsigned int rq_head)
+{
+ struct hns_roce_rq_db rq_db;
+
+ rq_db.u32_4 = 0;
+ rq_db.u32_8 = 0;
+
+ roce_set_field(rq_db.u32_4, RQ_DB_U32_4_RQ_HEAD_M,
+ RQ_DB_U32_4_RQ_HEAD_S, rq_head);
+ roce_set_field(rq_db.u32_8, RQ_DB_U32_8_QPN_M, RQ_DB_U32_8_QPN_S, qpn);
+ roce_set_field(rq_db.u32_8, RQ_DB_U32_8_CMD_M, RQ_DB_U32_8_CMD_S, 1);
+ roce_set_bit(rq_db.u32_8, RQ_DB_U32_8_HW_SYNC_S, 1);
+
+ hns_roce_write64((uint32_t *)&rq_db, ctx, ROCEE_DB_OTHERS_L_0_REG);
+}
+
+static void hns_roce_update_sq_head(struct hns_roce_context *ctx,
+ unsigned int qpn, unsigned int port,
+ unsigned int sl, unsigned int sq_head)
+{
+ struct hns_roce_sq_db sq_db;
+
+ sq_db.u32_4 = 0;
+ sq_db.u32_8 = 0;
+
+ roce_set_field(sq_db.u32_4, SQ_DB_U32_4_SQ_HEAD_M,
+ SQ_DB_U32_4_SQ_HEAD_S, sq_head);
+ roce_set_field(sq_db.u32_4, SQ_DB_U32_4_PORT_M, SQ_DB_U32_4_PORT_S,
+ port);
+ roce_set_field(sq_db.u32_4, SQ_DB_U32_4_SL_M, SQ_DB_U32_4_SL_S, sl);
+ roce_set_field(sq_db.u32_8, SQ_DB_U32_8_QPN_M, SQ_DB_U32_8_QPN_S, qpn);
+ roce_set_bit(sq_db.u32_8, SQ_DB_U32_8_HW_SYNC, 1);
+
+ hns_roce_write64((uint32_t *)&sq_db, ctx, ROCEE_DB_SQ_L_0_REG);
+}
+
static void hns_roce_update_cq_cons_index(struct hns_roce_context *ctx,
struct hns_roce_cq *cq)
{
@@ -126,6 +179,16 @@ static struct hns_roce_cqe *next_cqe_sw(struct hns_roce_cq *cq)
return get_sw_cqe(cq, cq->cons_index);
}
+static void *get_recv_wqe(struct hns_roce_qp *qp, int n)
+{
+ if ((n < 0) || (n > qp->rq.wqe_cnt)) {
+ printf("rq wqe index:%d,rq wqe cnt:%d\r\n", n, qp->rq.wqe_cnt);
+ return NULL;
+ }
+
+ return qp->buf.buf + qp->rq.offset + (n << qp->rq.wqe_shift);
+}
+
static void *get_send_wqe(struct hns_roce_qp *qp, int n)
{
if ((n < 0) || (n > qp->sq.wqe_cnt)) {
@@ -137,6 +200,26 @@ static void *get_send_wqe(struct hns_roce_qp *qp, int n)
(n << qp->sq.wqe_shift));
}
+static int hns_roce_wq_overflow(struct hns_roce_wq *wq, int nreq,
+ struct hns_roce_cq *cq)
+{
+ unsigned int cur;
+
+ cur = wq->head - wq->tail;
+ if (cur + nreq < wq->max_post)
+ return 0;
+
+ /* While the num of wqe exceeds cap of the device, cq will be locked */
+ pthread_spin_lock(&cq->lock);
+ cur = wq->head - wq->tail;
+ pthread_spin_unlock(&cq->lock);
+
+ printf("wq:(head = %d, tail = %d, max_post = %d), nreq = 0x%x\n",
+ wq->head, wq->tail, wq->max_post, nreq);
+
+ return cur + nreq >= wq->max_post;
+}
+
static struct hns_roce_qp *hns_roce_find_qp(struct hns_roce_context *ctx,
uint32_t qpn)
{
@@ -374,6 +457,144 @@ static int hns_roce_u_v1_arm_cq(struct ibv_cq *ibvcq, int solicited)
return 0;
}
+static int hns_roce_u_v1_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr,
+ struct ibv_send_wr **bad_wr)
+{
+ unsigned int ind;
+ void *wqe;
+ int nreq;
+ int ps_opcode, i;
+ int ret = 0;
+ struct hns_roce_wqe_ctrl_seg *ctrl = NULL;
+ struct hns_roce_wqe_data_seg *dseg = NULL;
+ struct hns_roce_qp *qp = to_hr_qp(ibvqp);
+ struct hns_roce_context *ctx = to_hr_ctx(ibvqp->context);
+
+ pthread_spin_lock(&qp->sq.lock);
+
+ /* check that state is OK to post send */
+ ind = qp->sq.head;
+
+ for (nreq = 0; wr; ++nreq, wr = wr->next) {
+ if (hns_roce_wq_overflow(&qp->sq, nreq,
+ to_hr_cq(qp->ibv_qp.send_cq))) {
+ ret = -1;
+ *bad_wr = wr;
+ goto out;
+ }
+ if (wr->num_sge > qp->sq.max_gs) {
+ ret = -1;
+ *bad_wr = wr;
+ printf("wr->num_sge(<=%d) = %d, check failed!\r\n",
+ qp->sq.max_gs, wr->num_sge);
+ goto out;
+ }
+
+ ctrl = wqe = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1));
+ memset(ctrl, 0, sizeof(struct hns_roce_wqe_ctrl_seg));
+
+ qp->sq.wrid[ind & (qp->sq.wqe_cnt - 1)] = wr->wr_id;
+ for (i = 0; i < wr->num_sge; i++)
+ ctrl->msg_length += wr->sg_list[i].length;
+
+
+ ctrl->flag |= ((wr->send_flags & IBV_SEND_SIGNALED) ?
+ HNS_ROCE_WQE_CQ_NOTIFY : 0) |
+ (wr->send_flags & IBV_SEND_SOLICITED ?
+ HNS_ROCE_WQE_SE : 0) |
+ ((wr->opcode == IBV_WR_SEND_WITH_IMM ||
+ wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM) ?
+ HNS_ROCE_WQE_IMM : 0) |
+ (wr->send_flags & IBV_SEND_FENCE ?
+ HNS_ROCE_WQE_FENCE : 0);
+
+ if (wr->opcode == IBV_WR_SEND_WITH_IMM ||
+ wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM)
+ ctrl->imm_data = wr->imm_data;
+
+ wqe += sizeof(struct hns_roce_wqe_ctrl_seg);
+
+ /* set remote addr segment */
+ switch (ibvqp->qp_type) {
+ case IBV_QPT_RC:
+ switch (wr->opcode) {
+ case IBV_WR_RDMA_READ:
+ ps_opcode = HNS_ROCE_WQE_OPCODE_RDMA_READ;
+ set_raddr_seg(wqe, wr->wr.rdma.remote_addr,
+ wr->wr.rdma.rkey);
+ break;
+ case IBV_WR_RDMA_WRITE:
+ case IBV_WR_RDMA_WRITE_WITH_IMM:
+ ps_opcode = HNS_ROCE_WQE_OPCODE_RDMA_WRITE;
+ set_raddr_seg(wqe, wr->wr.rdma.remote_addr,
+ wr->wr.rdma.rkey);
+ break;
+ case IBV_WR_SEND:
+ case IBV_WR_SEND_WITH_IMM:
+ ps_opcode = HNS_ROCE_WQE_OPCODE_SEND;
+ break;
+ case IBV_WR_ATOMIC_CMP_AND_SWP:
+ case IBV_WR_ATOMIC_FETCH_AND_ADD:
+ default:
+ ps_opcode = HNS_ROCE_WQE_OPCODE_MASK;
+ break;
+ }
+ ctrl->flag |= (ps_opcode);
+ wqe += sizeof(struct hns_roce_wqe_raddr_seg);
+ break;
+ case IBV_QPT_UC:
+ case IBV_QPT_UD:
+ default:
+ break;
+ }
+
+ dseg = wqe;
+
+ /* Inline */
+ if (wr->send_flags & IBV_SEND_INLINE && wr->num_sge) {
+ if (ctrl->msg_length > qp->max_inline_data) {
+ ret = -1;
+ *bad_wr = wr;
+ printf("inline data len(1-32)=%d, send_flags = 0x%x, check failed!\r\n",
+ ctrl->msg_length, wr->send_flags);
+ goto out;
+ }
+
+ for (i = 0; i < wr->num_sge; i++) {
+ memcpy(wqe,
+ ((void *) (uintptr_t) wr->sg_list[i].addr),
+ wr->sg_list[i].length);
+ wqe = wqe + wr->sg_list[i].length;
+ }
+
+ ctrl->flag |= HNS_ROCE_WQE_INLINE;
+ } else {
+ /* set sge */
+ for (i = 0; i < wr->num_sge; i++)
+ set_data_seg(dseg+i, wr->sg_list + i);
+
+ ctrl->flag |= wr->num_sge << HNS_ROCE_WQE_SGE_NUM_BIT;
+ }
+
+ ind++;
+ }
+
+out:
+ /* Set DB return */
+ if (likely(nreq)) {
+ qp->sq.head += nreq;
+ wmb();
+
+ hns_roce_update_sq_head(ctx, qp->ibv_qp.qp_num,
+ qp->port_num - 1, qp->sl,
+ qp->sq.head & ((qp->sq.wqe_cnt << 1) - 1));
+ }
+
+ pthread_spin_unlock(&qp->sq.lock);
+
+ return ret;
+}
+
static void __hns_roce_v1_cq_clean(struct hns_roce_cq *cq, uint32_t qpn,
struct hns_roce_srq *srq)
{
@@ -517,9 +738,102 @@ static int hns_roce_u_v1_destroy_qp(struct ibv_qp *ibqp)
return ret;
}
+static int hns_roce_u_v1_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr,
+ struct ibv_recv_wr **bad_wr)
+{
+ int ret = 0;
+ int nreq;
+ int ind;
+ struct ibv_sge *sg;
+ struct hns_roce_rc_rq_wqe *rq_wqe;
+ struct hns_roce_qp *qp = to_hr_qp(ibvqp);
+ struct hns_roce_context *ctx = to_hr_ctx(ibvqp->context);
+
+ pthread_spin_lock(&qp->rq.lock);
+
+ /* check that state is OK to post receive */
+ ind = qp->rq.head & (qp->rq.wqe_cnt - 1);
+
+ for (nreq = 0; wr; ++nreq, wr = wr->next) {
+ if (hns_roce_wq_overflow(&qp->rq, nreq,
+ to_hr_cq(qp->ibv_qp.recv_cq))) {
+ ret = -1;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ if (wr->num_sge > qp->rq.max_gs) {
+ ret = -1;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ rq_wqe = get_recv_wqe(qp, ind);
+ if (wr->num_sge > HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM) {
+ ret = -1;
+ *bad_wr = wr;
+ goto out;
+ }
+
+ if (wr->num_sge == HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM) {
+ roce_set_field(rq_wqe->u32_2,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_M,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_S,
+ HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM);
+ sg = wr->sg_list;
+
+ rq_wqe->va0 = (sg->addr);
+ rq_wqe->l_key0 = (sg->lkey);
+ rq_wqe->length0 = (sg->length);
+
+ sg = wr->sg_list + 1;
+
+ rq_wqe->va1 = (sg->addr);
+ rq_wqe->l_key1 = (sg->lkey);
+ rq_wqe->length1 = (sg->length);
+ } else if (wr->num_sge == HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM - 1) {
+ roce_set_field(rq_wqe->u32_2,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_M,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_S,
+ HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM - 1);
+ sg = wr->sg_list;
+
+ rq_wqe->va0 = (sg->addr);
+ rq_wqe->l_key0 = (sg->lkey);
+ rq_wqe->length0 = (sg->length);
+
+ } else if (wr->num_sge == HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM - 2) {
+ roce_set_field(rq_wqe->u32_2,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_M,
+ RC_RQ_WQE_NUMBER_OF_DATA_SEG_S,
+ HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM - 2);
+ }
+
+ qp->rq.wrid[ind] = wr->wr_id;
+
+ ind = (ind + 1) & (qp->rq.wqe_cnt - 1);
+ }
+
+out:
+ if (nreq) {
+ qp->rq.head += nreq;
+
+ wmb();
+
+ hns_roce_update_rq_head(ctx, qp->ibv_qp.qp_num,
+ qp->rq.head & ((qp->rq.wqe_cnt << 1) - 1));
+ }
+
+ pthread_spin_unlock(&qp->rq.lock);
+
+ return ret;
+}
+
struct hns_roce_u_hw hns_roce_u_hw_v1 = {
.poll_cq = hns_roce_u_v1_poll_cq,
.arm_cq = hns_roce_u_v1_arm_cq,
+ .post_send = hns_roce_u_v1_post_send,
+ .post_recv = hns_roce_u_v1_post_recv,
.modify_qp = hns_roce_u_v1_modify_qp,
.destroy_qp = hns_roce_u_v1_destroy_qp,
};
diff --git a/providers/hns/hns_roce_u_hw_v1.h b/providers/hns/hns_roce_u_hw_v1.h
index b249f54..128c66f 100644
--- a/providers/hns/hns_roce_u_hw_v1.h
+++ b/providers/hns/hns_roce_u_hw_v1.h
@@ -39,9 +39,15 @@
#define HNS_ROCE_CQE_IS_SQ 0
#define HNS_ROCE_RC_WQE_INLINE_DATA_MAX_LEN 32
+#define HNS_ROCE_RC_RQ_WQE_MAX_SGE_NUM 2
enum {
+ HNS_ROCE_WQE_INLINE = 1 << 31,
+ HNS_ROCE_WQE_SE = 1 << 30,
+ HNS_ROCE_WQE_SGE_NUM_BIT = 24,
HNS_ROCE_WQE_IMM = 1 << 23,
+ HNS_ROCE_WQE_FENCE = 1 << 21,
+ HNS_ROCE_WQE_CQ_NOTIFY = 1 << 20,
HNS_ROCE_WQE_OPCODE_SEND = 0 << 16,
HNS_ROCE_WQE_OPCODE_RDMA_READ = 1 << 16,
HNS_ROCE_WQE_OPCODE_RDMA_WRITE = 2 << 16,
@@ -52,6 +58,20 @@ enum {
struct hns_roce_wqe_ctrl_seg {
__be32 sgl_pa_h;
__be32 flag;
+ __be32 imm_data;
+ __be32 msg_length;
+};
+
+struct hns_roce_wqe_data_seg {
+ __be64 addr;
+ __be32 lkey;
+ __be32 len;
+};
+
+struct hns_roce_wqe_raddr_seg {
+ __be32 rkey;
+ __be32 len;
+ __be64 raddr;
};
enum {
@@ -102,6 +122,43 @@ struct hns_roce_cq_db {
#define CQ_DB_U32_8_HW_SYNC_S 31
+struct hns_roce_rq_db {
+ unsigned int u32_4;
+ unsigned int u32_8;
+};
+
+#define RQ_DB_U32_4_RQ_HEAD_S 0
+#define RQ_DB_U32_4_RQ_HEAD_M (((1UL << 15) - 1) << RQ_DB_U32_4_RQ_HEAD_S)
+
+#define RQ_DB_U32_8_QPN_S 0
+#define RQ_DB_U32_8_QPN_M (((1UL << 24) - 1) << RQ_DB_U32_8_QPN_S)
+
+#define RQ_DB_U32_8_CMD_S 28
+#define RQ_DB_U32_8_CMD_M (((1UL << 3) - 1) << RQ_DB_U32_8_CMD_S)
+
+#define RQ_DB_U32_8_HW_SYNC_S 31
+
+struct hns_roce_sq_db {
+ unsigned int u32_4;
+ unsigned int u32_8;
+};
+
+#define SQ_DB_U32_4_SQ_HEAD_S 0
+#define SQ_DB_U32_4_SQ_HEAD_M (((1UL << 15) - 1) << SQ_DB_U32_4_SQ_HEAD_S)
+
+#define SQ_DB_U32_4_SL_S 16
+#define SQ_DB_U32_4_SL_M (((1UL << 2) - 1) << SQ_DB_U32_4_SL_S)
+
+#define SQ_DB_U32_4_PORT_S 18
+#define SQ_DB_U32_4_PORT_M (((1UL << 3) - 1) << SQ_DB_U32_4_PORT_S)
+
+#define SQ_DB_U32_4_DIRECT_WQE_S 31
+
+#define SQ_DB_U32_8_QPN_S 0
+#define SQ_DB_U32_8_QPN_M (((1UL << 24) - 1) << SQ_DB_U32_8_QPN_S)
+
+#define SQ_DB_U32_8_HW_SYNC 31
+
struct hns_roce_cqe {
unsigned int cqe_byte_4;
union {
@@ -160,4 +217,26 @@ struct hns_roce_rc_send_wqe {
unsigned int length1;
};
+struct hns_roce_rc_rq_wqe {
+ unsigned int u32_0;
+ unsigned int sgl_ba_31_0;
+ unsigned int u32_2;
+ unsigned int rvd_5;
+ unsigned int rvd_6;
+ unsigned int rvd_7;
+ unsigned int rvd_8;
+ unsigned int rvd_9;
+
+ uint64_t va0;
+ unsigned int l_key0;
+ unsigned int length0;
+
+ uint64_t va1;
+ unsigned int l_key1;
+ unsigned int length1;
+};
+#define RC_RQ_WQE_NUMBER_OF_DATA_SEG_S 16
+#define RC_RQ_WQE_NUMBER_OF_DATA_SEG_M \
+ (((1UL << 6) - 1) << RC_RQ_WQE_NUMBER_OF_DATA_SEG_S)
+
#endif /* _HNS_ROCE_U_HW_V1_H */
--
1.9.1
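The overflow check and slot indexing in this patch rely on free-running unsigned head/tail counters and a power-of-two queue depth. A minimal standalone sketch of that arithmetic follows. struct ring and the ring_* helpers are hypothetical names, and the locking and doorbell writes from the patch are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical work-queue bookkeeping, loosely modelled on the patch. */
struct ring {
	uint32_t head;     /* free-running producer counter */
	uint32_t tail;     /* free-running consumer counter */
	uint32_t max_post; /* number of entries that may be outstanding */
	uint32_t wqe_cnt;  /* number of slots, must be a power of two */
};

/*
 * Unsigned subtraction is defined modulo 2^32, so head - tail yields the
 * number of outstanding entries even after head has wrapped past zero.
 */
static int ring_would_overflow(const struct ring *r, uint32_t nreq)
{
	uint32_t used = r->head - r->tail;

	return used + nreq >= r->max_post;
}

/* With a power-of-two depth, masking maps a counter to its slot. */
static uint32_t ring_slot(const struct ring *r, uint32_t counter)
{
	return counter & (r->wqe_cnt - 1);
}

/* Small self-check, including a wrapped head counter. */
static void ring_demo(void)
{
	struct ring r = {
		.head = 2, .tail = 0xfffffffeu, .max_post = 8, .wqe_cnt = 8,
	};

	/* 2 - 0xfffffffe wraps to 4 outstanding entries. */
	assert(!ring_would_overflow(&r, 3));	/* 4 + 3 < 8 */
	assert(ring_would_overflow(&r, 4));	/* 4 + 4 >= 8 */
	assert(ring_slot(&r, 10) == 2);		/* 10 & 7 */
}
```

Keeping the counters free-running and masking only at the point of use is what lets the same subtraction serve as an occupancy count.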
^ permalink raw reply related
* [PATCH rdma-core 5/7] libhns: Add verbs of qp support
From: Lijun Ou @ 2016-10-26 13:04 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1477487048-62256-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch mainly introduces the related qp verbs for the userspace
library of hns, including:
1. create_qp
2. query_qp
3. modify_qp
4. destroy_qp
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
providers/hns/hns_roce_u.c | 5 +
providers/hns/hns_roce_u.h | 52 ++++++++
providers/hns/hns_roce_u_abi.h | 8 ++
providers/hns/hns_roce_u_hw_v1.c | 155 +++++++++++++++++++++++
providers/hns/hns_roce_u_verbs.c | 257 +++++++++++++++++++++++++++++++++++++++
5 files changed, 477 insertions(+)
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index e435bea..30f8678 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -127,6 +127,11 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
context->ibv_ctx.ops.cq_event = hns_roce_u_cq_event;
context->ibv_ctx.ops.destroy_cq = hns_roce_u_destroy_cq;
+ context->ibv_ctx.ops.create_qp = hns_roce_u_create_qp;
+ context->ibv_ctx.ops.query_qp = hns_roce_u_query_qp;
+ context->ibv_ctx.ops.modify_qp = hr_dev->u_hw->modify_qp;
+ context->ibv_ctx.ops.destroy_qp = hr_dev->u_hw->destroy_qp;
+
if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
goto tptr_free;
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index a56ca3c..d412e87 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -44,12 +44,20 @@
#define HNS_ROCE_MAX_CQ_NUM 0x10000
#define HNS_ROCE_MIN_CQE_NUM 0x40
+#define HNS_ROCE_MIN_WQE_NUM 0x20
#define HNS_ROCE_CQ_DB_BUF_SIZE ((HNS_ROCE_MAX_CQ_NUM >> 11) << 12)
#define HNS_ROCE_TPTR_OFFSET 0x1000
#define HNS_ROCE_HW_VER1 ('h' << 24 | 'i' << 16 | '0' << 8 | '6')
#define PFX "hns: "
+#ifndef min
+#define min(a, b) \
+ ({ typeof (a) _a = (a); \
+ typeof (b) _b = (b); \
+ _a < _b ? _a : _b; })
+#endif
+
#define roce_get_field(origin, mask, shift) \
(((origin) & (mask)) >> (shift))
@@ -128,10 +136,29 @@ struct hns_roce_cq {
int arm_sn;
};
+struct hns_roce_srq {
+ struct ibv_srq ibv_srq;
+ struct hns_roce_buf buf;
+ pthread_spinlock_t lock;
+ unsigned long *wrid;
+ unsigned int srqn;
+ int max;
+ int max_gs;
+ int wqe_shift;
+ int head;
+ int tail;
+ unsigned int *db;
+ unsigned short counter;
+};
+
struct hns_roce_wq {
unsigned long *wrid;
+ pthread_spinlock_t lock;
int wqe_cnt;
+ int max_post;
+ unsigned int head;
unsigned int tail;
+ int max_gs;
int wqe_shift;
int offset;
};
@@ -139,14 +166,21 @@ struct hns_roce_wq {
struct hns_roce_qp {
struct ibv_qp ibv_qp;
struct hns_roce_buf buf;
+ int max_inline_data;
+ int buf_size;
unsigned int sq_signal_bits;
struct hns_roce_wq sq;
struct hns_roce_wq rq;
+ int port_num;
+ int sl;
};
struct hns_roce_u_hw {
int (*poll_cq)(struct ibv_cq *ibvcq, int ne, struct ibv_wc *wc);
int (*arm_cq)(struct ibv_cq *ibvcq, int solicited);
+ int (*modify_qp)(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ int attr_mask);
+ int (*destroy_qp)(struct ibv_qp *ibqp);
};
static inline unsigned long align(unsigned long val, unsigned long align)
@@ -174,6 +208,16 @@ static inline struct hns_roce_cq *to_hr_cq(struct ibv_cq *ibv_cq)
return container_of(ibv_cq, struct hns_roce_cq, ibv_cq);
}
+static inline struct hns_roce_srq *to_hr_srq(struct ibv_srq *ibv_srq)
+{
+ return container_of(ibv_srq, struct hns_roce_srq, ibv_srq);
+}
+
+static inline struct hns_roce_qp *to_hr_qp(struct ibv_qp *ibv_qp)
+{
+ return container_of(ibv_qp, struct hns_roce_qp, ibv_qp);
+}
+
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr);
int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
@@ -193,10 +237,18 @@ struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
int hns_roce_u_destroy_cq(struct ibv_cq *cq);
void hns_roce_u_cq_event(struct ibv_cq *cq);
+struct ibv_qp *hns_roce_u_create_qp(struct ibv_pd *pd,
+ struct ibv_qp_init_attr *attr);
+
+int hns_roce_u_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr,
+ int attr_mask, struct ibv_qp_init_attr *init_attr);
+
int hns_roce_alloc_buf(struct hns_roce_buf *buf, unsigned int size,
int page_size);
void hns_roce_free_buf(struct hns_roce_buf *buf);
+void hns_roce_init_qp_indices(struct hns_roce_qp *qp);
+
extern struct hns_roce_u_hw hns_roce_u_hw_v1;
#endif /* _HNS_ROCE_U_H */
diff --git a/providers/hns/hns_roce_u_abi.h b/providers/hns/hns_roce_u_abi.h
index 1e62a7e..e78f967 100644
--- a/providers/hns/hns_roce_u_abi.h
+++ b/providers/hns/hns_roce_u_abi.h
@@ -58,4 +58,12 @@ struct hns_roce_create_cq_resp {
__u32 reserved;
};
+struct hns_roce_create_qp {
+ struct ibv_create_qp ibv_cmd;
+ __u64 buf_addr;
+ __u8 log_sq_bb_count;
+ __u8 log_sq_stride;
+ __u8 reserved[5];
+};
+
#endif /* _HNS_ROCE_U_ABI_H */
diff --git a/providers/hns/hns_roce_u_hw_v1.c b/providers/hns/hns_roce_u_hw_v1.c
index 2676021..fb81634 100644
--- a/providers/hns/hns_roce_u_hw_v1.c
+++ b/providers/hns/hns_roce_u_hw_v1.c
@@ -150,6 +150,16 @@ static struct hns_roce_qp *hns_roce_find_qp(struct hns_roce_context *ctx,
}
}
+static void hns_roce_clear_qp(struct hns_roce_context *ctx, uint32_t qpn)
+{
+ int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift;
+
+ if (!--ctx->qp_table[tind].refcnt)
+ free(ctx->qp_table[tind].table);
+ else
+ ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = NULL;
+}
+
static int hns_roce_v1_poll_one(struct hns_roce_cq *cq,
struct hns_roce_qp **cur_qp, struct ibv_wc *wc)
{
@@ -364,7 +374,152 @@ static int hns_roce_u_v1_arm_cq(struct ibv_cq *ibvcq, int solicited)
return 0;
}
+static void __hns_roce_v1_cq_clean(struct hns_roce_cq *cq, uint32_t qpn,
+ struct hns_roce_srq *srq)
+{
+ int nfreed = 0;
+ uint32_t prod_index;
+ uint8_t owner_bit = 0;
+ struct hns_roce_cqe *cqe, *dest;
+ struct hns_roce_context *ctx = to_hr_ctx(cq->ibv_cq.context);
+
+ for (prod_index = cq->cons_index; get_sw_cqe(cq, prod_index);
+ ++prod_index)
+ if (prod_index == cq->cons_index + cq->ibv_cq.cqe)
+ break;
+
+ while ((int) --prod_index - (int) cq->cons_index >= 0) {
+ cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe);
+ if ((roce_get_field(cqe->cqe_byte_16, CQE_BYTE_16_LOCAL_QPN_M,
+ CQE_BYTE_16_LOCAL_QPN_S) & 0xffffff) == qpn) {
+ ++nfreed;
+ } else if (nfreed) {
+ dest = get_cqe(cq,
+ (prod_index + nfreed) & cq->ibv_cq.cqe);
+ owner_bit = roce_get_bit(dest->cqe_byte_4,
+ CQE_BYTE_4_OWNER_S);
+ memcpy(dest, cqe, sizeof(*cqe));
+ roce_set_bit(dest->cqe_byte_4, CQE_BYTE_4_OWNER_S,
+ owner_bit);
+ }
+ }
+
+ if (nfreed) {
+ cq->cons_index += nfreed;
+ wmb();
+ hns_roce_update_cq_cons_index(ctx, cq);
+ }
+}
+
+static void hns_roce_v1_cq_clean(struct hns_roce_cq *cq, unsigned int qpn,
+ struct hns_roce_srq *srq)
+{
+ pthread_spin_lock(&cq->lock);
+ __hns_roce_v1_cq_clean(cq, qpn, srq);
+ pthread_spin_unlock(&cq->lock);
+}
+
+static int hns_roce_u_v1_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
+ int attr_mask)
+{
+ int ret;
+ struct ibv_modify_qp cmd;
+ struct hns_roce_qp *hr_qp = to_hr_qp(qp);
+
+ ret = ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof(cmd));
+
+ if (!ret && (attr_mask & IBV_QP_STATE) &&
+ attr->qp_state == IBV_QPS_RESET) {
+ hns_roce_v1_cq_clean(to_hr_cq(qp->recv_cq), qp->qp_num,
+ qp->srq ? to_hr_srq(qp->srq) : NULL);
+ if (qp->send_cq != qp->recv_cq)
+ hns_roce_v1_cq_clean(to_hr_cq(qp->send_cq), qp->qp_num,
+ NULL);
+
+ hns_roce_init_qp_indices(to_hr_qp(qp));
+ }
+
+ if (!ret && (attr_mask & IBV_QP_PORT)) {
+ hr_qp->port_num = attr->port_num;
+ printf("hr_qp->port_num= 0x%x\n", hr_qp->port_num);
+ }
+
+ hr_qp->sl = attr->ah_attr.sl;
+
+ return ret;
+}
+
+static void hns_roce_lock_cqs(struct ibv_qp *qp)
+{
+ struct hns_roce_cq *send_cq = to_hr_cq(qp->send_cq);
+ struct hns_roce_cq *recv_cq = to_hr_cq(qp->recv_cq);
+
+ if (send_cq == recv_cq) {
+ pthread_spin_lock(&send_cq->lock);
+ } else if (send_cq->cqn < recv_cq->cqn) {
+ pthread_spin_lock(&send_cq->lock);
+ pthread_spin_lock(&recv_cq->lock);
+ } else {
+ pthread_spin_lock(&recv_cq->lock);
+ pthread_spin_lock(&send_cq->lock);
+ }
+}
+
+static void hns_roce_unlock_cqs(struct ibv_qp *qp)
+{
+ struct hns_roce_cq *send_cq = to_hr_cq(qp->send_cq);
+ struct hns_roce_cq *recv_cq = to_hr_cq(qp->recv_cq);
+
+ if (send_cq == recv_cq) {
+ pthread_spin_unlock(&send_cq->lock);
+ } else if (send_cq->cqn < recv_cq->cqn) {
+ pthread_spin_unlock(&recv_cq->lock);
+ pthread_spin_unlock(&send_cq->lock);
+ } else {
+ pthread_spin_unlock(&send_cq->lock);
+ pthread_spin_unlock(&recv_cq->lock);
+ }
+}
+
+static int hns_roce_u_v1_destroy_qp(struct ibv_qp *ibqp)
+{
+ int ret;
+ struct hns_roce_qp *qp = to_hr_qp(ibqp);
+
+ pthread_mutex_lock(&to_hr_ctx(ibqp->context)->qp_table_mutex);
+ ret = ibv_cmd_destroy_qp(ibqp);
+ if (ret) {
+ pthread_mutex_unlock(&to_hr_ctx(ibqp->context)->qp_table_mutex);
+ return ret;
+ }
+
+ hns_roce_lock_cqs(ibqp);
+
+ __hns_roce_v1_cq_clean(to_hr_cq(ibqp->recv_cq), ibqp->qp_num,
+ ibqp->srq ? to_hr_srq(ibqp->srq) : NULL);
+
+ if (ibqp->send_cq != ibqp->recv_cq)
+ __hns_roce_v1_cq_clean(to_hr_cq(ibqp->send_cq), ibqp->qp_num,
+ NULL);
+
+ hns_roce_clear_qp(to_hr_ctx(ibqp->context), ibqp->qp_num);
+
+ hns_roce_unlock_cqs(ibqp);
+ pthread_mutex_unlock(&to_hr_ctx(ibqp->context)->qp_table_mutex);
+
+ free(qp->sq.wrid);
+ if (qp->rq.wqe_cnt)
+ free(qp->rq.wrid);
+
+ hns_roce_free_buf(&qp->buf);
+ free(qp);
+
+ return ret;
+}
+
struct hns_roce_u_hw hns_roce_u_hw_v1 = {
.poll_cq = hns_roce_u_v1_poll_cq,
.arm_cq = hns_roce_u_v1_arm_cq,
+ .modify_qp = hns_roce_u_v1_modify_qp,
+ .destroy_qp = hns_roce_u_v1_destroy_qp,
};
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index 077cddc..2dbc851 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -43,6 +43,14 @@
#include "hns_roce_u_abi.h"
#include "hns_roce_u_hw_v1.h"
+void hns_roce_init_qp_indices(struct hns_roce_qp *qp)
+{
+ qp->sq.head = 0;
+ qp->sq.tail = 0;
+ qp->rq.head = 0;
+ qp->rq.tail = 0;
+}
+
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr)
{
@@ -163,6 +171,29 @@ static int align_cq_size(int req)
return nent;
}
+static int align_qp_size(int req)
+{
+ int nent;
+
+ for (nent = HNS_ROCE_MIN_WQE_NUM; nent < req; nent <<= 1)
+ ;
+
+ return nent;
+}
+
+static void hns_roce_set_sq_sizes(struct hns_roce_qp *qp,
+ struct ibv_qp_cap *cap, enum ibv_qp_type type)
+{
+ struct hns_roce_context *ctx = to_hr_ctx(qp->ibv_qp.context);
+
+ qp->sq.max_gs = 2;
+ cap->max_send_sge = min(ctx->max_sge, qp->sq.max_gs);
+ qp->sq.max_post = min(ctx->max_qp_wr, qp->sq.wqe_cnt);
+ cap->max_send_wr = qp->sq.max_post;
+ qp->max_inline_data = 32;
+ cap->max_inline_data = qp->max_inline_data;
+}
+
static int hns_roce_verify_cq(int *cqe, struct hns_roce_context *context)
{
if (*cqe < HNS_ROCE_MIN_CQE_NUM) {
@@ -189,6 +220,17 @@ static int hns_roce_alloc_cq_buf(struct hns_roce_device *dev,
return 0;
}
+static void hns_roce_calc_sq_wqe_size(struct ibv_qp_cap *cap,
+ enum ibv_qp_type type,
+ struct hns_roce_qp *qp)
+{
+ int size = sizeof(struct hns_roce_rc_send_wqe);
+
+ for (qp->sq.wqe_shift = 6; 1 << qp->sq.wqe_shift < size;
+ qp->sq.wqe_shift++)
+ ;
+}
+
struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
struct ibv_comp_channel *channel,
int comp_vector)
@@ -266,3 +308,218 @@ int hns_roce_u_destroy_cq(struct ibv_cq *cq)
return ret;
}
+
+static int hns_roce_verify_qp(struct ibv_qp_init_attr *attr,
+ struct hns_roce_context *context)
+{
+ if (attr->cap.max_send_wr < HNS_ROCE_MIN_WQE_NUM) {
+ fprintf(stderr,
+ "max_send_wr = %d, less than minimum WQE number.\n",
+ attr->cap.max_send_wr);
+ attr->cap.max_send_wr = HNS_ROCE_MIN_WQE_NUM;
+ }
+
+ if (attr->cap.max_recv_wr < HNS_ROCE_MIN_WQE_NUM) {
+ fprintf(stderr,
+ "max_recv_wr = %d, less than minimum WQE number.\n",
+ attr->cap.max_recv_wr);
+ attr->cap.max_recv_wr = HNS_ROCE_MIN_WQE_NUM;
+ }
+
+ if (attr->cap.max_recv_sge < 1)
+ attr->cap.max_recv_sge = 1;
+ if (attr->cap.max_send_wr > context->max_qp_wr ||
+ attr->cap.max_recv_wr > context->max_qp_wr ||
+ attr->cap.max_send_sge > context->max_sge ||
+ attr->cap.max_recv_sge > context->max_sge)
+ return -1;
+
+ if ((attr->qp_type != IBV_QPT_RC) && (attr->qp_type != IBV_QPT_UD))
+ return -1;
+
+ if ((attr->qp_type == IBV_QPT_RC) &&
+ (attr->cap.max_inline_data > HNS_ROCE_RC_WQE_INLINE_DATA_MAX_LEN))
+ return -1;
+
+ if (attr->qp_type == IBV_QPT_UC)
+ return -1;
+
+ return 0;
+}
+
+static int hns_roce_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap,
+ enum ibv_qp_type type, struct hns_roce_qp *qp)
+{
+ qp->sq.wrid =
+ (unsigned long *)malloc(qp->sq.wqe_cnt * sizeof(uint64_t));
+ if (!qp->sq.wrid)
+ return -1;
+
+ if (qp->rq.wqe_cnt) {
+ qp->rq.wrid = malloc(qp->rq.wqe_cnt * sizeof(uint64_t));
+ if (!qp->rq.wrid) {
+ free(qp->sq.wrid);
+ return -1;
+ }
+ }
+
+ for (qp->rq.wqe_shift = 4;
+ 1 << qp->rq.wqe_shift < sizeof(struct hns_roce_rc_send_wqe);
+ qp->rq.wqe_shift++)
+ ;
+
+ qp->buf_size = align((qp->sq.wqe_cnt << qp->sq.wqe_shift), 0x1000) +
+ (qp->rq.wqe_cnt << qp->rq.wqe_shift);
+
+ if (qp->rq.wqe_shift > qp->sq.wqe_shift) {
+ qp->rq.offset = 0;
+ qp->sq.offset = qp->rq.wqe_cnt << qp->rq.wqe_shift;
+ } else {
+ qp->rq.offset = align((qp->sq.wqe_cnt << qp->sq.wqe_shift),
+ 0x1000);
+ qp->sq.offset = 0;
+ }
+
+ if (hns_roce_alloc_buf(&qp->buf, align(qp->buf_size, 0x1000),
+ to_hr_dev(pd->context->device)->page_size)) {
+ free(qp->sq.wrid);
+ free(qp->rq.wrid);
+ return -1;
+ }
+
+ memset(qp->buf.buf, 0, qp->buf_size);
+
+ return 0;
+}
+
+static int hns_roce_store_qp(struct hns_roce_context *ctx, uint32_t qpn,
+ struct hns_roce_qp *qp)
+{
+ int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift;
+
+ if (!ctx->qp_table[tind].refcnt) {
+ ctx->qp_table[tind].table = calloc(ctx->qp_table_mask + 1,
+ sizeof(struct hns_roce_qp *));
+ if (!ctx->qp_table[tind].table)
+ return -1;
+ }
+
+ ++ctx->qp_table[tind].refcnt;
+ ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = qp;
+
+ return 0;
+}
+
+struct ibv_qp *hns_roce_u_create_qp(struct ibv_pd *pd,
+ struct ibv_qp_init_attr *attr)
+{
+ int ret;
+ struct hns_roce_qp *qp = NULL;
+ struct hns_roce_create_qp cmd;
+ struct ibv_create_qp_resp resp;
+ struct hns_roce_context *context = to_hr_ctx(pd->context);
+
+ if (hns_roce_verify_qp(attr, context)) {
+ fprintf(stderr, "hns_roce_verify_qp failed!\n");
+ return NULL;
+ }
+
+ qp = malloc(sizeof(*qp));
+ if (!qp) {
+ fprintf(stderr, "malloc failed!\n");
+ return NULL;
+ }
+
+ hns_roce_calc_sq_wqe_size(&attr->cap, attr->qp_type, qp);
+ qp->sq.wqe_cnt = align_qp_size(attr->cap.max_send_wr);
+ qp->rq.wqe_cnt = align_qp_size(attr->cap.max_recv_wr);
+
+ if (hns_roce_alloc_qp_buf(pd, &attr->cap, attr->qp_type, qp)) {
+ fprintf(stderr, "hns_roce_alloc_qp_buf failed!\n");
+ goto err;
+ }
+
+ hns_roce_init_qp_indices(qp);
+
+ if (pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE) ||
+ pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE)) {
+ fprintf(stderr, "pthread_spin_init failed!\n");
+ goto err_free;
+ }
+
+ cmd.buf_addr = (uintptr_t) qp->buf.buf;
+ cmd.log_sq_stride = qp->sq.wqe_shift;
+ for (cmd.log_sq_bb_count = 0; qp->sq.wqe_cnt > 1 << cmd.log_sq_bb_count;
+ ++cmd.log_sq_bb_count)
+ ;
+
+ memset(cmd.reserved, 0, sizeof(cmd.reserved));
+
+ pthread_mutex_lock(&to_hr_ctx(pd->context)->qp_table_mutex);
+
+ ret = ibv_cmd_create_qp(pd, &qp->ibv_qp, attr, &cmd.ibv_cmd,
+ sizeof(cmd), &resp, sizeof(resp));
+ if (ret) {
+ fprintf(stderr, "ibv_cmd_create_qp failed!\n");
+ goto err_rq_db;
+ }
+
+ ret = hns_roce_store_qp(to_hr_ctx(pd->context), qp->ibv_qp.qp_num, qp);
+ if (ret) {
+ fprintf(stderr, "hns_roce_store_qp failed!\n");
+ goto err_destroy;
+ }
+ pthread_mutex_unlock(&to_hr_ctx(pd->context)->qp_table_mutex);
+
+ qp->rq.wqe_cnt = attr->cap.max_recv_wr;
+ qp->rq.max_gs = attr->cap.max_recv_sge;
+
+ /* adjust rq maxima to not exceed reported device maxima */
+ attr->cap.max_recv_wr = min(context->max_qp_wr, attr->cap.max_recv_wr);
+ attr->cap.max_recv_sge = min(context->max_sge, attr->cap.max_recv_sge);
+
+ qp->rq.max_post = attr->cap.max_recv_wr;
+ hns_roce_set_sq_sizes(qp, &attr->cap, attr->qp_type);
+
+ qp->sq_signal_bits = attr->sq_sig_all ? 0 : 1;
+
+ return &qp->ibv_qp;
+
+err_destroy:
+ ibv_cmd_destroy_qp(&qp->ibv_qp);
+
+err_rq_db:
+ pthread_mutex_unlock(&to_hr_ctx(pd->context)->qp_table_mutex);
+
+err_free:
+ free(qp->sq.wrid);
+ if (qp->rq.wqe_cnt)
+ free(qp->rq.wrid);
+ hns_roce_free_buf(&qp->buf);
+
+err:
+ free(qp);
+
+ return NULL;
+}
+
+int hns_roce_u_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr,
+ int attr_mask, struct ibv_qp_init_attr *init_attr)
+{
+ int ret;
+ struct ibv_query_qp cmd;
+ struct hns_roce_qp *qp = to_hr_qp(ibqp);
+
+ ret = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd,
+ sizeof(cmd));
+ if (ret)
+ return ret;
+
+ init_attr->cap.max_send_wr = qp->sq.max_post;
+ init_attr->cap.max_send_sge = qp->sq.max_gs;
+ init_attr->cap.max_inline_data = qp->max_inline_data;
+
+ attr->cap = init_attr->cap;
+
+ return ret;
+}
--
1.9.1
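
The queue sizing done in hns_roce_u_create_qp() above comes down to two
small loops: align_qp_size() rounds the requested depth up to a power of
two (never below HNS_ROCE_MIN_WQE_NUM), and the log_sq_bb_count loop takes
the base-2 logarithm of that depth. A minimal standalone sketch, assuming
nothing beyond standard C; the toy_* names and TOY_MIN_WQE_NUM are
illustrative only and are not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Same value as HNS_ROCE_MIN_WQE_NUM in the patch; the name is illustrative. */
#define TOY_MIN_WQE_NUM 0x20

/* Round req up to a power of two, never going below the minimum depth. */
static int toy_align_qp_size(int req)
{
	int nent;

	for (nent = TOY_MIN_WQE_NUM; nent < req; nent <<= 1)
		;

	return nent;
}

/* Smallest shift such that (1 << shift) >= cnt, as in the log_sq_bb_count loop. */
static uint8_t toy_log2_ceil(int cnt)
{
	uint8_t shift;

	assert(cnt > 0);
	for (shift = 0; cnt > 1 << shift; ++shift)
		;

	return shift;
}
```

For example, a request for 100 send WRs would give a depth of 128 and a
log count of 7. The rounding loop assumes the request is small enough that
the doubling cannot overflow an int.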
* [PATCH rdma-core 4/7] libhns: Add verbs of cq support
From: Lijun Ou @ 2016-10-26 13:04 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1477487048-62256-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch introduces the CQ-related verbs for the hns userspace
provider, including:
1. create_cq
2. poll_cq
3. req_notify_cq
4. cq_event
5. destroy_cq
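
For readers unfamiliar with how poll_cq decides whether a CQE is ready,
the ownership test in get_sw_cqe() can be modelled in isolation. This is a
minimal sketch, assuming a power-of-two ring and a one-bit owner flag per
entry; the toy_cqe and toy_cq types and toy_get_sw_cqe() are hypothetical
names used for illustration and are not part of this patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one completion queue entry: only the owner bit. */
struct toy_cqe {
	uint8_t owner;
};

/* Hypothetical ring; 'mask' plays the role of ibv_cq.cqe (depth - 1). */
struct toy_cq {
	struct toy_cqe *ring;
	unsigned int mask;
};

/*
 * Mirrors the test in get_sw_cqe(): entry n is returned only when its
 * owner bit differs from the wrap parity of n, i.e. bit (mask + 1) of n.
 */
static struct toy_cqe *toy_get_sw_cqe(struct toy_cq *cq, unsigned int n)
{
	struct toy_cqe *cqe = &cq->ring[n & cq->mask];
	int wrap = !!(n & (cq->mask + 1));

	/* The masking above only works when the depth is a power of two. */
	assert(((cq->mask + 1) & cq->mask) == 0);

	return (!!cqe->owner ^ wrap) ? cqe : NULL;
}
```

Because the expected owner value flips on every pass around the ring, the
consumer index can keep counting past the depth and the same test still
tells a fresh entry from a stale one without clearing entries after use.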
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
providers/hns/hns_roce_u.c | 57 +++++-
providers/hns/hns_roce_u.h | 94 ++++++++++
providers/hns/hns_roce_u_abi.h | 13 +-
providers/hns/hns_roce_u_buf.c | 61 +++++++
providers/hns/hns_roce_u_db.h | 54 ++++++
providers/hns/hns_roce_u_hw_v1.c | 370 +++++++++++++++++++++++++++++++++++++++
providers/hns/hns_roce_u_hw_v1.h | 163 +++++++++++++++++
providers/hns/hns_roce_u_verbs.c | 116 ++++++++++++
8 files changed, 922 insertions(+), 6 deletions(-)
create mode 100644 providers/hns/hns_roce_u_buf.c
create mode 100644 providers/hns/hns_roce_u_db.h
create mode 100644 providers/hns/hns_roce_u_hw_v1.c
create mode 100644 providers/hns/hns_roce_u_hw_v1.h
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index 53e2720..e435bea 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -46,15 +46,19 @@
static const struct {
char hid[HID_LEN];
+ void *data;
+ int version;
} acpi_table[] = {
- {"acpi:HISI00D1:"},
- {},
+ {"acpi:HISI00D1:", &hns_roce_u_hw_v1, HNS_ROCE_HW_VER1},
+ {},
};
static const struct {
char compatible[DEV_MATCH_LEN];
+ void *data;
+ int version;
} dt_table[] = {
- {"hisilicon,hns-roce-v1"},
+ {"hisilicon,hns-roce-v1", &hns_roce_u_hw_v1, HNS_ROCE_HW_VER1},
{},
};
@@ -93,6 +97,21 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
goto err_free;
}
+ if (hr_dev->hw_version == HNS_ROCE_HW_VER1) {
+ /*
+ * when vma->vm_pgoff is 1, the cq_tptr_base includes 64K CQ,
+ * a pointer of CQ need 2B size
+ */
+ context->cq_tptr_base = mmap(NULL, HNS_ROCE_CQ_DB_BUF_SIZE,
+ PROT_READ | PROT_WRITE, MAP_SHARED,
+ cmd_fd, HNS_ROCE_TPTR_OFFSET);
+ if (context->cq_tptr_base == MAP_FAILED) {
+ fprintf(stderr,
+ PFX "Warning: Failed to mmap cq_tptr page.\n");
+ goto db_free;
+ }
+ }
+
pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE);
context->ibv_ctx.ops.query_device = hns_roce_u_query_device;
@@ -102,6 +121,12 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
context->ibv_ctx.ops.reg_mr = hns_roce_u_reg_mr;
context->ibv_ctx.ops.dereg_mr = hns_roce_u_dereg_mr;
+ context->ibv_ctx.ops.create_cq = hns_roce_u_create_cq;
+ context->ibv_ctx.ops.poll_cq = hr_dev->u_hw->poll_cq;
+ context->ibv_ctx.ops.req_notify_cq = hr_dev->u_hw->arm_cq;
+ context->ibv_ctx.ops.cq_event = hns_roce_u_cq_event;
+ context->ibv_ctx.ops.destroy_cq = hns_roce_u_destroy_cq;
+
if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
goto tptr_free;
@@ -112,6 +137,16 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
return &context->ibv_ctx;
tptr_free:
+ if (hr_dev->hw_version == HNS_ROCE_HW_VER1) {
+ if (munmap(context->cq_tptr_base, HNS_ROCE_CQ_DB_BUF_SIZE))
+ fprintf(stderr, PFX "Warning: Munmap tptr failed.\n");
+ context->cq_tptr_base = NULL;
+ }
+
+db_free:
+ munmap(context->uar, to_hr_dev(ibdev)->page_size);
+ context->uar = NULL;
+
err_free:
free(context);
return NULL;
@@ -122,6 +157,8 @@ static void hns_roce_free_context(struct ibv_context *ibctx)
struct hns_roce_context *context = to_hr_ctx(ibctx);
munmap(context->uar, to_hr_dev(ibctx->device)->page_size);
+ if (to_hr_dev(ibctx->device)->hw_version == HNS_ROCE_HW_VER1)
+ munmap(context->cq_tptr_base, HNS_ROCE_CQ_DB_BUF_SIZE);
context->uar = NULL;
@@ -140,18 +177,26 @@ static struct ibv_device *hns_roce_driver_init(const char *uverbs_sys_path,
struct hns_roce_device *dev;
char value[128];
int i;
+ void *u_hw;
+ int hw_version;
if (ibv_read_sysfs_file(uverbs_sys_path, "device/modalias",
value, sizeof(value)) > 0)
for (i = 0; i < sizeof(acpi_table) / sizeof(acpi_table[0]); ++i)
- if (!strcmp(value, acpi_table[i].hid))
+ if (!strcmp(value, acpi_table[i].hid)) {
+ u_hw = acpi_table[i].data;
+ hw_version = acpi_table[i].version;
goto found;
+ }
if (ibv_read_sysfs_file(uverbs_sys_path, "device/of_node/compatible",
value, sizeof(value)) > 0)
for (i = 0; i < sizeof(dt_table) / sizeof(dt_table[0]); ++i)
- if (!strcmp(value, dt_table[i].compatible))
+ if (!strcmp(value, dt_table[i].compatible)) {
+ u_hw = dt_table[i].data;
+ hw_version = dt_table[i].version;
goto found;
+ }
return NULL;
@@ -164,6 +209,8 @@ found:
}
dev->ibv_dev.ops = hns_roce_dev_ops;
+ dev->u_hw = (struct hns_roce_u_hw *)u_hw;
+ dev->hw_version = hw_version;
dev->page_size = sysconf(_SC_PAGESIZE);
return &dev->ibv_dev;
}
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index 8214054..a56ca3c 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -40,18 +40,53 @@
#include <infiniband/verbs.h>
#include <ccan/container_of.h>
+#define HNS_ROCE_CQE_ENTRY_SIZE 0x20
+
+#define HNS_ROCE_MAX_CQ_NUM 0x10000
+#define HNS_ROCE_MIN_CQE_NUM 0x40
+#define HNS_ROCE_CQ_DB_BUF_SIZE ((HNS_ROCE_MAX_CQ_NUM >> 11) << 12)
+#define HNS_ROCE_TPTR_OFFSET 0x1000
#define HNS_ROCE_HW_VER1 ('h' << 24 | 'i' << 16 | '0' << 8 | '6')
#define PFX "hns: "
+#define roce_get_field(origin, mask, shift) \
+ (((origin) & (mask)) >> (shift))
+
+#define roce_get_bit(origin, shift) \
+ roce_get_field((origin), (1ul << (shift)), (shift))
+
+#define roce_set_field(origin, mask, shift, val) \
+ do { \
+ (origin) &= (~(mask)); \
+ (origin) |= (((unsigned int)(val) << (shift)) & (mask)); \
+ } while (0)
+
+#define roce_set_bit(origin, shift, val) \
+ roce_set_field((origin), (1ul << (shift)), (shift), (val))
+
enum {
HNS_ROCE_QP_TABLE_BITS = 8,
HNS_ROCE_QP_TABLE_SIZE = 1 << HNS_ROCE_QP_TABLE_BITS,
};
+/* operation type list */
+enum {
+ /* rq&srq operation */
+ HNS_ROCE_OPCODE_SEND_DATA_RECEIVE = 0x06,
+ HNS_ROCE_OPCODE_RDMA_WITH_IMM_RECEIVE = 0x07,
+};
+
struct hns_roce_device {
struct ibv_device ibv_dev;
int page_size;
+ struct hns_roce_u_hw *u_hw;
+ int hw_version;
+};
+
+struct hns_roce_buf {
+ void *buf;
+ unsigned int length;
};
struct hns_roce_context {
@@ -59,7 +94,10 @@ struct hns_roce_context {
void *uar;
pthread_spinlock_t uar_lock;
+ void *cq_tptr_base;
+
struct {
+ struct hns_roce_qp **table;
int refcnt;
} qp_table[HNS_ROCE_QP_TABLE_SIZE];
@@ -78,6 +116,44 @@ struct hns_roce_pd {
unsigned int pdn;
};
+struct hns_roce_cq {
+ struct ibv_cq ibv_cq;
+ struct hns_roce_buf buf;
+ pthread_spinlock_t lock;
+ unsigned int cqn;
+ unsigned int cq_depth;
+ unsigned int cons_index;
+ unsigned int *set_ci_db;
+ unsigned int *arm_db;
+ int arm_sn;
+};
+
+struct hns_roce_wq {
+ unsigned long *wrid;
+ int wqe_cnt;
+ unsigned int tail;
+ int wqe_shift;
+ int offset;
+};
+
+struct hns_roce_qp {
+ struct ibv_qp ibv_qp;
+ struct hns_roce_buf buf;
+ unsigned int sq_signal_bits;
+ struct hns_roce_wq sq;
+ struct hns_roce_wq rq;
+};
+
+struct hns_roce_u_hw {
+ int (*poll_cq)(struct ibv_cq *ibvcq, int ne, struct ibv_wc *wc);
+ int (*arm_cq)(struct ibv_cq *ibvcq, int solicited);
+};
+
+static inline unsigned long align(unsigned long val, unsigned long align)
+{
+ return (val + align - 1) & ~(align - 1);
+}
+
static inline struct hns_roce_device *to_hr_dev(struct ibv_device *ibv_dev)
{
return container_of(ibv_dev, struct hns_roce_device, ibv_dev);
@@ -93,6 +169,11 @@ static inline struct hns_roce_pd *to_hr_pd(struct ibv_pd *ibv_pd)
return container_of(ibv_pd, struct hns_roce_pd, ibv_pd);
}
+static inline struct hns_roce_cq *to_hr_cq(struct ibv_cq *ibv_cq)
+{
+ return container_of(ibv_cq, struct hns_roce_cq, ibv_cq);
+}
+
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr);
int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
@@ -105,4 +186,17 @@ struct ibv_mr *hns_roce_u_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
int access);
int hns_roce_u_dereg_mr(struct ibv_mr *mr);
+struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
+ struct ibv_comp_channel *channel,
+ int comp_vector);
+
+int hns_roce_u_destroy_cq(struct ibv_cq *cq);
+void hns_roce_u_cq_event(struct ibv_cq *cq);
+
+int hns_roce_alloc_buf(struct hns_roce_buf *buf, unsigned int size,
+ int page_size);
+void hns_roce_free_buf(struct hns_roce_buf *buf);
+
+extern struct hns_roce_u_hw hns_roce_u_hw_v1;
+
#endif /* _HNS_ROCE_U_H */
diff --git a/providers/hns/hns_roce_u_abi.h b/providers/hns/hns_roce_u_abi.h
index edd0074..1e62a7e 100644
--- a/providers/hns/hns_roce_u_abi.h
+++ b/providers/hns/hns_roce_u_abi.h
@@ -46,5 +46,16 @@ struct hns_roce_alloc_pd_resp {
__u32 reserved;
};
-#endif /* _HNS_ROCE_U_ABI_H */
+struct hns_roce_create_cq {
+ struct ibv_create_cq ibv_cmd;
+ __u64 buf_addr;
+ __u64 db_addr;
+};
+
+struct hns_roce_create_cq_resp {
+ struct ibv_create_cq_resp ibv_resp;
+ __u32 cqn;
+ __u32 reserved;
+};
+#endif /* _HNS_ROCE_U_ABI_H */
diff --git a/providers/hns/hns_roce_u_buf.c b/providers/hns/hns_roce_u_buf.c
new file mode 100644
index 0000000..f92ea65
--- /dev/null
+++ b/providers/hns/hns_roce_u_buf.c
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <errno.h>
+#include <sys/mman.h>
+
+#include "hns_roce_u.h"
+
+int hns_roce_alloc_buf(struct hns_roce_buf *buf, unsigned int size,
+ int page_size)
+{
+ int ret;
+
+ buf->length = align(size, page_size);
+ buf->buf = mmap(NULL, buf->length, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ if (buf->buf == MAP_FAILED)
+ return errno;
+
+ ret = ibv_dontfork_range(buf->buf, size);
+ if (ret)
+ munmap(buf->buf, buf->length);
+
+ return ret;
+}
+
+void hns_roce_free_buf(struct hns_roce_buf *buf)
+{
+ ibv_dofork_range(buf->buf, buf->length);
+
+ munmap(buf->buf, buf->length);
+}
diff --git a/providers/hns/hns_roce_u_db.h b/providers/hns/hns_roce_u_db.h
new file mode 100644
index 0000000..76d13ce
--- /dev/null
+++ b/providers/hns/hns_roce_u_db.h
@@ -0,0 +1,54 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/types.h>
+
+#include "hns_roce_u.h"
+
+#ifndef _HNS_ROCE_U_DB_H
+#define _HNS_ROCE_U_DB_H
+
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define HNS_ROCE_PAIR_TO_64(val) ((uint64_t) val[1] << 32 | val[0])
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define HNS_ROCE_PAIR_TO_64(val) ((uint64_t) val[0] << 32 | val[1])
+#else
+#error __BYTE_ORDER not defined
+#endif
+
+static inline void hns_roce_write64(uint32_t val[2],
+ struct hns_roce_context *ctx, int offset)
+{
+ *(volatile uint64_t *) (ctx->uar + offset) = HNS_ROCE_PAIR_TO_64(val);
+}
+
+#endif /* _HNS_ROCE_U_DB_H */
diff --git a/providers/hns/hns_roce_u_hw_v1.c b/providers/hns/hns_roce_u_hw_v1.c
new file mode 100644
index 0000000..2676021
--- /dev/null
+++ b/providers/hns/hns_roce_u_hw_v1.c
@@ -0,0 +1,370 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <malloc.h>
+#include "hns_roce_u_db.h"
+#include "hns_roce_u_hw_v1.h"
+#include "hns_roce_u.h"
+
+static void hns_roce_update_cq_cons_index(struct hns_roce_context *ctx,
+ struct hns_roce_cq *cq)
+{
+ struct hns_roce_cq_db cq_db;
+
+ cq_db.u32_4 = 0;
+ cq_db.u32_8 = 0;
+
+ roce_set_bit(cq_db.u32_8, CQ_DB_U32_8_HW_SYNC_S, 1);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CMD_M, CQ_DB_U32_8_CMD_S, 3);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CMD_MDF_M,
+ CQ_DB_U32_8_CMD_MDF_S, 0);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CQN_M, CQ_DB_U32_8_CQN_S,
+ cq->cqn);
+ roce_set_field(cq_db.u32_4, CQ_DB_U32_4_CONS_IDX_M,
+ CQ_DB_U32_4_CONS_IDX_S,
+ cq->cons_index & ((cq->cq_depth << 1) - 1));
+
+ hns_roce_write64((uint32_t *)&cq_db, ctx, ROCEE_DB_OTHERS_L_0_REG);
+}
+
+static void hns_roce_handle_error_cqe(struct hns_roce_cqe *cqe,
+ struct ibv_wc *wc)
+{
+ fprintf(stderr, PFX "error cqe!\n");
+ switch (roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_STATUS_OF_THE_OPERATION_M,
+ CQE_BYTE_4_STATUS_OF_THE_OPERATION_S) &
+ HNS_ROCE_CQE_STATUS_MASK) {
+ case HNS_ROCE_CQE_SYNDROME_LOCAL_LENGTH_ERR:
+ wc->status = IBV_WC_LOC_LEN_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_LOCAL_QP_OP_ERR:
+ wc->status = IBV_WC_LOC_QP_OP_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_LOCAL_PROT_ERR:
+ wc->status = IBV_WC_LOC_PROT_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_WR_FLUSH_ERR:
+ wc->status = IBV_WC_WR_FLUSH_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_MEM_MANAGE_OPERATE_ERR:
+ wc->status = IBV_WC_MW_BIND_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_BAD_RESP_ERR:
+ wc->status = IBV_WC_BAD_RESP_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_LOCAL_ACCESS_ERR:
+ wc->status = IBV_WC_LOC_ACCESS_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR:
+ wc->status = IBV_WC_REM_INV_REQ_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_REMOTE_ACCESS_ERR:
+ wc->status = IBV_WC_REM_ACCESS_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_REMOTE_OP_ERR:
+ wc->status = IBV_WC_REM_OP_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR:
+ wc->status = IBV_WC_RETRY_EXC_ERR;
+ break;
+ case HNS_ROCE_CQE_SYNDROME_RNR_RETRY_EXC_ERR:
+ wc->status = IBV_WC_RNR_RETRY_EXC_ERR;
+ break;
+ default:
+ wc->status = IBV_WC_GENERAL_ERR;
+ break;
+ }
+}
+
+static struct hns_roce_cqe *get_cqe(struct hns_roce_cq *cq, int entry)
+{
+ return cq->buf.buf + entry * HNS_ROCE_CQE_ENTRY_SIZE;
+}
+
+static void *get_sw_cqe(struct hns_roce_cq *cq, int n)
+{
+ struct hns_roce_cqe *cqe = get_cqe(cq, n & cq->ibv_cq.cqe);
+
+ return (!!(roce_get_bit(cqe->cqe_byte_4, CQE_BYTE_4_OWNER_S)) ^
+ !!(n & (cq->ibv_cq.cqe + 1))) ? cqe : NULL;
+}
+
+static struct hns_roce_cqe *next_cqe_sw(struct hns_roce_cq *cq)
+{
+ return get_sw_cqe(cq, cq->cons_index);
+}
+
+static void *get_send_wqe(struct hns_roce_qp *qp, int n)
+{
+ if ((n < 0) || (n >= qp->sq.wqe_cnt)) {
+ printf("sq wqe index:%d,sq wqe cnt:%d\r\n", n, qp->sq.wqe_cnt);
+ return NULL;
+ }
+
+ return (void *)((uint64_t)(qp->buf.buf) + qp->sq.offset +
+ (n << qp->sq.wqe_shift));
+}
+
+static struct hns_roce_qp *hns_roce_find_qp(struct hns_roce_context *ctx,
+ uint32_t qpn)
+{
+ int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift;
+
+ if (ctx->qp_table[tind].refcnt) {
+ return ctx->qp_table[tind].table[qpn & ctx->qp_table_mask];
+ } else {
+ printf("hns_roce_find_qp fail!\n");
+ return NULL;
+ }
+}
+
+static int hns_roce_v1_poll_one(struct hns_roce_cq *cq,
+ struct hns_roce_qp **cur_qp, struct ibv_wc *wc)
+{
+ uint32_t qpn;
+ int is_send;
+ uint16_t wqe_ctr;
+ uint32_t local_qpn;
+ struct hns_roce_wq *wq = NULL;
+ struct hns_roce_cqe *cqe = NULL;
+ struct hns_roce_wqe_ctrl_seg *sq_wqe = NULL;
+
+ /* Find the cqe that corresponds to the current consumer index (CI) */
+ cqe = next_cqe_sw(cq);
+ if (!cqe)
+ return CQ_EMPTY;
+
+ /* Consume this cqe by advancing the CI */
+ ++cq->cons_index;
+
+ rmb();
+
+ qpn = roce_get_field(cqe->cqe_byte_16, CQE_BYTE_16_LOCAL_QPN_M,
+ CQE_BYTE_16_LOCAL_QPN_S);
+
+ is_send = (roce_get_bit(cqe->cqe_byte_4, CQE_BYTE_4_SQ_RQ_FLAG_S) ==
+ HNS_ROCE_CQE_IS_SQ);
+
+ local_qpn = roce_get_field(cqe->cqe_byte_16, CQE_BYTE_16_LOCAL_QPN_M,
+ CQE_BYTE_16_LOCAL_QPN_S);
+
+ /* Look up the qp if none is cached or the cached one does not match */
+ if (!*cur_qp ||
+ (local_qpn & HNS_ROCE_CQE_QPN_MASK) != (*cur_qp)->ibv_qp.qp_num) {
+
+ *cur_qp = hns_roce_find_qp(to_hr_ctx(cq->ibv_cq.context),
+ qpn & 0xffffff);
+ if (!*cur_qp) {
+ fprintf(stderr, PFX "can't find qp!\n");
+ return CQ_POLL_ERR;
+ }
+ }
+ wc->qp_num = qpn & 0xffffff;
+
+ if (is_send) {
+ wq = &(*cur_qp)->sq;
+ /*
+ * If sq_signal_bits is set, first advance the tail pointer to
+ * the wqe that corresponds to the current cqe
+ */
+ if ((*cur_qp)->sq_signal_bits) {
+ wqe_ctr = (uint16_t)(roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_WQE_INDEX_M,
+ CQE_BYTE_4_WQE_INDEX_S));
+ /*
+ * wq->tail only ever increases; when it overflows
+ * 32 bits it wraps to 0 and keeps accumulating
+ */
+ wq->tail += (wqe_ctr - (uint16_t) wq->tail) &
+ (wq->wqe_cnt - 1);
+ }
+ /* write the wr_id of wq into the wc */
+ wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+ ++wq->tail;
+ } else {
+ wq = &(*cur_qp)->rq;
+ wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+ ++wq->tail;
+ }
+
+ /*
+ * HW reports the completion status in the cqe. For a failed cqe,
+ * set the error type in the wc and return immediately
+ */
+ if (roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_STATUS_OF_THE_OPERATION_M,
+ CQE_BYTE_4_STATUS_OF_THE_OPERATION_S) != HNS_ROCE_CQE_SUCCESS) {
+ hns_roce_handle_error_cqe(cqe, wc);
+ return CQ_OK;
+ }
+ wc->status = IBV_WC_SUCCESS;
+
+ /*
+ * According to the opcode type of cqe, mark the opcode and other
+ * information of wc
+ */
+ if (is_send) {
+ /* Get opcode and flag from the send wqe named by the cqe */
+ sq_wqe = (struct hns_roce_wqe_ctrl_seg *)
+ (uint64_t)get_send_wqe(*cur_qp,
+ roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_WQE_INDEX_M,
+ CQE_BYTE_4_WQE_INDEX_S));
+ switch (sq_wqe->flag & HNS_ROCE_WQE_OPCODE_MASK) {
+ case HNS_ROCE_WQE_OPCODE_SEND:
+ wc->opcode = IBV_WC_SEND;
+ break;
+ case HNS_ROCE_WQE_OPCODE_RDMA_READ:
+ wc->opcode = IBV_WC_RDMA_READ;
+ wc->byte_len = cqe->byte_cnt;
+ break;
+ case HNS_ROCE_WQE_OPCODE_RDMA_WRITE:
+ wc->opcode = IBV_WC_RDMA_WRITE;
+ break;
+ case HNS_ROCE_WQE_OPCODE_BIND_MW2:
+ wc->opcode = IBV_WC_BIND_MW;
+ break;
+ default:
+ wc->status = IBV_WC_GENERAL_ERR;
+ break;
+ }
+ wc->wc_flags = (sq_wqe->flag & HNS_ROCE_WQE_IMM ?
+ IBV_WC_WITH_IMM : 0);
+ } else {
+ /* Get opcode and flag in rq&srq */
+ wc->byte_len = (cqe->byte_cnt);
+
+ switch (roce_get_field(cqe->cqe_byte_4,
+ CQE_BYTE_4_OPERATION_TYPE_M,
+ CQE_BYTE_4_OPERATION_TYPE_S) &
+ HNS_ROCE_CQE_OPCODE_MASK) {
+ case HNS_ROCE_OPCODE_RDMA_WITH_IMM_RECEIVE:
+ wc->opcode = IBV_WC_RECV_RDMA_WITH_IMM;
+ wc->wc_flags = IBV_WC_WITH_IMM;
+ wc->imm_data = cqe->immediate_data;
+ break;
+ case HNS_ROCE_OPCODE_SEND_DATA_RECEIVE:
+ if (roce_get_bit(cqe->cqe_byte_4,
+ CQE_BYTE_4_IMMEDIATE_DATA_FLAG_S)) {
+ wc->opcode = IBV_WC_RECV;
+ wc->wc_flags = IBV_WC_WITH_IMM;
+ wc->imm_data = cqe->immediate_data;
+ } else {
+ wc->opcode = IBV_WC_RECV;
+ wc->wc_flags = 0;
+ }
+ break;
+ default:
+ wc->status = IBV_WC_GENERAL_ERR;
+ break;
+ }
+ }
+
+ return CQ_OK;
+}
+
+static int hns_roce_u_v1_poll_cq(struct ibv_cq *ibvcq, int ne,
+ struct ibv_wc *wc)
+{
+ int npolled;
+ int err = CQ_OK;
+ struct hns_roce_qp *qp = NULL;
+ struct hns_roce_cq *cq = to_hr_cq(ibvcq);
+ struct hns_roce_context *ctx = to_hr_ctx(ibvcq->context);
+ struct hns_roce_device *dev = to_hr_dev(ibvcq->context->device);
+
+ pthread_spin_lock(&cq->lock);
+
+ for (npolled = 0; npolled < ne; ++npolled) {
+ err = hns_roce_v1_poll_one(cq, &qp, wc + npolled);
+ if (err != CQ_OK)
+ break;
+ }
+
+ if (npolled) {
+ if (dev->hw_version == HNS_ROCE_HW_VER1) {
+ *cq->set_ci_db = (unsigned short)(cq->cons_index &
+ ((cq->cq_depth << 1) - 1));
+ mb();
+ }
+
+ hns_roce_update_cq_cons_index(ctx, cq);
+ }
+
+ pthread_spin_unlock(&cq->lock);
+
+ return err == CQ_POLL_ERR ? err : npolled;
+}
+
+/**
+ * hns_roce_u_v1_arm_cq - request completion notification on a CQ
+ * @ibvcq: The completion queue to request notification for.
+ * @solicited: If non-zero, an event will be generated only for
+ * the next solicited CQ entry. If zero, any CQ entry,
+ * solicited or not, will generate an event
+ */
+static int hns_roce_u_v1_arm_cq(struct ibv_cq *ibvcq, int solicited)
+{
+ uint32_t ci;
+ uint32_t solicited_flag;
+ struct hns_roce_cq_db cq_db;
+ struct hns_roce_cq *cq = to_hr_cq(ibvcq);
+
+ ci = cq->cons_index & ((cq->cq_depth << 1) - 1);
+ solicited_flag = solicited ? HNS_ROCE_CQ_DB_REQ_SOL :
+ HNS_ROCE_CQ_DB_REQ_NEXT;
+
+ cq_db.u32_4 = 0;
+ cq_db.u32_8 = 0;
+
+ roce_set_bit(cq_db.u32_8, CQ_DB_U32_8_HW_SYNC_S, 1);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CMD_M, CQ_DB_U32_8_CMD_S, 3);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CMD_MDF_M,
+ CQ_DB_U32_8_CMD_MDF_S, 1);
+ roce_set_bit(cq_db.u32_8, CQ_DB_U32_8_NOTIFY_TYPE_S, solicited_flag);
+ roce_set_field(cq_db.u32_8, CQ_DB_U32_8_CQN_M, CQ_DB_U32_8_CQN_S,
+ cq->cqn);
+ roce_set_field(cq_db.u32_4, CQ_DB_U32_4_CONS_IDX_M,
+ CQ_DB_U32_4_CONS_IDX_S, ci);
+
+ hns_roce_write64((uint32_t *)&cq_db, to_hr_ctx(ibvcq->context),
+ ROCEE_DB_OTHERS_L_0_REG);
+ return 0;
+}
+
+struct hns_roce_u_hw hns_roce_u_hw_v1 = {
+ .poll_cq = hns_roce_u_v1_poll_cq,
+ .arm_cq = hns_roce_u_v1_arm_cq,
+};
diff --git a/providers/hns/hns_roce_u_hw_v1.h b/providers/hns/hns_roce_u_hw_v1.h
new file mode 100644
index 0000000..b249f54
--- /dev/null
+++ b/providers/hns/hns_roce_u_hw_v1.h
@@ -0,0 +1,163 @@
+/*
+ * Copyright (c) 2016 Hisilicon Limited.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _HNS_ROCE_U_HW_V1_H
+#define _HNS_ROCE_U_HW_V1_H
+
+#define HNS_ROCE_CQ_DB_REQ_SOL 1
+#define HNS_ROCE_CQ_DB_REQ_NEXT 0
+
+#define HNS_ROCE_CQE_IS_SQ 0
+
+#define HNS_ROCE_RC_WQE_INLINE_DATA_MAX_LEN 32
+
+enum {
+ HNS_ROCE_WQE_IMM = 1 << 23,
+ HNS_ROCE_WQE_OPCODE_SEND = 0 << 16,
+ HNS_ROCE_WQE_OPCODE_RDMA_READ = 1 << 16,
+ HNS_ROCE_WQE_OPCODE_RDMA_WRITE = 2 << 16,
+ HNS_ROCE_WQE_OPCODE_BIND_MW2 = 6 << 16,
+ HNS_ROCE_WQE_OPCODE_MASK = 15 << 16,
+};
+
+struct hns_roce_wqe_ctrl_seg {
+ __be32 sgl_pa_h;
+ __be32 flag;
+};
+
+enum {
+ CQ_OK = 0,
+ CQ_EMPTY = -1,
+ CQ_POLL_ERR = -2,
+};
+
+enum {
+ HNS_ROCE_CQE_QPN_MASK = 0x3ffff,
+ HNS_ROCE_CQE_STATUS_MASK = 0x1f,
+ HNS_ROCE_CQE_OPCODE_MASK = 0xf,
+};
+
+enum {
+ HNS_ROCE_CQE_SUCCESS,
+ HNS_ROCE_CQE_SYNDROME_LOCAL_LENGTH_ERR,
+ HNS_ROCE_CQE_SYNDROME_LOCAL_QP_OP_ERR,
+ HNS_ROCE_CQE_SYNDROME_LOCAL_PROT_ERR,
+ HNS_ROCE_CQE_SYNDROME_WR_FLUSH_ERR,
+ HNS_ROCE_CQE_SYNDROME_MEM_MANAGE_OPERATE_ERR,
+ HNS_ROCE_CQE_SYNDROME_BAD_RESP_ERR,
+ HNS_ROCE_CQE_SYNDROME_LOCAL_ACCESS_ERR,
+ HNS_ROCE_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR,
+ HNS_ROCE_CQE_SYNDROME_REMOTE_ACCESS_ERR,
+ HNS_ROCE_CQE_SYNDROME_REMOTE_OP_ERR,
+ HNS_ROCE_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR,
+ HNS_ROCE_CQE_SYNDROME_RNR_RETRY_EXC_ERR,
+};
+
+struct hns_roce_cq_db {
+ unsigned int u32_4;
+ unsigned int u32_8;
+};
+#define CQ_DB_U32_4_CONS_IDX_S 0
+#define CQ_DB_U32_4_CONS_IDX_M (((1UL << 16) - 1) << CQ_DB_U32_4_CONS_IDX_S)
+
+#define CQ_DB_U32_8_CQN_S 0
+#define CQ_DB_U32_8_CQN_M (((1UL << 16) - 1) << CQ_DB_U32_8_CQN_S)
+
+#define CQ_DB_U32_8_NOTIFY_TYPE_S 16
+
+#define CQ_DB_U32_8_CMD_MDF_S 24
+#define CQ_DB_U32_8_CMD_MDF_M (((1UL << 4) - 1) << CQ_DB_U32_8_CMD_MDF_S)
+
+#define CQ_DB_U32_8_CMD_S 28
+#define CQ_DB_U32_8_CMD_M (((1UL << 3) - 1) << CQ_DB_U32_8_CMD_S)
+
+#define CQ_DB_U32_8_HW_SYNC_S 31
+
+struct hns_roce_cqe {
+ unsigned int cqe_byte_4;
+ union {
+ unsigned int r_key;
+ unsigned int immediate_data;
+ };
+ unsigned int byte_cnt;
+ unsigned int cqe_byte_16;
+ unsigned int cqe_byte_20;
+ unsigned int s_mac_l;
+ unsigned int cqe_byte_28;
+ unsigned int reserved;
+};
+#define CQE_BYTE_4_OPERATION_TYPE_S 0
+#define CQE_BYTE_4_OPERATION_TYPE_M \
+ (((1UL << 4) - 1) << CQE_BYTE_4_OPERATION_TYPE_S)
+
+#define CQE_BYTE_4_OWNER_S 7
+
+#define CQE_BYTE_4_STATUS_OF_THE_OPERATION_S 8
+#define CQE_BYTE_4_STATUS_OF_THE_OPERATION_M \
+ (((1UL << 5) - 1) << CQE_BYTE_4_STATUS_OF_THE_OPERATION_S)
+
+#define CQE_BYTE_4_SQ_RQ_FLAG_S 14
+
+#define CQE_BYTE_4_IMMEDIATE_DATA_FLAG_S 15
+
+#define CQE_BYTE_4_WQE_INDEX_S 16
+#define CQE_BYTE_4_WQE_INDEX_M (((1UL << 14) - 1) << CQE_BYTE_4_WQE_INDEX_S)
+
+#define CQE_BYTE_16_LOCAL_QPN_S 0
+#define CQE_BYTE_16_LOCAL_QPN_M (((1UL << 24) - 1) << CQE_BYTE_16_LOCAL_QPN_S)
+
+#define ROCEE_DB_SQ_L_0_REG 0x230
+
+#define ROCEE_DB_OTHERS_L_0_REG 0x238
+
+struct hns_roce_rc_send_wqe {
+ unsigned int sgl_ba_31_0;
+ unsigned int u32_1;
+ union {
+ unsigned int r_key;
+ unsigned int immediate_data;
+ };
+ unsigned int msg_length;
+ unsigned int rvd_3;
+ unsigned int rvd_4;
+ unsigned int rvd_5;
+ unsigned int rvd_6;
+ uint64_t va0;
+ unsigned int l_key0;
+ unsigned int length0;
+
+ uint64_t va1;
+ unsigned int l_key1;
+ unsigned int length1;
+};
+
+#endif /* _HNS_ROCE_U_HW_V1_H */
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index 249d1aa..077cddc 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -40,6 +40,8 @@
#include <unistd.h>
#include "hns_roce_u.h"
+#include "hns_roce_u_abi.h"
+#include "hns_roce_u_hw_v1.h"
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr)
@@ -150,3 +152,117 @@ int hns_roce_u_dereg_mr(struct ibv_mr *mr)
return ret;
}
+
+static int align_cq_size(int req)
+{
+ int nent;
+
+ for (nent = HNS_ROCE_MIN_CQE_NUM; nent < req; nent <<= 1)
+ ;
+
+ return nent;
+}
+
+static int hns_roce_verify_cq(int *cqe, struct hns_roce_context *context)
+{
+ if (*cqe < HNS_ROCE_MIN_CQE_NUM) {
+ fprintf(stderr, "cqe = %d, less than minimum CQE number.\n",
+ *cqe);
+ *cqe = HNS_ROCE_MIN_CQE_NUM;
+ }
+
+ if (*cqe > context->max_cqe)
+ return -1;
+
+ return 0;
+}
+
+static int hns_roce_alloc_cq_buf(struct hns_roce_device *dev,
+ struct hns_roce_buf *buf, int nent)
+{
+ if (hns_roce_alloc_buf(buf,
+ align(nent * HNS_ROCE_CQE_ENTRY_SIZE, dev->page_size),
+ dev->page_size))
+ return -1;
+ memset(buf->buf, 0, nent * HNS_ROCE_CQE_ENTRY_SIZE);
+
+ return 0;
+}
+
+struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe,
+ struct ibv_comp_channel *channel,
+ int comp_vector)
+{
+ struct hns_roce_create_cq cmd;
+ struct hns_roce_create_cq_resp resp;
+ struct hns_roce_cq *cq;
+ int ret;
+
+ if (hns_roce_verify_cq(&cqe, to_hr_ctx(context)))
+ return NULL;
+
+ cq = malloc(sizeof(*cq));
+ if (!cq)
+ return NULL;
+
+ cq->cons_index = 0;
+
+ if (pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE))
+ goto err;
+
+ cqe = align_cq_size(cqe);
+
+ if (hns_roce_alloc_cq_buf(to_hr_dev(context->device), &cq->buf, cqe))
+ goto err;
+
+ cmd.buf_addr = (uintptr_t) cq->buf.buf;
+
+ ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector,
+ &cq->ibv_cq, &cmd.ibv_cmd, sizeof(cmd),
+ &resp.ibv_resp, sizeof(resp));
+ if (ret)
+ goto err_db;
+
+ cq->cqn = resp.cqn;
+ cq->cq_depth = cqe;
+
+ if (to_hr_dev(context->device)->hw_version == HNS_ROCE_HW_VER1)
+ cq->set_ci_db = to_hr_ctx(context)->cq_tptr_base + cq->cqn * 2;
+ else
+ cq->set_ci_db = to_hr_ctx(context)->uar +
+ ROCEE_DB_OTHERS_L_0_REG;
+
+ cq->arm_db = cq->set_ci_db;
+ cq->arm_sn = 1;
+ *(cq->set_ci_db) = 0;
+ *(cq->arm_db) = 0;
+
+ return &cq->ibv_cq;
+
+err_db:
+ hns_roce_free_buf(&cq->buf);
+
+err:
+ free(cq);
+
+ return NULL;
+}
+
+void hns_roce_u_cq_event(struct ibv_cq *cq)
+{
+ to_hr_cq(cq)->arm_sn++;
+}
+
+int hns_roce_u_destroy_cq(struct ibv_cq *cq)
+{
+ int ret;
+
+ ret = ibv_cmd_destroy_cq(cq);
+ if (ret)
+ return ret;
+
+ hns_roce_free_buf(&to_hr_cq(cq)->buf);
+ free(to_hr_cq(cq));
+
+ return ret;
+}
--
1.9.1
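A note on the polling logic in this patch: get_sw_cqe() decides whether a CQE belongs to software by XOR-ing the CQE owner bit with the lap parity of the consumer index, and hns_roce_v1_poll_one() moves wq->tail forward to the WQE index reported in the CQE. The sketch below models both steps in isolation. It is only an illustration under stated assumptions: DEPTH, struct cqe, sw_cqe(), hw_produce() and catch_up_tail() are hypothetical names, and the producer behaviour (owner = 1 on even laps, 0 on odd laps) is inferred from the XOR test and the zeroed CQ buffer, not taken from a hardware manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEPTH 4u /* hypothetical ring depth; must be a power of two */

/* Hypothetical CQE: only the owner bit is modelled. */
struct cqe {
	uint32_t owner;
};

/* Return the entry for consumer index n if software owns it, else NULL. */
static struct cqe *sw_cqe(struct cqe *ring, uint32_t n)
{
	struct cqe *e = &ring[n & (DEPTH - 1)];
	uint32_t lap = !!(n & DEPTH); /* parity of the consumer's lap */

	return (e->owner ^ lap) ? e : NULL;
}

/* Assumed producer model: owner = 1 on even laps, 0 on odd laps. */
static void hw_produce(struct cqe *ring, uint32_t pi)
{
	ring[pi & (DEPTH - 1)].owner = !(pi & DEPTH);
}

/*
 * Advance a free-running tail to the slot named by a completion.
 * wqe_cnt must be a power of two; unsigned arithmetic lets the
 * subtraction wrap, so the step is always in [0, wqe_cnt - 1].
 */
static uint32_t catch_up_tail(uint32_t tail, uint16_t wqe_idx,
			      uint32_t wqe_cnt)
{
	assert(wqe_cnt && !(wqe_cnt & (wqe_cnt - 1)));
	return tail + (((uint32_t)wqe_idx - tail) & (wqe_cnt - 1));
}
```

With a zeroed ring, sw_cqe() returns NULL until hw_produce() has written the slot, and a stale entry from the previous lap is rejected because its owner bit no longer differs from the lap parity. The same masking keeps the tail free-running while still indexing wrid[] with tail & (wqe_cnt - 1).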
* [PATCH rdma-core 3/7] libhns: Add verbs of pd and mr support
From: Lijun Ou @ 2016-10-26 13:04 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
Cc: linuxarm-hv44wF8Li93QT0dZR+AlfA
In-Reply-To: <1477487048-62256-1-git-send-email-oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
This patch introduces the PD and MR verbs:
alloc_pd, dealloc_pd, reg_mr and dereg_mr.
Signed-off-by: Lijun Ou <oulijun-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Wei Hu <xavier.huwei-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
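Note (not part of the commit message): the PD verbs here follow the usual provider pattern of embedding the generic verbs object inside a provider structure and recovering the outer structure with container_of(), as to_hr_pd() does. A minimal, self-contained sketch of that pattern follows. The types and helpers in it (struct base_pd, struct provider_pd, to_provider_pd(), pd_alloc(), pd_free()) are hypothetical stand-ins so the example compiles without libibverbs; it is not the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct ibv_pd; the real type lives in libibverbs. */
struct base_pd {
	unsigned int handle;
};

/* Same shape as struct hns_roce_pd: generic object plus provider data. */
struct provider_pd {
	struct base_pd base;
	unsigned int pdn;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct provider_pd *to_provider_pd(struct base_pd *pd)
{
	return container_of(pd, struct provider_pd, base);
}

/* Allocate the provider object and hand out only the generic part. */
static struct base_pd *pd_alloc(unsigned int pdn)
{
	struct provider_pd *pd = malloc(sizeof(*pd));

	if (!pd)
		return NULL;

	pd->base.handle = 0;
	pd->pdn = pdn;

	return &pd->base;
}

/* Free through the provider pointer, since that is what malloc returned. */
static void pd_free(struct base_pd *pd)
{
	free(to_provider_pd(pd));
}
```

Callers only ever see the generic pointer; the provider converts back when it needs its private fields (here pdn) or when it frees the allocation.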
providers/hns/hns_roce_u.c | 4 ++
providers/hns/hns_roce_u.h | 18 +++++++++
providers/hns/hns_roce_u_abi.h | 6 +++
providers/hns/hns_roce_u_verbs.c | 79 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 107 insertions(+)
diff --git a/providers/hns/hns_roce_u.c b/providers/hns/hns_roce_u.c
index c0f6fe9..53e2720 100644
--- a/providers/hns/hns_roce_u.c
+++ b/providers/hns/hns_roce_u.c
@@ -97,6 +97,10 @@ static struct ibv_context *hns_roce_alloc_context(struct ibv_device *ibdev,
context->ibv_ctx.ops.query_device = hns_roce_u_query_device;
context->ibv_ctx.ops.query_port = hns_roce_u_query_port;
+ context->ibv_ctx.ops.alloc_pd = hns_roce_u_alloc_pd;
+ context->ibv_ctx.ops.dealloc_pd = hns_roce_u_free_pd;
+ context->ibv_ctx.ops.reg_mr = hns_roce_u_reg_mr;
+ context->ibv_ctx.ops.dereg_mr = hns_roce_u_dereg_mr;
if (hns_roce_u_query_device(&context->ibv_ctx, &dev_attrs))
goto tptr_free;
diff --git a/providers/hns/hns_roce_u.h b/providers/hns/hns_roce_u.h
index 0703d1c..8214054 100644
--- a/providers/hns/hns_roce_u.h
+++ b/providers/hns/hns_roce_u.h
@@ -73,6 +73,11 @@ struct hns_roce_context {
int max_cqe;
};
+struct hns_roce_pd {
+ struct ibv_pd ibv_pd;
+ unsigned int pdn;
+};
+
static inline struct hns_roce_device *to_hr_dev(struct ibv_device *ibv_dev)
{
return container_of(ibv_dev, struct hns_roce_device, ibv_dev);
@@ -83,8 +88,21 @@ static inline struct hns_roce_context *to_hr_ctx(struct ibv_context *ibv_ctx)
return container_of(ibv_ctx, struct hns_roce_context, ibv_ctx);
}
+static inline struct hns_roce_pd *to_hr_pd(struct ibv_pd *ibv_pd)
+{
+ return container_of(ibv_pd, struct hns_roce_pd, ibv_pd);
+}
+
int hns_roce_u_query_device(struct ibv_context *context,
struct ibv_device_attr *attr);
int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
struct ibv_port_attr *attr);
+
+struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context);
+int hns_roce_u_free_pd(struct ibv_pd *pd);
+
+struct ibv_mr *hns_roce_u_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
+ int access);
+int hns_roce_u_dereg_mr(struct ibv_mr *mr);
+
#endif /* _HNS_ROCE_U_H */
diff --git a/providers/hns/hns_roce_u_abi.h b/providers/hns/hns_roce_u_abi.h
index b9e31b5..edd0074 100644
--- a/providers/hns/hns_roce_u_abi.h
+++ b/providers/hns/hns_roce_u_abi.h
@@ -40,5 +40,11 @@ struct hns_roce_alloc_ucontext_resp {
__u32 qp_tab_size;
};
+struct hns_roce_alloc_pd_resp {
+ struct ibv_alloc_pd_resp ibv_resp;
+ __u32 pdn;
+ __u32 reserved;
+};
+
#endif /* _HNS_ROCE_U_ABI_H */
diff --git a/providers/hns/hns_roce_u_verbs.c b/providers/hns/hns_roce_u_verbs.c
index be55fe8..249d1aa 100644
--- a/providers/hns/hns_roce_u_verbs.c
+++ b/providers/hns/hns_roce_u_verbs.c
@@ -71,3 +71,82 @@ int hns_roce_u_query_port(struct ibv_context *context, uint8_t port,
return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd));
}
+
+struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context)
+{
+ struct ibv_alloc_pd cmd;
+ struct hns_roce_pd *pd;
+ struct hns_roce_alloc_pd_resp resp;
+
+ pd = (struct hns_roce_pd *)malloc(sizeof(*pd));
+ if (!pd)
+ return NULL;
+
+ if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd),
+ &resp.ibv_resp, sizeof(resp))) {
+ free(pd);
+ return NULL;
+ }
+
+ pd->pdn = resp.pdn;
+
+ return &pd->ibv_pd;
+}
+
+int hns_roce_u_free_pd(struct ibv_pd *pd)
+{
+ int ret;
+
+ ret = ibv_cmd_dealloc_pd(pd);
+ if (ret)
+ return ret;
+
+ free(to_hr_pd(pd));
+
+ return ret;
+}
+
+struct ibv_mr *hns_roce_u_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
+ int access)
+{
+ int ret;
+ struct ibv_mr *mr;
+ struct ibv_reg_mr cmd;
+ struct ibv_reg_mr_resp resp;
+
+ if (addr == NULL) {
+ fprintf(stderr, "2nd parm addr is NULL!\n");
+ return NULL;
+ }
+
+ if (length == 0) {
+ fprintf(stderr, "3rd parm length is 0!\n");
+ return NULL;
+ }
+
+ mr = malloc(sizeof(*mr));
+ if (!mr)
+ return NULL;
+
+ ret = ibv_cmd_reg_mr(pd, addr, length, (uintptr_t) addr, access, mr,
+ &cmd, sizeof(cmd), &resp, sizeof(resp));
+ if (ret) {
+ free(mr);
+ return NULL;
+ }
+
+ return mr;
+}
+
+int hns_roce_u_dereg_mr(struct ibv_mr *mr)
+{
+ int ret;
+
+ ret = ibv_cmd_dereg_mr(mr);
+ if (ret)
+ return ret;
+
+ free(mr);
+
+ return ret;
+}
--
1.9.1