[parent not found: <20150724161331.25617.8475.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>]
* [PATCH V6 1/9] RDMA/iser: Limit sg tablesize and max_sectors to device fastreg max depth
[not found] ` <20150724161331.25617.8475.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
@ 2015-07-24 16:18 ` Steve Wise
[not found] ` <20150724161820.25617.63886.stgit-PBeJgSbIpn97NCTnQtmixQ@public.gmane.org>
2015-07-24 16:18 ` [PATCH V6 2/9] mlx4, mlx5, mthca: Expose max_sge_rd correctly Steve Wise
` (3 subsequent siblings)
4 siblings, 1 reply; 45+ messages in thread
From: Steve Wise @ 2015-07-24 16:18 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: infinipath-ral2JQCrhuEAvxtiuMwx3w, sagig-VPRAkNaXOzVWk0Htik3J/w,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roid-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, eli-VPRAkNaXOzVWk0Htik3J/w,
target-devel-u79uwXL29TY76Z2rM5mHXA,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, bfields-uC3wQj2KruNg9hUCZPvPmw
Currently the sg tablesize, which dictates the fast register page list
depth to use, does not take the limits of the RDMA device into account.
So adjust it once we discover the device's fastreg max depth limit. Also
adjust max_sectors based on the resulting sg tablesize.
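The two clamps described above are simple arithmetic; a minimal userspace sketch (function names and the standalone helpers are hypothetical, not the iser driver's API) would be:

```c
#include <assert.h>

#define SECTOR_SHIFT 9          /* 512-byte sectors */

/* Clamp the SCSI host sg_tablesize to the device's fastreg depth limit,
 * mirroring min_t(unsigned short, ...) in the patch. */
static unsigned short cap_sg_tablesize(unsigned short sg_tablesize,
                                       unsigned int max_fastreg_depth)
{
        return sg_tablesize < max_fastreg_depth ?
               sg_tablesize : (unsigned short)max_fastreg_depth;
}

/* Derive max_sectors from the (capped) sg_tablesize, bounded at 1024,
 * mirroring min_t(unsigned int, 1024, (sg_tablesize * PAGE_SIZE) >> 9). */
static unsigned int cap_max_sectors(unsigned short sg_tablesize,
                                    unsigned long page_size)
{
        unsigned long sectors = ((unsigned long)sg_tablesize * page_size)
                                >> SECTOR_SHIFT;

        return sectors < 1024 ? (unsigned int)sectors : 1024;
}
```

For example, with a hypothetical 4 KiB page size, a device fastreg limit of 128 caps a 256-entry table to 128 entries, which in turn yields 1024 max sectors.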
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/infiniband/ulp/iser/iscsi_iser.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 6a594aa..de8730d 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -640,6 +640,15 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
SHOST_DIX_GUARD_CRC);
}
+ /*
+ * Limit the sg_tablesize and max_sectors based on the device
+ * max fastreg page list length.
+ */
+ shost->sg_tablesize = min_t(unsigned short, shost->sg_tablesize,
+ ib_conn->device->dev_attr.max_fast_reg_page_list_len);
+ shost->max_sectors = min_t(unsigned int,
+ 1024, (shost->sg_tablesize * PAGE_SIZE) >> 9);
+
if (iscsi_host_add(shost,
ib_conn->device->ib_device->dma_device)) {
mutex_unlock(&iser_conn->state_mutex);
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 45+ messages in thread* [PATCH V6 2/9] mlx4, mlx5, mthca: Expose max_sge_rd correctly
From: Steve Wise @ 2015-07-24 16:18 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: infinipath-ral2JQCrhuEAvxtiuMwx3w, sagig-VPRAkNaXOzVWk0Htik3J/w,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roid-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, eli-VPRAkNaXOzVWk0Htik3J/w,
target-devel-u79uwXL29TY76Z2rM5mHXA,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, bfields-uC3wQj2KruNg9hUCZPvPmw
From: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Applications must not assume that max_sge and max_sge_rd are the same;
hence, expose max_sge_rd correctly as well.
Reported-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Signed-off-by: Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx4/main.c | 1 +
drivers/infiniband/hw/mlx5/main.c | 1 +
drivers/infiniband/hw/mthca/mthca_provider.c | 1 +
3 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 8be6db8..05166b7 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -229,6 +229,7 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
props->max_qp_wr = dev->dev->caps.max_wqes - MLX4_IB_SQ_MAX_SPARE;
props->max_sge = min(dev->dev->caps.max_sq_sg,
dev->dev->caps.max_rq_sg);
+ props->max_sge_rd = props->max_sge;
props->max_cq = dev->dev->quotas.cq;
props->max_cqe = dev->dev->caps.max_cqes;
props->max_mr = dev->dev->quotas.mpt;
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 085c24b..bf27f21 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -273,6 +273,7 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
sizeof(struct mlx5_wqe_ctrl_seg)) /
sizeof(struct mlx5_wqe_data_seg);
props->max_sge = min(max_rq_sg, max_sq_sg);
+ props->max_sge_rd = props->max_sge;
props->max_cq = 1 << MLX5_CAP_GEN(mdev, log_max_cq);
props->max_cqe = (1 << MLX5_CAP_GEN(mdev, log_max_eq_sz)) - 1;
props->max_mr = 1 << MLX5_CAP_GEN(mdev, log_max_mkey);
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 93ae51d..dc2d48c 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -97,6 +97,7 @@ static int mthca_query_device(struct ib_device *ibdev, struct ib_device_attr *pr
props->max_qp = mdev->limits.num_qps - mdev->limits.reserved_qps;
props->max_qp_wr = mdev->limits.max_wqes;
props->max_sge = mdev->limits.max_sg;
+ props->max_sge_rd = props->max_sge;
props->max_cq = mdev->limits.num_cqs - mdev->limits.reserved_cqs;
props->max_cqe = mdev->limits.max_cqes;
props->max_mr = mdev->limits.num_mpts - mdev->limits.reserved_mrws;
--
* [PATCH V6 3/9] ipath,qib: Expose max_sge_rd correctly
From: Steve Wise @ 2015-07-24 16:18 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: infinipath-ral2JQCrhuEAvxtiuMwx3w, sagig-VPRAkNaXOzVWk0Htik3J/w,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roid-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, eli-VPRAkNaXOzVWk0Htik3J/w,
target-devel-u79uwXL29TY76Z2rM5mHXA,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, bfields-uC3wQj2KruNg9hUCZPvPmw
Applications must not assume that max_sge and max_sge_rd are the same;
hence, expose max_sge_rd correctly as well.
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Acked-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
drivers/infiniband/hw/ipath/ipath_verbs.c | 1 +
drivers/infiniband/hw/qib/qib_verbs.c | 1 +
2 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
index 30ba49c..ed2bbc2 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
@@ -1521,6 +1521,7 @@ static int ipath_query_device(struct ib_device *ibdev, struct ib_device_attr *pr
props->max_qp = ib_ipath_max_qps;
props->max_qp_wr = ib_ipath_max_qp_wrs;
props->max_sge = ib_ipath_max_sges;
+ props->max_sge_rd = ib_ipath_max_sges;
props->max_cq = ib_ipath_max_cqs;
props->max_ah = ib_ipath_max_ahs;
props->max_cqe = ib_ipath_max_cqes;
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index a05d1a3..bc723b5 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1574,6 +1574,7 @@ static int qib_query_device(struct ib_device *ibdev, struct ib_device_attr *prop
props->max_qp = ib_qib_max_qps;
props->max_qp_wr = ib_qib_max_qp_wrs;
props->max_sge = ib_qib_max_sges;
+ props->max_sge_rd = ib_qib_max_sges;
props->max_cq = ib_qib_max_cqs;
props->max_ah = ib_qib_max_ahs;
props->max_cqe = ib_qib_max_cqes;
--
* [PATCH V6 4/9] svcrdma: Use max_sge_rd for destination read depths
From: Steve Wise @ 2015-07-24 16:18 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: infinipath-ral2JQCrhuEAvxtiuMwx3w, sagig-VPRAkNaXOzVWk0Htik3J/w,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roid-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, eli-VPRAkNaXOzVWk0Htik3J/w,
target-devel-u79uwXL29TY76Z2rM5mHXA,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, bfields-uC3wQj2KruNg9hUCZPvPmw
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
include/linux/sunrpc/svc_rdma.h | 1 +
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 12 +-----------
net/sunrpc/xprtrdma/svc_rdma_transport.c | 4 ++++
3 files changed, 6 insertions(+), 11 deletions(-)
diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h
index cb94ee4..83211bc 100644
--- a/include/linux/sunrpc/svc_rdma.h
+++ b/include/linux/sunrpc/svc_rdma.h
@@ -132,6 +132,7 @@ struct svcxprt_rdma {
struct list_head sc_accept_q; /* Conn. waiting accept */
int sc_ord; /* RDMA read limit */
int sc_max_sge;
+ int sc_max_sge_rd; /* max sge for read target */
int sc_sq_depth; /* Depth of SQ */
atomic_t sc_sq_count; /* Number of SQ WR on queue */
diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 2e1348b..cb51742 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -115,15 +115,6 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
rqstp->rq_arg.tail[0].iov_len = 0;
}
-static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
-{
- if (!rdma_cap_read_multi_sge(xprt->sc_cm_id->device,
- xprt->sc_cm_id->port_num))
- return 1;
- else
- return min_t(int, sge_count, xprt->sc_max_sge);
-}
-
/* Issue an RDMA_READ using the local lkey to map the data sink */
int rdma_read_chunk_lcl(struct svcxprt_rdma *xprt,
struct svc_rqst *rqstp,
@@ -144,8 +135,7 @@ int rdma_read_chunk_lcl(struct svcxprt_rdma *xprt,
ctxt->direction = DMA_FROM_DEVICE;
ctxt->read_hdr = head;
- pages_needed =
- min_t(int, pages_needed, rdma_read_max_sge(xprt, pages_needed));
+ pages_needed = min_t(int, pages_needed, xprt->sc_max_sge_rd);
read = min_t(int, pages_needed << PAGE_SHIFT, rs_length);
for (pno = 0; pno < pages_needed; pno++) {
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index 6b36279..fdc850f 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -872,6 +872,8 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
* capabilities of this particular device */
newxprt->sc_max_sge = min((size_t)devattr.max_sge,
(size_t)RPCSVC_MAXPAGES);
+ newxprt->sc_max_sge_rd = min_t(size_t, devattr.max_sge_rd,
+ RPCSVC_MAXPAGES);
newxprt->sc_max_requests = min((size_t)devattr.max_qp_wr,
(size_t)svcrdma_max_requests);
newxprt->sc_sq_depth = RPCRDMA_SQ_DEPTH_MULT * newxprt->sc_max_requests;
@@ -1046,6 +1048,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
" remote_ip : %pI4\n"
" remote_port : %d\n"
" max_sge : %d\n"
+ " max_sge_rd : %d\n"
" sq_depth : %d\n"
" max_requests : %d\n"
" ord : %d\n",
@@ -1059,6 +1062,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
ntohs(((struct sockaddr_in *)&newxprt->sc_cm_id->
route.addr.dst_addr)->sin_port),
newxprt->sc_max_sge,
+ newxprt->sc_max_sge_rd,
newxprt->sc_sq_depth,
newxprt->sc_max_requests,
newxprt->sc_ord);
--
* [PATCH V6 6/9] isert: Rename IO functions to more descriptive names
From: Steve Wise @ 2015-07-24 16:18 UTC (permalink / raw)
To: dledford-H+wXaHxf7aLQT0dZR+AlfA
Cc: infinipath-ral2JQCrhuEAvxtiuMwx3w, sagig-VPRAkNaXOzVWk0Htik3J/w,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roid-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, eli-VPRAkNaXOzVWk0Htik3J/w,
target-devel-u79uwXL29TY76Z2rM5mHXA,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, bfields-uC3wQj2KruNg9hUCZPvPmw
This is in preparation for adding new FRMR-only IO handlers
for devices that support FRMR and not PI.
Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
---
drivers/infiniband/ulp/isert/ib_isert.c | 28 ++++++++++++++--------------
1 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 6af3dd4..dcd3c55 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -48,15 +48,15 @@ static struct workqueue_struct *isert_comp_wq;
static struct workqueue_struct *isert_release_wq;
static void
-isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
+isert_unmap_lkey(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
static int
-isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+isert_map_lkey(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
struct isert_rdma_wr *wr);
static void
-isert_unreg_rdma(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
+isert_unreg_frmr_pi(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
static int
-isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
- struct isert_rdma_wr *wr);
+isert_reg_frmr_pi(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+ struct isert_rdma_wr *wr);
static int
isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd);
static int
@@ -367,12 +367,12 @@ isert_create_device_ib_res(struct isert_device *device)
if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
device->use_fastreg = 1;
- device->reg_rdma_mem = isert_reg_rdma;
- device->unreg_rdma_mem = isert_unreg_rdma;
+ device->reg_rdma_mem = isert_reg_frmr_pi;
+ device->unreg_rdma_mem = isert_unreg_frmr_pi;
} else {
device->use_fastreg = 0;
- device->reg_rdma_mem = isert_map_rdma;
- device->unreg_rdma_mem = isert_unmap_cmd;
+ device->reg_rdma_mem = isert_map_lkey;
+ device->unreg_rdma_mem = isert_unmap_lkey;
}
ret = isert_alloc_comps(device, dev_attr);
@@ -1701,7 +1701,7 @@ isert_unmap_data_buf(struct isert_conn *isert_conn, struct isert_data_buf *data)
static void
-isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
+isert_unmap_lkey(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
{
struct isert_rdma_wr *wr = &isert_cmd->rdma_wr;
@@ -1726,7 +1726,7 @@ isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
}
static void
-isert_unreg_rdma(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
+isert_unreg_frmr_pi(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
{
struct isert_rdma_wr *wr = &isert_cmd->rdma_wr;
@@ -2442,7 +2442,7 @@ isert_build_rdma_wr(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd,
}
static int
-isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+isert_map_lkey(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
struct isert_rdma_wr *wr)
{
struct se_cmd *se_cmd = &cmd->se_cmd;
@@ -2848,8 +2848,8 @@ unmap_prot_cmd:
}
static int
-isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
- struct isert_rdma_wr *wr)
+isert_reg_frmr_pi(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+ struct isert_rdma_wr *wr)
{
struct se_cmd *se_cmd = &cmd->se_cmd;
struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
--
* [PATCH V6 5/9] RDMA/isert: Limit read depth based on the device max_sge_rd capability
2015-07-24 16:18 [PATCH V6 0/9] iSER support for iWARP Steve Wise
From: Steve Wise @ 2015-07-24 16:18 UTC (permalink / raw)
To: dledford
Cc: infinipath, sagig, ogerlitz, roid, linux-rdma, eli, target-devel,
linux-nfs, bfields
Use the device's max_sge_rd capability to compute the target's read sge
depth. Save both the read and write max_sge values in the isert_conn
struct, and use these when creating RDMA_WRITE/READ work requests.
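The per-direction sge selection above boils down to two small computations; a hedged userspace sketch (the enum and helper names are hypothetical stand-ins for the driver's internals) would be:

```c
#include <assert.h>

enum iser_ib_op { ISER_IB_RDMA_WRITE, ISER_IB_RDMA_READ };

/* The read sge depth is the send sge count further capped by the
 * device's max_sge_rd, mirroring min_t(u32, ...) in the patch. */
static unsigned int read_sge_depth(unsigned int dev_max_sge_rd,
                                   unsigned int max_send_sge)
{
        return dev_max_sge_rd < max_send_sge ? dev_max_sge_rd : max_send_sge;
}

/* Pick the sge limit to use when building a work request, based on
 * whether it is an RDMA WRITE or an RDMA READ. */
static unsigned int pick_max_sge(enum iser_ib_op op,
                                 unsigned int max_write_sge,
                                 unsigned int max_read_sge)
{
        return op == ISER_IB_RDMA_WRITE ? max_write_sge : max_read_sge;
}
```

On iWARP devices max_sge_rd can be as low as 1, so the read path may end up with far fewer sges per work request than the write path.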
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---
drivers/infiniband/ulp/isert/ib_isert.c | 24 ++++++++++++++++--------
drivers/infiniband/ulp/isert/ib_isert.h | 3 ++-
2 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 7717009..6af3dd4 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -165,7 +165,9 @@ isert_create_qp(struct isert_conn *isert_conn,
* outgoing control PDU responses.
*/
attr.cap.max_send_sge = max(2, device->dev_attr.max_sge - 2);
- isert_conn->max_sge = attr.cap.max_send_sge;
+ isert_conn->max_write_sge = attr.cap.max_send_sge;
+ isert_conn->max_read_sge = min_t(u32, device->dev_attr.max_sge_rd,
+ attr.cap.max_send_sge);
attr.cap.max_recv_sge = 1;
attr.sq_sig_type = IB_SIGNAL_REQ_WR;
@@ -2392,7 +2394,7 @@ isert_put_text_rsp(struct iscsi_cmd *cmd, struct iscsi_conn *conn)
static int
isert_build_rdma_wr(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd,
struct ib_sge *ib_sge, struct ib_send_wr *send_wr,
- u32 data_left, u32 offset)
+ u32 data_left, u32 offset, u32 max_sge)
{
struct iscsi_cmd *cmd = isert_cmd->iscsi_cmd;
struct scatterlist *sg_start, *tmp_sg;
@@ -2403,7 +2405,7 @@ isert_build_rdma_wr(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd,
sg_off = offset / PAGE_SIZE;
sg_start = &cmd->se_cmd.t_data_sg[sg_off];
- sg_nents = min(cmd->se_cmd.t_data_nents - sg_off, isert_conn->max_sge);
+ sg_nents = min(cmd->se_cmd.t_data_nents - sg_off, max_sge);
page_off = offset % PAGE_SIZE;
send_wr->sg_list = ib_sge;
@@ -2449,8 +2451,9 @@ isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
struct isert_data_buf *data = &wr->data;
struct ib_send_wr *send_wr;
struct ib_sge *ib_sge;
- u32 offset, data_len, data_left, rdma_write_max, va_offset = 0;
+ u32 offset, data_len, data_left, rdma_max_len, va_offset = 0;
int ret = 0, i, ib_sge_cnt;
+ u32 max_sge;
isert_cmd->tx_desc.isert_cmd = isert_cmd;
@@ -2472,7 +2475,12 @@ isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
}
wr->ib_sge = ib_sge;
- wr->send_wr_num = DIV_ROUND_UP(data->nents, isert_conn->max_sge);
+ if (wr->iser_ib_op == ISER_IB_RDMA_WRITE)
+ max_sge = isert_conn->max_write_sge;
+ else
+ max_sge = isert_conn->max_read_sge;
+
+ wr->send_wr_num = DIV_ROUND_UP(data->nents, max_sge);
wr->send_wr = kzalloc(sizeof(struct ib_send_wr) * wr->send_wr_num,
GFP_KERNEL);
if (!wr->send_wr) {
@@ -2482,11 +2490,11 @@ isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
}
wr->isert_cmd = isert_cmd;
- rdma_write_max = isert_conn->max_sge * PAGE_SIZE;
+ rdma_max_len = max_sge * PAGE_SIZE;
for (i = 0; i < wr->send_wr_num; i++) {
send_wr = &isert_cmd->rdma_wr.send_wr[i];
- data_len = min(data_left, rdma_write_max);
+ data_len = min(data_left, rdma_max_len);
send_wr->send_flags = 0;
if (wr->iser_ib_op == ISER_IB_RDMA_WRITE) {
@@ -2508,7 +2516,7 @@ isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
}
ib_sge_cnt = isert_build_rdma_wr(isert_conn, isert_cmd, ib_sge,
- send_wr, data_len, offset);
+ send_wr, data_len, offset, max_sge);
ib_sge += ib_sge_cnt;
offset += data_len;
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index 9ec23a7..29fde27 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -152,7 +152,8 @@ struct isert_conn {
u32 responder_resources;
u32 initiator_depth;
bool pi_support;
- u32 max_sge;
+ u32 max_write_sge;
+ u32 max_read_sge;
char *login_buf;
char *login_req_buf;
char *login_rsp_buf;
* [PATCH V6 7/9] isert: Use the device's max fastreg page list depth
From: Steve Wise @ 2015-07-24 16:18 UTC (permalink / raw)
To: dledford
Cc: infinipath, sagig, ogerlitz, roid, linux-rdma, eli, target-devel,
linux-nfs, bfields
Use the device's max_fast_reg_page_list_len attr to size the SQ
and FRMR pool.
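The SQ-sizing rule introduced in isert_max_send_wrs() scales the work request budget by how many FRMRs the largest IO needs; a minimal sketch of that arithmetic (constants here are illustrative, not the driver's actual values) would be:

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

/* If one FRMR can map the whole sg table, the baseline WR count is
 * enough; otherwise multiply by the number of FRMRs needed per IO. */
static unsigned int max_send_wrs(unsigned int qp_max_req_dtos,
                                 unsigned int sg_tablesize,
                                 unsigned int max_frpl_len)
{
        if (max_frpl_len >= sg_tablesize)
                return qp_max_req_dtos;
        return qp_max_req_dtos * DIV_ROUND_UP(sg_tablesize, max_frpl_len);
}
```

The FRMR pool tag count in the patch follows the same shape: it is doubled when PI is supported, multiplied by the same DIV_ROUND_UP factor when the device's fastreg page list is smaller than the sg table, and then padded with the extra tags.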
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---
drivers/infiniband/ulp/isert/ib_isert.c | 28 ++++++++++++++++++++++++----
drivers/infiniband/ulp/isert/ib_isert.h | 1 +
2 files changed, 25 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index dcd3c55..8ae9208 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -141,6 +141,14 @@ isert_comp_put(struct isert_comp *comp)
mutex_unlock(&device_list_mutex);
}
+static int isert_max_send_wrs(struct isert_conn *isert_conn)
+{
+ if (isert_conn->max_frpl_len >= ISCSI_ISER_SG_TABLESIZE)
+ return ISERT_QP_MAX_REQ_DTOS;
+ return ISERT_QP_MAX_REQ_DTOS *
+ DIV_ROUND_UP(ISCSI_ISER_SG_TABLESIZE, isert_conn->max_frpl_len);
+}
+
static struct ib_qp *
isert_create_qp(struct isert_conn *isert_conn,
struct isert_comp *comp,
@@ -155,8 +163,6 @@ isert_create_qp(struct isert_conn *isert_conn,
attr.qp_context = isert_conn;
attr.send_cq = comp->cq;
attr.recv_cq = comp->cq;
- attr.cap.max_send_wr = ISERT_QP_MAX_REQ_DTOS;
- attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS + 1;
/*
* FIXME: Use devattr.max_sge - 2 for max_send_sge as
* work-around for RDMA_READs with ConnectX-2.
@@ -168,7 +174,13 @@ isert_create_qp(struct isert_conn *isert_conn,
isert_conn->max_write_sge = attr.cap.max_send_sge;
isert_conn->max_read_sge = min_t(u32, device->dev_attr.max_sge_rd,
attr.cap.max_send_sge);
+ isert_conn->max_frpl_len = device->dev_attr.max_fast_reg_page_list_len;
+ isert_info("max_write_sge %d max_read_sge %d max_frpl_len %d\n",
+ isert_conn->max_write_sge, isert_conn->max_read_sge,
+ isert_conn->max_frpl_len);
+ attr.cap.max_send_wr = isert_max_send_wrs(isert_conn);
+ attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS + 1;
attr.cap.max_recv_sge = 1;
attr.sq_sig_type = IB_SIGNAL_REQ_WR;
attr.qp_type = IB_QPT_RC;
@@ -606,12 +618,20 @@ isert_conn_create_fastreg_pool(struct isert_conn *isert_conn)
struct se_session *se_sess = isert_conn->conn->sess->se_sess;
struct se_node_acl *se_nacl = se_sess->se_node_acl;
int i, ret, tag_num;
+
/*
* Setup the number of FRMRs based upon the number of tags
- * available to session in iscsi_target_locate_portal().
+ * available to session in iscsi_target_locate_portal(),
+ * whether pi is supported, and how many FRMRs are needed to
+ * map the largest possible IO given the rdma device limits.
*/
tag_num = max_t(u32, ISCSIT_MIN_TAGS, se_nacl->queue_depth);
- tag_num = (tag_num * 2) + ISCSIT_EXTRA_TAGS;
+ if (isert_conn->pi_support)
+ tag_num *= 2;
+ if (isert_conn->max_frpl_len < ISCSI_ISER_SG_TABLESIZE)
+ tag_num *= DIV_ROUND_UP(ISCSI_ISER_SG_TABLESIZE,
+ isert_conn->max_frpl_len);
+ tag_num += ISCSIT_EXTRA_TAGS;
isert_conn->fr_pool_size = 0;
for (i = 0; i < tag_num; i++) {
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index 29fde27..11cd662 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -154,6 +154,7 @@ struct isert_conn {
bool pi_support;
u32 max_write_sge;
u32 max_read_sge;
+ u32 max_frpl_len;
char *login_buf;
char *login_req_buf;
char *login_rsp_buf;
* [PATCH V6 8/9] isert: Use local_dma_lkey whenever possible.
From: Steve Wise @ 2015-07-24 16:18 UTC (permalink / raw)
To: dledford
Cc: infinipath, sagig, ogerlitz, roid, linux-rdma, eli, target-devel,
linux-nfs, bfields
No need to allocate a dma_mr if the device provides a local_dma_lkey.
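The selection logic the diff implements is a simple either/or; a hedged sketch (the struct here is a made-up stand-in for struct isert_device) would be:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant isert_device fields. */
struct fake_device {
        bool use_fastreg;
        unsigned int device_local_dma_lkey;  /* provided by the HCA */
        unsigned int dma_mr_lkey;            /* from ib_get_dma_mr() */
};

/* With fastreg (which in the patch also implies IB_DEVICE_LOCAL_DMA_LKEY),
 * use the device-provided lkey and skip allocating a dma_mr entirely;
 * otherwise fall back to the lkey of the allocated dma_mr. */
static unsigned int select_local_dma_lkey(const struct fake_device *dev)
{
        return dev->use_fastreg ? dev->device_local_dma_lkey
                                : dev->dma_mr_lkey;
}
```

Caching the chosen lkey in one field is what lets every tx/rx descriptor site in the diff switch from device->mr->lkey to device->local_dma_lkey unconditionally.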
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---
drivers/infiniband/ulp/isert/ib_isert.c | 47 ++++++++++++++++++-------------
drivers/infiniband/ulp/isert/ib_isert.h | 1 +
2 files changed, 28 insertions(+), 20 deletions(-)
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 8ae9208..47bb790 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -249,7 +249,7 @@ isert_alloc_rx_descriptors(struct isert_conn *isert_conn)
rx_sg = &rx_desc->rx_sg;
rx_sg->addr = rx_desc->dma_addr;
rx_sg->length = ISER_RX_PAYLOAD_SIZE;
- rx_sg->lkey = device->mr->lkey;
+ rx_sg->lkey = device->local_dma_lkey;
}
isert_conn->rx_desc_head = 0;
@@ -375,8 +375,9 @@ isert_create_device_ib_res(struct isert_device *device)
if (ret)
return ret;
- /* asign function handlers */
+ /* assign function handlers */
if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
+ dev_attr->device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY &&
dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
device->use_fastreg = 1;
device->reg_rdma_mem = isert_reg_frmr_pi;
@@ -399,12 +400,17 @@ isert_create_device_ib_res(struct isert_device *device)
goto out_cq;
}
- device->mr = ib_get_dma_mr(device->pd, IB_ACCESS_LOCAL_WRITE);
- if (IS_ERR(device->mr)) {
- ret = PTR_ERR(device->mr);
- isert_err("failed to create dma mr, device %p, ret=%d\n",
- device, ret);
- goto out_mr;
+ if (device->use_fastreg)
+ device->local_dma_lkey = device->ib_device->local_dma_lkey;
+ else {
+ device->mr = ib_get_dma_mr(device->pd, IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(device->mr)) {
+ ret = PTR_ERR(device->mr);
+ isert_err("failed to create dma mr, device %p, ret=%d\n",
+ device, ret);
+ goto out_mr;
+ }
+ device->local_dma_lkey = device->mr->lkey;
}
/* Check signature cap */
@@ -425,7 +431,8 @@ isert_free_device_ib_res(struct isert_device *device)
{
isert_info("device %p\n", device);
- ib_dereg_mr(device->mr);
+ if (!device->use_fastreg)
+ ib_dereg_mr(device->mr);
ib_dealloc_pd(device->pd);
isert_free_comps(device);
}
@@ -1108,8 +1115,8 @@ isert_create_send_desc(struct isert_conn *isert_conn,
tx_desc->num_sge = 1;
tx_desc->isert_cmd = isert_cmd;
- if (tx_desc->tx_sg[0].lkey != device->mr->lkey) {
- tx_desc->tx_sg[0].lkey = device->mr->lkey;
+ if (tx_desc->tx_sg[0].lkey != device->local_dma_lkey) {
+ tx_desc->tx_sg[0].lkey = device->local_dma_lkey;
isert_dbg("tx_desc %p lkey mismatch, fixing\n", tx_desc);
}
}
@@ -1132,7 +1139,7 @@ isert_init_tx_hdrs(struct isert_conn *isert_conn,
tx_desc->dma_addr = dma_addr;
tx_desc->tx_sg[0].addr = tx_desc->dma_addr;
tx_desc->tx_sg[0].length = ISER_HEADERS_LEN;
- tx_desc->tx_sg[0].lkey = device->mr->lkey;
+ tx_desc->tx_sg[0].lkey = device->local_dma_lkey;
isert_dbg("Setup tx_sg[0].addr: 0x%llx length: %u lkey: 0x%x\n",
tx_desc->tx_sg[0].addr, tx_desc->tx_sg[0].length,
@@ -1165,7 +1172,7 @@ isert_rdma_post_recvl(struct isert_conn *isert_conn)
memset(&sge, 0, sizeof(struct ib_sge));
sge.addr = isert_conn->login_req_dma;
sge.length = ISER_RX_LOGIN_SIZE;
- sge.lkey = isert_conn->device->mr->lkey;
+ sge.lkey = isert_conn->device->local_dma_lkey;
isert_dbg("Setup sge: addr: %llx length: %d 0x%08x\n",
sge.addr, sge.length, sge.lkey);
@@ -1215,7 +1222,7 @@ isert_put_login_tx(struct iscsi_conn *conn, struct iscsi_login *login,
tx_dsg->addr = isert_conn->login_rsp_dma;
tx_dsg->length = length;
- tx_dsg->lkey = isert_conn->device->mr->lkey;
+ tx_dsg->lkey = isert_conn->device->local_dma_lkey;
tx_desc->num_sge = 2;
}
if (!login->login_failed) {
@@ -1734,6 +1741,7 @@ isert_unmap_lkey(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
if (wr->send_wr) {
isert_dbg("Cmd %p free send_wr\n", isert_cmd);
+ wr->send_wr_num = 0;
kfree(wr->send_wr);
wr->send_wr = NULL;
}
@@ -1967,7 +1975,6 @@ isert_completion_rdma_read(struct iser_tx_desc *tx_desc,
iscsit_stop_dataout_timer(cmd);
device->unreg_rdma_mem(isert_cmd, isert_conn);
cmd->write_data_done = wr->data.len;
- wr->send_wr_num = 0;
isert_dbg("Cmd: %p RDMA_READ comp calling execute_cmd\n", isert_cmd);
spin_lock_bh(&cmd->istate_lock);
@@ -2232,7 +2239,7 @@ isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
isert_cmd->pdu_buf_len = pdu_len;
tx_dsg->addr = isert_cmd->pdu_buf_dma;
tx_dsg->length = pdu_len;
- tx_dsg->lkey = device->mr->lkey;
+ tx_dsg->lkey = device->local_dma_lkey;
isert_cmd->tx_desc.num_sge = 2;
}
@@ -2360,7 +2367,7 @@ isert_put_reject(struct iscsi_cmd *cmd, struct iscsi_conn *conn)
isert_cmd->pdu_buf_len = ISCSI_HDR_LEN;
tx_dsg->addr = isert_cmd->pdu_buf_dma;
tx_dsg->length = ISCSI_HDR_LEN;
- tx_dsg->lkey = device->mr->lkey;
+ tx_dsg->lkey = device->local_dma_lkey;
isert_cmd->tx_desc.num_sge = 2;
isert_init_send_wr(isert_conn, isert_cmd, send_wr);
@@ -2401,7 +2408,7 @@ isert_put_text_rsp(struct iscsi_cmd *cmd, struct iscsi_conn *conn)
isert_cmd->pdu_buf_len = txt_rsp_len;
tx_dsg->addr = isert_cmd->pdu_buf_dma;
tx_dsg->length = txt_rsp_len;
- tx_dsg->lkey = device->mr->lkey;
+ tx_dsg->lkey = device->local_dma_lkey;
isert_cmd->tx_desc.num_sge = 2;
}
isert_init_send_wr(isert_conn, isert_cmd, send_wr);
@@ -2442,7 +2449,7 @@ isert_build_rdma_wr(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd,
ib_sge->addr = ib_sg_dma_address(ib_dev, tmp_sg) + page_off;
ib_sge->length = min_t(u32, data_left,
ib_sg_dma_len(ib_dev, tmp_sg) - page_off);
- ib_sge->lkey = device->mr->lkey;
+ ib_sge->lkey = device->local_dma_lkey;
isert_dbg("RDMA ib_sge: addr: 0x%llx length: %u lkey: %x\n",
ib_sge->addr, ib_sge->length, ib_sge->lkey);
@@ -2622,7 +2629,7 @@ isert_fast_reg_mr(struct isert_conn *isert_conn,
u32 page_off;
if (mem->dma_nents == 1) {
- sge->lkey = device->mr->lkey;
+ sge->lkey = device->local_dma_lkey;
sge->addr = ib_sg_dma_address(ib_dev, &mem->sg[0]);
sge->length = ib_sg_dma_len(ib_dev, &mem->sg[0]);
isert_dbg("sge: addr: 0x%llx length: %u lkey: %x\n",
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index 11cd662..25a0946 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -212,6 +212,7 @@ struct isert_device {
struct ib_device *ib_device;
struct ib_pd *pd;
struct ib_mr *mr;
+ u32 local_dma_lkey;
struct isert_comp *comps;
int comps_used;
struct list_head dev_node;
^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
2015-07-24 16:18 [PATCH V6 0/9] iSER support for iWARP Steve Wise
` (3 preceding siblings ...)
2015-07-24 16:18 ` [PATCH V6 8/9] isert: Use local_dma_lkey whenever possible Steve Wise
@ 2015-07-24 16:19 ` Steve Wise
2015-07-24 16:57 ` Jason Gunthorpe
4 siblings, 1 reply; 45+ messages in thread
From: Steve Wise @ 2015-07-24 16:19 UTC (permalink / raw)
To: dledford
Cc: infinipath, sagig, ogerlitz, roid, linux-rdma, eli, target-devel,
linux-nfs, bfields
Add new register and unregister functions to be used with devices that
support FRMRs, provide a local dma lkey, yet do not support DIF/PI.
isert_reg_frmr() only needs to use FRMRs for RDMA READ since RDMA WRITE
can be handled entirely with the local dma lkey. So for RDMA READ,
it calls isert_reg_read_frmr(). Otherwise it uses the lkey map service
isert_map_lkey() for RDMA WRITEs.
isert_reg_read_frmr() will create a linked list of WR triplets of the
form: INV->FRWR->READ. The number of these triplets depends on
the device's fast reg page list length limit.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---
drivers/infiniband/ulp/isert/ib_isert.c | 224 ++++++++++++++++++++++++++++++-
1 files changed, 218 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 47bb790..34b3151 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -57,6 +57,11 @@ isert_unreg_frmr_pi(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
static int
isert_reg_frmr_pi(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
struct isert_rdma_wr *wr);
+static void
+isert_unreg_frmr(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn);
+static int
+isert_reg_frmr(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+ struct isert_rdma_wr *wr);
static int
isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd);
static int
@@ -368,6 +373,7 @@ static int
isert_create_device_ib_res(struct isert_device *device)
{
struct ib_device_attr *dev_attr;
+ int cap_flags;
int ret;
dev_attr = &device->dev_attr;
@@ -376,12 +382,18 @@ isert_create_device_ib_res(struct isert_device *device)
return ret;
/* assign function handlers */
- if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
- dev_attr->device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY &&
- dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
- device->use_fastreg = 1;
- device->reg_rdma_mem = isert_reg_frmr_pi;
- device->unreg_rdma_mem = isert_unreg_frmr_pi;
+ cap_flags = dev_attr->device_cap_flags;
+ if (cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
+ cap_flags & IB_DEVICE_LOCAL_DMA_LKEY) {
+ if (cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
+ device->use_fastreg = 1;
+ device->reg_rdma_mem = isert_reg_frmr_pi;
+ device->unreg_rdma_mem = isert_unreg_frmr_pi;
+ } else {
+ device->use_fastreg = 1;
+ device->reg_rdma_mem = isert_reg_frmr;
+ device->unreg_rdma_mem = isert_unreg_frmr;
+ }
} else {
device->use_fastreg = 0;
device->reg_rdma_mem = isert_map_lkey;
@@ -1782,6 +1794,50 @@ isert_unreg_frmr_pi(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
}
static void
+isert_unreg_frmr(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn)
+{
+ struct isert_rdma_wr *wr = &isert_cmd->rdma_wr;
+
+ if (wr->data.sg) {
+ isert_dbg("Cmd %p unmap_sg op\n", isert_cmd);
+ isert_unmap_data_buf(isert_conn, &wr->data);
+ }
+
+ if (wr->send_wr) {
+ int i;
+
+ isert_dbg("Cmd %p free send_wr wr_num %d\n", isert_cmd,
+ wr->send_wr_num);
+ if (wr->iser_ib_op == ISER_IB_RDMA_READ) {
+ spin_lock_bh(&isert_conn->pool_lock);
+ for (i = 0; i < wr->send_wr_num; i += 3) {
+ struct ib_send_wr *fr_wr;
+ struct fast_reg_descriptor *fr_desc;
+
+ fr_wr = &wr->send_wr[i + 1];
+ fr_desc = (struct fast_reg_descriptor *)
+ (uintptr_t) fr_wr->wr_id;
+ isert_dbg("Cmd %p free fr_desc %p\n", isert_cmd,
+ fr_desc);
+ list_add_tail(&fr_desc->list,
+ &isert_conn->fr_pool);
+ }
+ spin_unlock_bh(&isert_conn->pool_lock);
+ }
+ isert_dbg("Cmd %p free wr->send_wr\n", isert_cmd);
+ kfree(wr->send_wr);
+ wr->send_wr = NULL;
+ wr->send_wr_num = 0;
+ }
+
+ if (wr->ib_sge) {
+ isert_dbg("Cmd %p free ib_sge\n", isert_cmd);
+ kfree(wr->ib_sge);
+ wr->ib_sge = NULL;
+ }
+}
+
+static void
isert_put_cmd(struct isert_cmd *isert_cmd, bool comp_err)
{
struct iscsi_cmd *cmd = isert_cmd->iscsi_cmd;
@@ -2958,6 +3014,162 @@ unmap_cmd:
return ret;
}
+static void
+isert_build_inv_fr_wrs(struct isert_conn *isert_conn,
+ struct isert_cmd *isert_cmd, struct ib_sge *ib_sge,
+ struct ib_send_wr *inv_wr, u32 data_len, u32 offset)
+{
+ struct iscsi_cmd *cmd = isert_cmd->iscsi_cmd;
+ struct scatterlist *sg_start;
+ struct isert_device *device = isert_conn->device;
+ struct ib_device *ib_dev = device->ib_device;
+ struct ib_fast_reg_page_list *frpl;
+ struct fast_reg_descriptor *fr_desc;
+ struct ib_send_wr *fr_wr;
+ u32 sg_off, page_off;
+ int sg_nents;
+ int frpl_len;
+ unsigned long flags;
+
+ sg_off = offset / PAGE_SIZE;
+ sg_start = &cmd->se_cmd.t_data_sg[sg_off];
+ sg_nents = min(cmd->se_cmd.t_data_nents - sg_off,
+ isert_conn->max_frpl_len);
+ page_off = offset % PAGE_SIZE;
+
+ spin_lock_irqsave(&isert_conn->pool_lock, flags);
+ fr_desc = list_first_entry(&isert_conn->fr_pool,
+ struct fast_reg_descriptor, list);
+ list_del(&fr_desc->list);
+ spin_unlock_irqrestore(&isert_conn->pool_lock, flags);
+ fr_desc->ind &= ~ISERT_DATA_KEY_VALID;
+
+ /* Build the INV WR */
+ isert_inv_rkey(inv_wr, fr_desc->data_mr);
+
+ /* Build the FR WR */
+ frpl = fr_desc->data_frpl;
+ frpl_len = isert_map_fr_pagelist(ib_dev, sg_start, sg_nents,
+ &frpl->page_list[0]);
+ fr_wr = inv_wr + 1;
+ memset(fr_wr, 0, sizeof(*fr_wr));
+ fr_wr->wr_id = (uintptr_t)fr_desc;
+ fr_wr->opcode = IB_WR_FAST_REG_MR;
+ fr_wr->wr.fast_reg.iova_start = frpl->page_list[0] + page_off;
+ fr_wr->wr.fast_reg.page_list = frpl;
+ fr_wr->wr.fast_reg.page_list_len = frpl_len;
+ fr_wr->wr.fast_reg.page_shift = PAGE_SHIFT;
+ fr_wr->wr.fast_reg.length = data_len;
+ fr_wr->wr.fast_reg.rkey = fr_desc->data_mr->rkey;
+ fr_wr->wr.fast_reg.access_flags = IB_ACCESS_REMOTE_WRITE |
+ IB_ACCESS_LOCAL_WRITE;
+
+ ib_sge->addr = frpl->page_list[0] + page_off;
+ ib_sge->length = data_len;
+ ib_sge->lkey = fr_desc->data_mr->rkey;
+
+ /* Link up the wrs: inv->fr->read */
+ inv_wr->next = fr_wr;
+ fr_wr->next = fr_wr + 1;
+}
+
+static int
+isert_reg_read_frmr(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+ struct isert_rdma_wr *wr)
+{
+ struct se_cmd *se_cmd = &cmd->se_cmd;
+ struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
+ struct isert_conn *isert_conn = conn->context;
+ struct isert_data_buf *data = &wr->data;
+ struct ib_send_wr *send_wr = NULL;
+ struct ib_sge *ib_sge;
+ u32 offset, data_len, data_left, rdma_max_len, va_offset = 0;
+ int ret = 0, i;
+ u32 num_frwrs;
+
+ isert_cmd->tx_desc.isert_cmd = isert_cmd;
+
+ offset = cmd->write_data_done;
+ ret = isert_map_data_buf(isert_conn, isert_cmd, se_cmd->t_data_sg,
+ se_cmd->t_data_nents, se_cmd->data_length,
+ offset, wr->iser_ib_op, &wr->data);
+ if (ret)
+ return ret;
+
+ data_left = data->len;
+ offset = data->offset;
+ num_frwrs = DIV_ROUND_UP(data->nents, isert_conn->max_frpl_len);
+
+ ib_sge = kcalloc(num_frwrs, sizeof(struct ib_sge), GFP_KERNEL);
+ if (!ib_sge) {
+ ret = -ENOMEM;
+ goto unmap_cmd;
+ }
+ wr->ib_sge = ib_sge;
+
+ /* always do INV + FR + READ for now */
+ wr->send_wr_num = 3 * num_frwrs;
+ wr->send_wr = kcalloc(wr->send_wr_num, sizeof(struct ib_send_wr),
+ GFP_KERNEL);
+ if (!wr->send_wr) {
+ ret = -ENOMEM;
+ goto unmap_cmd;
+ }
+ isert_info("num_frwrs %d send_wr_num %d\n", num_frwrs, wr->send_wr_num);
+
+ wr->isert_cmd = isert_cmd;
+ rdma_max_len = isert_conn->max_frpl_len * PAGE_SIZE;
+
+ for (i = 0; i < wr->send_wr_num; i += 3) {
+
+ /*
+ * i = INV, i+1 = FR, i+2 = READ
+ */
+ send_wr = &isert_cmd->rdma_wr.send_wr[i+2];
+ data_len = min(data_left, rdma_max_len);
+ isert_build_inv_fr_wrs(isert_conn, isert_cmd, ib_sge,
+ send_wr - 2, data_len, offset);
+ send_wr->opcode = IB_WR_RDMA_READ;
+ send_wr->wr.rdma.remote_addr = isert_cmd->write_va + va_offset;
+ send_wr->wr.rdma.rkey = isert_cmd->write_stag;
+ send_wr->sg_list = ib_sge;
+ send_wr->num_sge = 1;
+ if (i + 3 == wr->send_wr_num) {
+ send_wr->wr_id = (uintptr_t)&isert_cmd->tx_desc;
+ send_wr->send_flags = IB_SEND_SIGNALED;
+ } else {
+ send_wr->next = send_wr + 1;
+ }
+
+ ib_sge++;
+ offset += data_len;
+ va_offset += data_len;
+ data_left -= data_len;
+ }
+ isert_info("send_wr->send_flags 0x%x\n", send_wr->send_flags);
+
+ return 0;
+unmap_cmd:
+ isert_unmap_data_buf(isert_conn, data);
+
+ return ret;
+}
+
+static int
+isert_reg_frmr(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
+ struct isert_rdma_wr *wr)
+{
+
+ /*
+ * Use the local dma lkey for Writes since it only requires local
+ * access flags. For reads, build up inv+fr+read
+ * wr chains as needed.
+ */
+ if (wr->iser_ib_op == ISER_IB_RDMA_WRITE)
+ return isert_map_lkey(conn, cmd, wr);
+ return isert_reg_read_frmr(conn, cmd, wr);
+}
+
static int
isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd)
{
* Re: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
2015-07-24 16:19 ` [PATCH V6 9/9] isert: Support iWARP transports using FRMRs Steve Wise
@ 2015-07-24 16:57 ` Jason Gunthorpe
2015-07-24 18:48 ` Steve Wise
0 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2015-07-24 16:57 UTC (permalink / raw)
To: Steve Wise
Cc: dledford, infinipath, sagig, ogerlitz, roid, linux-rdma, eli,
target-devel, linux-nfs, bfields
On Fri, Jul 24, 2015 at 11:19:05AM -0500, Steve Wise wrote:
> Add new register and unregister functions to be used with devices that
> support FRMRs, provide a local dma lkey, yet do not support DIF/PI.
>
> isert_reg_frmr() only needs to use FRMRs for RDMA READ since RDMA WRITE
> can be handled entirely with the local dma lkey. So for RDMA READ,
> it calls isert_reg_read_frmr(). Otherwise is uses the lkey map service
> isert_map_lkey() for RDMA WRITEs.
>
> isert_reg_read_frmr() will create a linked list of WR triplets of the
> form: INV->FRWR->READ. The number of these triplets is dependent on
> the devices fast reg page list length limit.
That ordering seems strange, surely it should be
FRWR->READ->INV
And use IB_WR_RDMA_READ_WITH_INV if possible?
ACCESS_REMOTE rkey's should not be left open across the FROM_DEVICE
DMA flush.
> /* assign function handlers */
> - if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
> - dev_attr->device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY &&
> - dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
> - device->use_fastreg = 1;
> - device->reg_rdma_mem = isert_reg_frmr_pi;
> - device->unreg_rdma_mem = isert_unreg_frmr_pi;
> + cap_flags = dev_attr->device_cap_flags;
> + if (cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
> + cap_flags & IB_DEVICE_LOCAL_DMA_LKEY) {
> + if (cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
> + device->use_fastreg = 1;
> + device->reg_rdma_mem = isert_reg_frmr_pi;
> + device->unreg_rdma_mem = isert_unreg_frmr_pi;
> + } else {
> + device->use_fastreg = 1;
> + device->reg_rdma_mem = isert_reg_frmr;
> + device->unreg_rdma_mem = isert_unreg_frmr;
> + }
The use of FRWR for RDMA READ should be iWarp specific, IB shouldn't
pay that overhead. I am expecting to see a cap_rdma_read_rkey or
something in here ?
Jason
* RE: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
2015-07-24 16:57 ` Jason Gunthorpe
@ 2015-07-24 18:48 ` Steve Wise
2015-07-24 19:24 ` Jason Gunthorpe
0 siblings, 1 reply; 45+ messages in thread
From: Steve Wise @ 2015-07-24 18:48 UTC (permalink / raw)
To: 'Jason Gunthorpe'
Cc: dledford, infinipath, sagig, ogerlitz, roid, linux-rdma, eli,
target-devel, linux-nfs, bfields
> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com]
> Sent: Friday, July 24, 2015 11:57 AM
> To: Steve Wise
> Cc: dledford@redhat.com; infinipath@intel.com; sagig@mellanox.com; ogerlitz@mellanox.com; roid@mellanox.com; linux-
> rdma@vger.kernel.org; eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; bfields@fieldses.org
> Subject: Re: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
>
> On Fri, Jul 24, 2015 at 11:19:05AM -0500, Steve Wise wrote:
> > Add new register and unregister functions to be used with devices that
> > support FRMRs, provide a local dma lkey, yet do not support DIF/PI.
> >
> > isert_reg_frmr() only needs to use FRMRs for RDMA READ since RDMA WRITE
> > can be handled entirely with the local dma lkey. So for RDMA READ,
> > it calls isert_reg_read_frmr(). Otherwise is uses the lkey map service
> > isert_map_lkey() for RDMA WRITEs.
> >
> > isert_reg_read_frmr() will create a linked list of WR triplets of the
> > form: INV->FRWR->READ. The number of these triplets is dependent on
> > the devices fast reg page list length limit.
>
> That ordering seems strange, surely it should be
>
> FRWR->READ->INV
>
> And use IB_WR_RDMA_READ_WITH_INV if possible?
>
> ACCESS_REMOTE rkey's should not be left open across the FROM_DEVICE
> DMA flush.
>
You're correct. I was thinking to simplify the IO by always invalidating before re-registering. But it does leave the FRMR
registered and exposes a security hole.
I'll have to rework this.
> > /* assign function handlers */
> > - if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
> > - dev_attr->device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY &&
> > - dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
> > - device->use_fastreg = 1;
> > - device->reg_rdma_mem = isert_reg_frmr_pi;
> > - device->unreg_rdma_mem = isert_unreg_frmr_pi;
> > + cap_flags = dev_attr->device_cap_flags;
> > + if (cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS &&
> > + cap_flags & IB_DEVICE_LOCAL_DMA_LKEY) {
> > + if (cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) {
> > + device->use_fastreg = 1;
> > + device->reg_rdma_mem = isert_reg_frmr_pi;
> > + device->unreg_rdma_mem = isert_unreg_frmr_pi;
> > + } else {
> > + device->use_fastreg = 1;
> > + device->reg_rdma_mem = isert_reg_frmr;
> > + device->unreg_rdma_mem = isert_unreg_frmr;
> > + }
>
> The use of FRWR for RDMA READ should be iWarp specific, IB shouldn't
> pay that overhead. I am expecting to see a cap_rdma_read_rkey or
> something in here ?
>
Ok. But cap_rdma_read_rkey() doesn't really describe the requirement. The requirement is rkey + REMOTE_WRITE. So it is more like
rdma_cap_read_requires_remote_write() which is ugly and too long (but descriptive)...
* Re: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
2015-07-24 18:48 ` Steve Wise
@ 2015-07-24 19:24 ` Jason Gunthorpe
2015-07-24 19:57 ` Steve Wise
[not found] ` <20150724192411.GC26225-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 2 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2015-07-24 19:24 UTC (permalink / raw)
To: Steve Wise
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
infinipath-ral2JQCrhuEAvxtiuMwx3w, sagig-VPRAkNaXOzVWk0Htik3J/w,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roid-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, eli-VPRAkNaXOzVWk0Htik3J/w,
target-devel-u79uwXL29TY76Z2rM5mHXA,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, bfields-uC3wQj2KruNg9hUCZPvPmw
On Fri, Jul 24, 2015 at 01:48:09PM -0500, Steve Wise wrote:
> > The use of FRWR for RDMA READ should be iWarp specific, IB shouldn't
> > pay that overhead. I am expecting to see a cap_rdma_read_rkey or
> > something in here ?
>
> Ok. But cap_rdma_read_rkey() doesn't really describe the
> requirement. The requirement is rkey + REMOTE_WRITE. So it is more
> like rdma_cap_read_requires_remote_write() which is ugly and too
> long (but descriptive)...
I don't care much what name you pick, just jam something like this in
the description
If set then RDMA_READ must be performed by mapping the local
buffers through a rkey MR with ACCESS_REMOTE_WRITE enabled.
The rkey of this MR should be passed in as the sg_lists's lkey for
IB_WR_RDMA_READ_WITH_INV.
FRWR should be used to register the buffer in the send queue,
and the read should be issued using IB_WR_RDMA_READ_WITH_INV (xx
can we just implicitly rely on this? Are there any iWarp cards that
support FRWR but not WITH_INV?)
Finally, only a single SGE can be used with RDMA_READ, all scattering
must be accomplished with the MR.
This quite dramatically changes what is an allowed scatter for the
transfer, IB can support arbitrary unaligned S/G lists, while this is
now forced into gapless page aligned elements.
Your patch takes care of this? And only impacts IB?
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* RE: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
2015-07-24 19:24 ` Jason Gunthorpe
@ 2015-07-24 19:57 ` Steve Wise
2015-07-24 22:11 ` Steve Wise
[not found] ` <20150724192411.GC26225-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
1 sibling, 1 reply; 45+ messages in thread
From: Steve Wise @ 2015-07-24 19:57 UTC (permalink / raw)
To: 'Jason Gunthorpe'
Cc: dledford, infinipath, sagig, ogerlitz, roid, linux-rdma, eli,
target-devel, linux-nfs, bfields
> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com]
> Sent: Friday, July 24, 2015 2:24 PM
> To: Steve Wise
> Cc: dledford@redhat.com; infinipath@intel.com; sagig@mellanox.com; ogerlitz@mellanox.com; roid@mellanox.com; linux-
> rdma@vger.kernel.org; eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; bfields@fieldses.org
> Subject: Re: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
>
> On Fri, Jul 24, 2015 at 01:48:09PM -0500, Steve Wise wrote:
> > > The use of FRWR for RDMA READ should be iWarp specific, IB shouldn't
> > > pay that overhead. I am expecting to see a cap_rdma_read_rkey or
> > > something in here ?
> >
> > Ok. But cap_rdma_read_rkey() doesn't really describe the
> > requirement. The requirement is rkey + REMOTE_WRITE. So it is more
> > like rdma_cap_read_requires_remote_write() which is ugly and too
> > long (but descriptive)...
>
> I don't care much what name you pick, just jam something like this in
> the description
>
> If set then RDMA_READ must be performed by mapping the local
> buffers through a rkey MR with ACCESS_REMOTE_WRITE enabled.
> The rkey of this MR should be passed in as the sg_lists's lkey for
> IB_WR_RDMA_READ_WITH_INV.
>
> FRWR should be used to register the buffer in the send queue,
> and the read should be issued using IB_WR_RDMA_READ_WITH_INV (xx
> can we just implicitly rely on this? Are there any iWarp cards that
> support FRWR but not WITH_INV?)
>
No. And iWARP devices must support READ_WITH_INV from my reading of the iWARP verbs spec.
I will add all these comments and make use of READ_WITH_INV.
> Finally, only a single SGE can be used with RDMA_READ, all scattering
> must be accomplished with the MR.
>
> This quite dramatically changes what is an allowed scatter for the
> transfer, IB can support arbitary unaligned S/G lists, while this is
> now forced into gapless page aligned elements.
>
> Your patch takes care of this? And only impacts IB?
Did you mean "only impacts iWARP?"
Yes the patch takes care of sg->fr page list packing. And it uses the existing isert_map_fr_pagelist() to pack the sg into the
fastreg page list. This same routine is used by the IB FRMR/PI reg/unreg routines as well.
None of this impacts isert over IB, assuming the suggested change above to only use the new reg/unreg functions for devices that
need the iWARP style of read. I.e., IB devices will either use the existing FRMR/PI methods if they support FRMR/LOCAL_DMA_LKEY/PI
(mlx5) or they will use the existing DMA_MR methods (all the rest of the IB devices). iWARP devices will get my new reg/unreg
methods...
Steve.
* RE: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
2015-07-24 19:57 ` Steve Wise
@ 2015-07-24 22:11 ` Steve Wise
2015-07-24 22:38 ` Jason Gunthorpe
0 siblings, 1 reply; 45+ messages in thread
From: Steve Wise @ 2015-07-24 22:11 UTC (permalink / raw)
To: 'Jason Gunthorpe'
Cc: dledford, infinipath, sagig, ogerlitz, roid, linux-rdma, eli,
target-devel, linux-nfs, bfields
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Steve Wise
> Sent: Friday, July 24, 2015 2:58 PM
> To: 'Jason Gunthorpe'
> Cc: dledford@redhat.com; infinipath@intel.com; sagig@mellanox.com; ogerlitz@mellanox.com; roid@mellanox.com; linux-
> rdma@vger.kernel.org; eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; bfields@fieldses.org
> Subject: RE: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
>
>
>
> > -----Original Message-----
> > From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com]
> > Sent: Friday, July 24, 2015 2:24 PM
> > To: Steve Wise
> > Cc: dledford@redhat.com; infinipath@intel.com; sagig@mellanox.com; ogerlitz@mellanox.com; roid@mellanox.com; linux-
> > rdma@vger.kernel.org; eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; bfields@fieldses.org
> > Subject: Re: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
> >
> > On Fri, Jul 24, 2015 at 01:48:09PM -0500, Steve Wise wrote:
> > > > The use of FRWR for RDMA READ should be iWarp specific, IB shouldn't
> > > > pay that overhead. I am expecting to see a cap_rdma_read_rkey or
> > > > something in here ?
> > >
> > > Ok. But cap_rdma_read_rkey() doesn't really describe the
> > > requirement. The requirement is rkey + REMOTE_WRITE. So it is more
> > > like rdma_cap_read_requires_remote_write() which is ugly and too
> > > long (but descriptive)...
> >
> > I don't care much what name you pick, just jam something like this in
> > the description
> >
> > If set then RDMA_READ must be performed by mapping the local
> > buffers through a rkey MR with ACCESS_REMOTE_WRITE enabled.
> > The rkey of this MR should be passed in as the sg_lists's lkey for
> > IB_WR_RDMA_READ_WITH_INV.
> >
> > FRWR should be used to register the buffer in the send queue,
> > and the read should be issued using IB_WR_RDMA_READ_WITH_INV (xx
> > can we just implicitly rely on this? Are there any iWarp cards that
> > support FRWR but not WITH_INV?)
> >
>
> No. And iWARP devices must support READ_WITH_INV from my reading of the iWARP verbs spec.
>
> I will add all these comments and make use of READ_WITH_INV.
>
> > Finally, only a single SGE can be used with RDMA_READ, all scattering
> > must be accomplished with the MR.
> >
> > This quite dramatically changes what is an allowed scatter for the
> > transfer, IB can support arbitary unaligned S/G lists, while this is
> > now forced into gapless page aligned elements.
> >
> > Your patch takes care of this? And only impacts IB?
>
> Did you mean "only impacts iWARP?"
>
> Yes the patch takes care of sg->fr page list packing. And it uses the existing isert_map_fr_pagelist() to pack the sg into the
> fastreg page list. This same routine is used by the IB FRMR/PI reg/unreg routines as well.
>
By the way, just to be clear: If you use a FRWR, you by definition only have one SGE entry as the result of the registration. So
regardless of what a device/protocol can do with the destination SGE of an RDMA READ operation, if you use FRWR to register the
destination region, you need only 1 SGE in the RDMA READ WR.
Stevo.
* Re: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
2015-07-24 22:11 ` Steve Wise
@ 2015-07-24 22:38 ` Jason Gunthorpe
0 siblings, 0 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2015-07-24 22:38 UTC (permalink / raw)
To: Steve Wise
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
infinipath-ral2JQCrhuEAvxtiuMwx3w, sagig-VPRAkNaXOzVWk0Htik3J/w,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roid-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, eli-VPRAkNaXOzVWk0Htik3J/w,
target-devel-u79uwXL29TY76Z2rM5mHXA,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, bfields-uC3wQj2KruNg9hUCZPvPmw
On Fri, Jul 24, 2015 at 05:11:04PM -0500, Steve Wise wrote:
> By the way, just to be clear: If you use a FRWR, you by definition
> only have one SGE entry as the result of the registration. So
> regardless of what a device/protocol can do with the destination SGE
> of an RDMA READ operation, if you use FRWR to register the
> destination region, you need only 1 SGE in the RDMA READ WR.
FRWR's have horrible restrictions on the scatterlist. To realize
arbitrary scatterlists requires multiple MRs, and thus multiple SGE
entries.
The warning is for a reader who might be familiar with IB and think
iWarp simply needs to have a MR wrapped around the memory and assume
gaps/etc can be supported through a s/g list, like IB can.
Jason
* Re: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
[not found] ` <20150724192411.GC26225-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-07-26 10:23 ` Sagi Grimberg
2015-07-27 17:07 ` Jason Gunthorpe
0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2015-07-26 10:23 UTC (permalink / raw)
To: Jason Gunthorpe, Steve Wise
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
infinipath-ral2JQCrhuEAvxtiuMwx3w, sagig-VPRAkNaXOzVWk0Htik3J/w,
ogerlitz-VPRAkNaXOzVWk0Htik3J/w, roid-VPRAkNaXOzVWk0Htik3J/w,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, eli-VPRAkNaXOzVWk0Htik3J/w,
target-devel-u79uwXL29TY76Z2rM5mHXA,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, bfields-uC3wQj2KruNg9hUCZPvPmw
On 7/24/2015 10:24 PM, Jason Gunthorpe wrote:
> On Fri, Jul 24, 2015 at 01:48:09PM -0500, Steve Wise wrote:
>>> The use of FRWR for RDMA READ should be iWarp specific, IB shouldn't
>>> pay that overhead. I am expecting to see a cap_rdma_read_rkey or
>>> something in here ?
>>
>> Ok. But cap_rdma_read_rkey() doesn't really describe the
>> requirement. The requirement is rkey + REMOTE_WRITE. So it is more
>> like rdma_cap_read_requires_remote_write() which is ugly and too
>> long (but descriptive)...
>
> I don't care much what name you pick, just jam something like this in
> the description
I think we can just do if (signature || iwarp) use fastreg else
use local_dma_lkey.
>
> If set then RDMA_READ must be performed by mapping the local
> buffers through a rkey MR with ACCESS_REMOTE_WRITE enabled.
> The rkey of this MR should be passed in as the sg_lists's lkey for
> IB_WR_RDMA_READ_WITH_INV.
I think this would be an incremental patch and not as part of iwarp
support.
Question though, wouldn't it be better to do a single RDMA_READ to say
4 registered keys rather than RDMA_READ_WITH_INV for each?
* Re: [PATCH V6 9/9] isert: Support iWARP transports using FRMRs
2015-07-26 10:23 ` Sagi Grimberg
@ 2015-07-27 17:07 ` Jason Gunthorpe
0 siblings, 0 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2015-07-27 17:07 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Steve Wise, dledford, infinipath, sagig, ogerlitz, roid,
linux-rdma, eli, target-devel, linux-nfs, bfields
On Sun, Jul 26, 2015 at 01:23:12PM +0300, Sagi Grimberg wrote:
> Question though, wouldn't it be better to do a single RDMA_READ to say
> 4 registered keys rather than RDMA_READ_WITH_INV for each?
RDMA READ is limited to 1 sg in iWarp.
RDMA_READ_WITH_INV and 1 sg is really the only correct way to drive
iWarp.
Jason