* Re: [PATCH 01/05] svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
[not found] ` <12111560022506-git-send-email-tom@opengridcomputing.com>
@ 2008-05-19 18:20 ` J. Bruce Fields
2008-05-20 1:07 ` Tom Tucker
[not found] ` <1211245672.31725.111.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
[not found] ` <1211156002624-git-send-email-tom@opengridcomputing.com>
1 sibling, 2 replies; 11+ messages in thread
From: J. Bruce Fields @ 2008-05-19 18:20 UTC (permalink / raw)
To: Tom Tucker; +Cc: linux-nfs
On Sun, May 18, 2008 at 07:13:17PM -0500, Tom Tucker wrote:
> A RDMA read-list cannot contain more elements than RPCSVC_MAXPAGES or
> it will overflow the DTO context. Verify this when processing the
> protocol header.
>
> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>
> ---
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index 6b16d8c..06ab484 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -306,6 +306,8 @@ static int rdma_read_xdr(struct svcxprt_rdma *xprt,
> ch_sge_ary = (struct chunk_sge *)tmp_ch_ctxt->sge;
>
> svc_rdma_rcl_chunk_counts(ch, &ch_count, &byte_count);
> + if (ch_count > RPCSVC_MAXPAGES)
> + return -EINVAL;
> sge_count = rdma_rcl_to_sge(xprt, rqstp, hdr_ctxt, rmsgp,
> sge, ch_sge_ary,
> ch_count, byte_count);
If the ch_count is just the total number of bytes to be read into this
request, then don't we also need to know at what offset they're going to
be inserted? (Shouldn't there be some check like ch->rc_position +
ch_count > RPCSVC_MAXPAGES ?)
Also, do we verify somewhere (before calling
svc_rdma_rcl_chunk_counts()) that rc_discrim is set on the last chunk?
--b.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 05/05] svcrdma: Add dma map count and WARN_ON
[not found] ` <12111560023250-git-send-email-tom@opengridcomputing.com>
@ 2008-05-19 19:18 ` J. Bruce Fields
2008-05-19 19:27 ` Tom Tucker
0 siblings, 1 reply; 11+ messages in thread
From: J. Bruce Fields @ 2008-05-19 19:18 UTC (permalink / raw)
To: Tom Tucker; +Cc: linux-nfs
On Sun, May 18, 2008 at 07:13:21PM -0500, Tom Tucker wrote:
> Add a dma map count in order to verify that all DMA mapping resources
> have been freed at unmount.
I'm not sure what you mean by "at unmount".
--b.
>
> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>
> ---
> include/linux/sunrpc/svc_rdma.h | 1 +
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 1 +
> net/sunrpc/xprtrdma/svc_rdma_sendto.c | 3 +++
> net/sunrpc/xprtrdma/svc_rdma_transport.c | 5 +++++
> 4 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h
> index 29bfc9b..0a9c431 100644
> --- a/include/linux/sunrpc/svc_rdma.h
> +++ b/include/linux/sunrpc/svc_rdma.h
> @@ -105,6 +105,7 @@ struct svcxprt_rdma {
>
> struct ib_pd *sc_pd;
>
> + atomic_t sc_dma_used;
> atomic_t sc_ctxt_used;
> struct list_head sc_ctxt_free;
> int sc_ctxt_cnt;
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index 174e888..9b3dff0 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -229,6 +229,7 @@ static void rdma_set_ctxt_sge(struct svcxprt_rdma *xprt,
> ctxt->count = count;
> ctxt->direction = DMA_FROM_DEVICE;
> for (i = 0; i < count; i++) {
> + atomic_inc(&xprt->sc_dma_used);
> ctxt->sge[i].addr = (unsigned long)
> ib_dma_map_single(xprt->sc_cm_id->device,
> (void*)sge[i].addr, sge[i].length,
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> index 85931c4..cdb0732 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> @@ -174,6 +174,7 @@ static int send_write(struct svcxprt_rdma *xprt, struct svc_rqst *rqstp,
> sge_bytes = min((size_t)bc,
> (size_t)(xdr_sge[xdr_sge_no].length-sge_off));
> sge[sge_no].length = sge_bytes;
> + atomic_inc(&xprt->sc_dma_used);
> sge[sge_no].addr =
> ib_dma_map_single(xprt->sc_cm_id->device,
> (void *)
> @@ -399,6 +400,7 @@ static int send_reply(struct svcxprt_rdma *rdma,
> ctxt->count = 1;
>
> /* Prepare the SGE for the RPCRDMA Header */
> + atomic_inc(&rdma->sc_dma_used);
> ctxt->sge[0].addr =
> ib_dma_map_page(rdma->sc_cm_id->device,
> page, 0, PAGE_SIZE, DMA_TO_DEVICE);
> @@ -411,6 +413,7 @@ static int send_reply(struct svcxprt_rdma *rdma,
> sge_bytes = min((size_t)ctxt->sge[sge_no].length,
> (size_t)byte_count);
> byte_count -= sge_bytes;
> + atomic_inc(&rdma->sc_dma_used);
> ctxt->sge[sge_no].addr =
> ib_dma_map_single(rdma->sc_cm_id->device,
> (void *)
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> index 8a50586..68908b5 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> @@ -155,6 +155,7 @@ static void svc_rdma_unmap_dma(struct svc_rdma_op_ctxt *ctxt)
> struct svcxprt_rdma *xprt = ctxt->xprt;
> int i;
> for (i = 0; i < ctxt->count && ctxt->sge[i].length; i++) {
> + atomic_dec(&xprt->sc_dma_used);
> ib_dma_unmap_single(xprt->sc_cm_id->device,
> ctxt->sge[i].addr,
> ctxt->sge[i].length,
> @@ -493,6 +494,7 @@ static struct svcxprt_rdma *rdma_create_xprt(struct svc_serv *serv,
> cma_xprt->sc_max_requests = svcrdma_max_requests;
> cma_xprt->sc_sq_depth = svcrdma_max_requests * RPCRDMA_SQ_DEPTH_MULT;
> atomic_set(&cma_xprt->sc_sq_count, 0);
> + atomic_set(&cma_xprt->sc_ctxt_used, 0);
>
> if (!listener) {
> int reqs = cma_xprt->sc_max_requests;
> @@ -543,6 +545,7 @@ int svc_rdma_post_recv(struct svcxprt_rdma *xprt)
> BUG_ON(sge_no >= xprt->sc_max_sge);
> page = svc_rdma_get_page();
> ctxt->pages[sge_no] = page;
> + atomic_inc(&xprt->sc_dma_used);
> pa = ib_dma_map_page(xprt->sc_cm_id->device,
> page, 0, PAGE_SIZE,
> DMA_FROM_DEVICE);
> @@ -1023,6 +1026,7 @@ static void __svc_rdma_free(struct work_struct *work)
>
> /* Warn if we leaked a resource or under-referenced */
> WARN_ON(atomic_read(&rdma->sc_ctxt_used) != 0);
> + WARN_ON(atomic_read(&rdma->sc_dma_used) != 0);
>
> /* Destroy the QP if present (not a listener) */
> if (rdma->sc_qp && !IS_ERR(rdma->sc_qp))
> @@ -1143,6 +1147,7 @@ void svc_rdma_send_error(struct svcxprt_rdma *xprt, struct rpcrdma_msg *rmsgp,
> length = svc_rdma_xdr_encode_error(xprt, rmsgp, err, va);
>
> /* Prepare SGE for local address */
> + atomic_inc(&xprt->sc_dma_used);
> sge.addr = ib_dma_map_page(xprt->sc_cm_id->device,
> p, 0, PAGE_SIZE, DMA_FROM_DEVICE);
> sge.lkey = xprt->sc_phys_mr->lkey;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 05/05] svcrdma: Add dma map count and WARN_ON
2008-05-19 19:18 ` [PATCH 05/05] svcrdma: Add dma map count and WARN_ON J. Bruce Fields
@ 2008-05-19 19:27 ` Tom Tucker
0 siblings, 0 replies; 11+ messages in thread
From: Tom Tucker @ 2008-05-19 19:27 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On Mon, 2008-05-19 at 15:18 -0400, J. Bruce Fields wrote:
> On Sun, May 18, 2008 at 07:13:21PM -0500, Tom Tucker wrote:
> > Add a dma map count in order to verify that all DMA mapping resources
> > have been freed at unmount.
>
> I'm not sure what you mean by "at unmount".
It would be clearer to say "when the transport is closed". I was testing
using unmount to cause the close and had unmount on-the-brain.
>
> --b.
>
> >
> > Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
> >
> > ---
> > include/linux/sunrpc/svc_rdma.h | 1 +
> > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 1 +
> > net/sunrpc/xprtrdma/svc_rdma_sendto.c | 3 +++
> > net/sunrpc/xprtrdma/svc_rdma_transport.c | 5 +++++
> > 4 files changed, 10 insertions(+), 0 deletions(-)
> >
> > diff --git a/include/linux/sunrpc/svc_rdma.h b/include/linux/sunrpc/svc_rdma.h
> > index 29bfc9b..0a9c431 100644
> > --- a/include/linux/sunrpc/svc_rdma.h
> > +++ b/include/linux/sunrpc/svc_rdma.h
> > @@ -105,6 +105,7 @@ struct svcxprt_rdma {
> >
> > struct ib_pd *sc_pd;
> >
> > + atomic_t sc_dma_used;
> > atomic_t sc_ctxt_used;
> > struct list_head sc_ctxt_free;
> > int sc_ctxt_cnt;
> > diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > index 174e888..9b3dff0 100644
> > --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > @@ -229,6 +229,7 @@ static void rdma_set_ctxt_sge(struct svcxprt_rdma *xprt,
> > ctxt->count = count;
> > ctxt->direction = DMA_FROM_DEVICE;
> > for (i = 0; i < count; i++) {
> > + atomic_inc(&xprt->sc_dma_used);
> > ctxt->sge[i].addr = (unsigned long)
> > ib_dma_map_single(xprt->sc_cm_id->device,
> > (void*)sge[i].addr, sge[i].length,
> > diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> > index 85931c4..cdb0732 100644
> > --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> > +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> > @@ -174,6 +174,7 @@ static int send_write(struct svcxprt_rdma *xprt, struct svc_rqst *rqstp,
> > sge_bytes = min((size_t)bc,
> > (size_t)(xdr_sge[xdr_sge_no].length-sge_off));
> > sge[sge_no].length = sge_bytes;
> > + atomic_inc(&xprt->sc_dma_used);
> > sge[sge_no].addr =
> > ib_dma_map_single(xprt->sc_cm_id->device,
> > (void *)
> > @@ -399,6 +400,7 @@ static int send_reply(struct svcxprt_rdma *rdma,
> > ctxt->count = 1;
> >
> > /* Prepare the SGE for the RPCRDMA Header */
> > + atomic_inc(&rdma->sc_dma_used);
> > ctxt->sge[0].addr =
> > ib_dma_map_page(rdma->sc_cm_id->device,
> > page, 0, PAGE_SIZE, DMA_TO_DEVICE);
> > @@ -411,6 +413,7 @@ static int send_reply(struct svcxprt_rdma *rdma,
> > sge_bytes = min((size_t)ctxt->sge[sge_no].length,
> > (size_t)byte_count);
> > byte_count -= sge_bytes;
> > + atomic_inc(&rdma->sc_dma_used);
> > ctxt->sge[sge_no].addr =
> > ib_dma_map_single(rdma->sc_cm_id->device,
> > (void *)
> > diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> > index 8a50586..68908b5 100644
> > --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
> > +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> > @@ -155,6 +155,7 @@ static void svc_rdma_unmap_dma(struct svc_rdma_op_ctxt *ctxt)
> > struct svcxprt_rdma *xprt = ctxt->xprt;
> > int i;
> > for (i = 0; i < ctxt->count && ctxt->sge[i].length; i++) {
> > + atomic_dec(&xprt->sc_dma_used);
> > ib_dma_unmap_single(xprt->sc_cm_id->device,
> > ctxt->sge[i].addr,
> > ctxt->sge[i].length,
> > @@ -493,6 +494,7 @@ static struct svcxprt_rdma *rdma_create_xprt(struct svc_serv *serv,
> > cma_xprt->sc_max_requests = svcrdma_max_requests;
> > cma_xprt->sc_sq_depth = svcrdma_max_requests * RPCRDMA_SQ_DEPTH_MULT;
> > atomic_set(&cma_xprt->sc_sq_count, 0);
> > + atomic_set(&cma_xprt->sc_ctxt_used, 0);
> >
> > if (!listener) {
> > int reqs = cma_xprt->sc_max_requests;
> > @@ -543,6 +545,7 @@ int svc_rdma_post_recv(struct svcxprt_rdma *xprt)
> > BUG_ON(sge_no >= xprt->sc_max_sge);
> > page = svc_rdma_get_page();
> > ctxt->pages[sge_no] = page;
> > + atomic_inc(&xprt->sc_dma_used);
> > pa = ib_dma_map_page(xprt->sc_cm_id->device,
> > page, 0, PAGE_SIZE,
> > DMA_FROM_DEVICE);
> > @@ -1023,6 +1026,7 @@ static void __svc_rdma_free(struct work_struct *work)
> >
> > /* Warn if we leaked a resource or under-referenced */
> > WARN_ON(atomic_read(&rdma->sc_ctxt_used) != 0);
> > + WARN_ON(atomic_read(&rdma->sc_dma_used) != 0);
> >
> > /* Destroy the QP if present (not a listener) */
> > if (rdma->sc_qp && !IS_ERR(rdma->sc_qp))
> > @@ -1143,6 +1147,7 @@ void svc_rdma_send_error(struct svcxprt_rdma *xprt, struct rpcrdma_msg *rmsgp,
> > length = svc_rdma_xdr_encode_error(xprt, rmsgp, err, va);
> >
> > /* Prepare SGE for local address */
> > + atomic_inc(&xprt->sc_dma_used);
> > sge.addr = ib_dma_map_page(xprt->sc_cm_id->device,
> > p, 0, PAGE_SIZE, DMA_FROM_DEVICE);
> > sge.lkey = xprt->sc_phys_mr->lkey;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 01/05] svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
2008-05-19 18:20 ` [PATCH 01/05] svcrdma: Verify read-list fits within RPCSVC_MAXPAGES J. Bruce Fields
@ 2008-05-20 1:07 ` Tom Tucker
[not found] ` <1211245672.31725.111.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
1 sibling, 0 replies; 11+ messages in thread
From: Tom Tucker @ 2008-05-20 1:07 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On Mon, 2008-05-19 at 14:20 -0400, J. Bruce Fields wrote:
> On Sun, May 18, 2008 at 07:13:17PM -0500, Tom Tucker wrote:
> > A RDMA read-list cannot contain more elements than RPCSVC_MAXPAGES or
> > it will overflow the DTO context. Verify this when processing the
> > protocol header.
> >
> > Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
> >
> > ---
> > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 2 ++
> > 1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > index 6b16d8c..06ab484 100644
> > --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > @@ -306,6 +306,8 @@ static int rdma_read_xdr(struct svcxprt_rdma *xprt,
> > ch_sge_ary = (struct chunk_sge *)tmp_ch_ctxt->sge;
> >
> > svc_rdma_rcl_chunk_counts(ch, &ch_count, &byte_count);
> > + if (ch_count > RPCSVC_MAXPAGES)
> > + return -EINVAL;
> > sge_count = rdma_rcl_to_sge(xprt, rqstp, hdr_ctxt, rmsgp,
> > sge, ch_sge_ary,
> > ch_count, byte_count);
>
> If the ch_count is just the total number of bytes to be read into this
> request, then don't we also need to know at what offset they're going to
> be inserted? (Shouldn't there be some check like ch->rc_position +
> ch_count > RPCSVC_MAXPAGES ?)
>
The ch_count is the number of RPCRDMA chunk elements in the read-list.
It's not a byte count, but a scatter-gather-list length.
I think the local read-list buffer limits should be clamped by
svc_rdma_rcl_chunk_counts, however, see below...
> Also, do we verify somewhere (before calling
> svc_rdma_rcl_chunk_counts()) that rc_discrim is set on the last chunk?
>
No we don't and a Byzantine client could crash us. The computed
byte_count should also be clamped here. I'll add this to the list --
nice catch.
This kind of check along with a bunch of others should go in
svc_rdma_xdr_decode_req. I have these things planned for the 2.6.27
time-frame (along with Fast NSMR support).
Do you think it's more urgent?
Tom
> --b.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 01/05] svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
[not found] ` <1211245672.31725.111.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
@ 2008-05-20 13:27 ` Talpey, Thomas
[not found] ` <RTPCLUEXC1-PRDh133t00000127-rtwIt2gI0FxT+ZUat5FNkAK/GNPrWCqfQQ4Iyu8u01E@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Talpey, Thomas @ 2008-05-20 13:27 UTC (permalink / raw)
To: Tom Tucker; +Cc: J. Bruce Fields, linux-nfs
At 09:07 PM 5/19/2008, Tom Tucker wrote:
>No we don't and a Byzantine client could crash us.
That can be arranged... :-)
>This kind of check along with a bunch of others should go in
>svc_rdma_xdr_decode_req. I have these things planned for the 2.6.27
>time-frame (along with Fast NSMR support).
>
>Do you think it's more urgent?
MHO is that it's important but not urgent, and it should be part of a later
change. At Connectathon, no clients were sending any problematic requests,
so I think it's unlikely this will affect us in the wild, for now.
Tom.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 01/05] svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
[not found] ` <RTPCLUEXC1-PRDh133t00000127-rtwIt2gI0FxT+ZUat5FNkAK/GNPrWCqfQQ4Iyu8u01E@public.gmane.org>
@ 2008-05-20 13:56 ` J. Bruce Fields
2008-05-20 14:14 ` Talpey, Thomas
0 siblings, 1 reply; 11+ messages in thread
From: J. Bruce Fields @ 2008-05-20 13:56 UTC (permalink / raw)
To: Talpey, Thomas; +Cc: Tom Tucker, linux-nfs
On Tue, May 20, 2008 at 09:27:13AM -0400, Talpey, Thomas wrote:
> At 09:07 PM 5/19/2008, Tom Tucker wrote:
> >No we don't and a Byzantine client could crash us.
>
> That can be arranged... :-)
>
> >This kind of check along with a bunch of others should go in
> >svc_rdma_xdr_decode_req. I have these things planned for the 2.6.27
> >time-frame (along with Fast NSMR support).
> >
> >Do you think it's more urgent?
>
> MHO is that it's important but not urgent, and it should be part of a later
> change. At Connectathon, no clients were sending any problematic requests,
> so I think it's unlikely this will affect us in the wild, for now.
Somewhere in the documentation, a really clear warning about the
security assumptions would be useful. It could also help if the howto
(on the web and in Documentation/filesystems/nfs-rdma.txt) included any
instructions on necessary firewalling, etc.
By the way, the Kconfig help text for SUNRPC_XPRT_RDMA looks like it
needs an update to mention the server?
--b.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 01/05] svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
2008-05-20 13:56 ` J. Bruce Fields
@ 2008-05-20 14:14 ` Talpey, Thomas
0 siblings, 0 replies; 11+ messages in thread
From: Talpey, Thomas @ 2008-05-20 14:14 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Talpey, Thomas, Tom Tucker, linux-nfs
At 09:56 AM 5/20/2008, J. Bruce Fields wrote:
>> MHO is that it's important but not urgent, and it should be part of a later
>> change. At Connectathon, no clients were sending any problematic requests,
>> so I think it's unlikely this will affect us in the wild, for now.
>
>Somewhere in the documentation, a really clear warning about the
>security assumptions would be useful. It could also help if the howto
>(on the web and in Documentation/filesystems/nfs-rdma.txt) included any
>instructions on necessary firewalling, etc.
Agreed. The kernel (/proc) parameters are part of this, and it's time to
spell them all out as well.
The protocol hardening we're talking about above isn't a security issue,
of course. It's just basic and part of the implementation. The client, btw,
has some fairly strict checking.
>By the way, the Kconfig help text for SUNRPC_XPRT_RDMA looks like it
>needs an update to mention the server?
You're right - it only mentions the client. I thought we added that text when
we simplified/collapsed the config.
Tom.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 02/05] svcrdma: Refactor RDMA_WRITE dma mapping logic
[not found] ` <1211156002624-git-send-email-tom@opengridcomputing.com>
[not found] ` <12111560022695-git-send-email-tom@opengridcomputing.com>
@ 2008-05-21 0:46 ` J. Bruce Fields
2008-05-21 2:52 ` J. Bruce Fields
2008-05-25 19:05 ` J. Bruce Fields
1 sibling, 2 replies; 11+ messages in thread
From: J. Bruce Fields @ 2008-05-21 0:46 UTC (permalink / raw)
To: Tom Tucker; +Cc: linux-nfs
On Sun, May 18, 2008 at 07:13:18PM -0500, Tom Tucker wrote:
> In order to make the dma mapping code easier to understand and reduce=
the
> number of I/O contexts required, move the DMA for RDMA_WRITE WR to th=
e
> code that prepares the WR SGE.
This one makes my (32-bit) compiler whine:
CC net/sunrpc/xprtrdma/svc_rdma_sendto.o
net/sunrpc/xprtrdma/svc_rdma_sendto.c: In function =E2=80=98send_write=E2=
=80=99:
net/sunrpc/xprtrdma/svc_rdma_sendto.c:180: warning: passing argument 2 =
of =E2=80=98ib_dma_map_single=E2=80=99 makes pointer from integer witho=
ut a cast
net/sunrpc/xprtrdma/svc_rdma_sendto.c: In function =E2=80=98send_reply=E2=
=80=99:
net/sunrpc/xprtrdma/svc_rdma_sendto.c:416: warning: passing argument 2 =
of =E2=80=98ib_dma_map_single=E2=80=99 makes pointer from integer witho=
ut a cast
And the types do seem weird:
>=20
> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>=20
> ---
> net/sunrpc/xprtrdma/svc_rdma_sendto.c | 44 +++++++++++++++-----=
----------
> net/sunrpc/xprtrdma/svc_rdma_transport.c | 5 +++-
> 2 files changed, 26 insertions(+), 23 deletions(-)
>=20
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtr=
dma/svc_rdma_sendto.c
> index fb82b1b..85931c4 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> @@ -84,10 +84,7 @@ static struct ib_sge *xdr_to_sge(struct svcxprt_rd=
ma *xprt,
> sge_no =3D 1;
> =20
> /* Head SGE */
> - sge[sge_no].addr =3D ib_dma_map_single(xprt->sc_cm_id->device,
> - xdr->head[0].iov_base,
> - xdr->head[0].iov_len,
> - DMA_TO_DEVICE);
> + sge[sge_no].addr =3D (unsigned long)xdr->head[0].iov_base;
So before we called a function that took a void *, returned a u64; now
we're storing a void* directly into a u64.
> sge_bytes =3D min_t(u32, byte_count, xdr->head[0].iov_len);
> byte_count -=3D sge_bytes;
> sge[sge_no].length =3D sge_bytes;
> @@ -99,11 +96,9 @@ static struct ib_sge *xdr_to_sge(struct svcxprt_rd=
ma *xprt,
> page_bytes =3D xdr->page_len;
> page_off =3D xdr->page_base;
> while (byte_count && page_bytes) {
> + sge[sge_no].addr =3D (unsigned long)
> + page_address(xdr->pages[page_no]) + page_off;
> sge_bytes =3D min_t(u32, byte_count, (PAGE_SIZE-page_off));
> - sge[sge_no].addr =3D
> - ib_dma_map_page(xprt->sc_cm_id->device,
> - xdr->pages[page_no], page_off,
> - sge_bytes, DMA_TO_DEVICE);
> sge_bytes =3D min(sge_bytes, page_bytes);
> byte_count -=3D sge_bytes;
> page_bytes -=3D sge_bytes;
> @@ -117,11 +112,7 @@ static struct ib_sge *xdr_to_sge(struct svcxprt_=
rdma *xprt,
> =20
> /* Tail SGE */
> if (byte_count && xdr->tail[0].iov_len) {
> - sge[sge_no].addr =3D
> - ib_dma_map_single(xprt->sc_cm_id->device,
> - xdr->tail[0].iov_base,
> - xdr->tail[0].iov_len,
> - DMA_TO_DEVICE);
> + sge[sge_no].addr =3D (unsigned long)xdr->tail[0].iov_base;
> sge_bytes =3D min_t(u32, byte_count, xdr->tail[0].iov_len);
> byte_count -=3D sge_bytes;
> sge[sge_no].length =3D sge_bytes;
> @@ -145,7 +136,6 @@ static int send_write(struct svcxprt_rdma *xprt, =
struct svc_rqst *rqstp,
> u32 xdr_off, int write_len,
> struct ib_sge *xdr_sge, int sge_count)
> {
> - struct svc_rdma_op_ctxt *tmp_sge_ctxt;
> struct ib_send_wr write_wr;
> struct ib_sge *sge;
> int xdr_sge_no;
> @@ -163,9 +153,8 @@ static int send_write(struct svcxprt_rdma *xprt, =
struct svc_rqst *rqstp,
> write_len, xdr_sge, sge_count);
> =20
> ctxt =3D svc_rdma_get_context(xprt);
> - ctxt->count =3D 0;
> - tmp_sge_ctxt =3D svc_rdma_get_context(xprt);
> - sge =3D tmp_sge_ctxt->sge;
> + ctxt->direction =3D DMA_TO_DEVICE;
> + sge =3D ctxt->sge;
> =20
> /* Find the SGE associated with xdr_off */
> for (bc =3D xdr_off, xdr_sge_no =3D 1; bc && xdr_sge_no < sge_count=
;
> @@ -181,14 +170,20 @@ static int send_write(struct svcxprt_rdma *xprt=
, struct svc_rqst *rqstp,
> =20
> /* Copy the remaining SGE */
> while (bc !=3D 0 && xdr_sge_no < sge_count) {
> - sge[sge_no].addr =3D xdr_sge[xdr_sge_no].addr + sge_off;
> sge[sge_no].lkey =3D xdr_sge[xdr_sge_no].lkey;
> sge_bytes =3D min((size_t)bc,
> (size_t)(xdr_sge[xdr_sge_no].length-sge_off));
> sge[sge_no].length =3D sge_bytes;
> -
> + sge[sge_no].addr =3D
> + ib_dma_map_single(xprt->sc_cm_id->device,
> + (void *)
> + xdr_sge[xdr_sge_no].addr + sge_off,
> + sge_bytes, DMA_TO_DEVICE);
> + if (dma_mapping_error(sge[sge_no].addr))
> + return -EINVAL;
And then here we're casting the u64 back to a void *. Also, we're
adding sge_off to the input, instead of to the result. Is it true that
that
ib_dma_map_single(., x + sge_off, ., .)
and
ib_dma_map_single(., x, ., .) + sge_off
always have the same result?
--b.
> sge_off =3D 0;
> sge_no++;
> + ctxt->count++;
> xdr_sge_no++;
> bc -=3D sge_bytes;
> }
> @@ -210,11 +205,10 @@ static int send_write(struct svcxprt_rdma *xprt=
, struct svc_rqst *rqstp,
> /* Post It */
> atomic_inc(&rdma_stat_write);
> if (svc_rdma_send(xprt, &write_wr)) {
> - svc_rdma_put_context(ctxt, 1);
> + svc_rdma_put_context(ctxt, 0);
> /* Fatal error, close transport */
> ret =3D -EIO;
> }
> - svc_rdma_put_context(tmp_sge_ctxt, 0);
> return ret;
> }
> =20
> @@ -417,6 +411,11 @@ static int send_reply(struct svcxprt_rdma *rdma,
> sge_bytes =3D min((size_t)ctxt->sge[sge_no].length,
> (size_t)byte_count);
> byte_count -=3D sge_bytes;
> + ctxt->sge[sge_no].addr =3D
> + ib_dma_map_single(rdma->sc_cm_id->device,
> + (void *)
> + ctxt->sge[sge_no].addr,
> + sge_bytes, DMA_TO_DEVICE);
> }
> BUG_ON(byte_count !=3D 0);
> =20
> @@ -428,8 +427,9 @@ static int send_reply(struct svcxprt_rdma *rdma,
> ctxt->pages[page_no+1] =3D rqstp->rq_respages[page_no];
> ctxt->count++;
> rqstp->rq_respages[page_no] =3D NULL;
> + if (page_no+1 >=3D sge_no)
> + ctxt->sge[page_no+1].length =3D 0;
> }
> -
> BUG_ON(sge_no > rdma->sc_max_sge);
> memset(&send_wr, 0, sizeof send_wr);
> ctxt->wr_op =3D IB_WR_SEND;
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xp=
rtrdma/svc_rdma_transport.c
> index e132509..9e9e244 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> @@ -361,10 +361,13 @@ static void sq_cq_reap(struct svcxprt_rdma *xpr=
t)
> =20
> switch (ctxt->wr_op) {
> case IB_WR_SEND:
> - case IB_WR_RDMA_WRITE:
> svc_rdma_put_context(ctxt, 1);
> break;
> =20
> + case IB_WR_RDMA_WRITE:
> + svc_rdma_put_context(ctxt, 0);
> + break;
> +
> case IB_WR_RDMA_READ:
> if (test_bit(RDMACTXT_F_LAST_CTXT, &ctxt->flags)) {
> struct svc_rdma_op_ctxt *read_hdr =3D ctxt->read_hdr;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 02/05] svcrdma: Refactor RDMA_WRITE dma mapping logic
2008-05-21 0:46 ` [PATCH 02/05] svcrdma: Refactor RDMA_WRITE dma mapping logic J. Bruce Fields
@ 2008-05-21 2:52 ` J. Bruce Fields
2008-05-21 10:33 ` Tom Tucker
2008-05-25 19:05 ` J. Bruce Fields
1 sibling, 1 reply; 11+ messages in thread
From: J. Bruce Fields @ 2008-05-21 2:52 UTC (permalink / raw)
To: Tom Tucker; +Cc: linux-nfs
On Tue, May 20, 2008 at 08:46:08PM -0400, J. Bruce Fields wrote:
> On Sun, May 18, 2008 at 07:13:18PM -0500, Tom Tucker wrote:
> > In order to make the dma mapping code easier to understand and redu=
ce the
> > number of I/O contexts required, move the DMA for RDMA_WRITE WR to =
the
> > code that prepares the WR SGE.
>=20
> This one makes my (32-bit) compiler whine:
>=20
> CC net/sunrpc/xprtrdma/svc_rdma_sendto.o
> net/sunrpc/xprtrdma/svc_rdma_sendto.c: In function =E2=80=98send_writ=
e=E2=80=99:
> net/sunrpc/xprtrdma/svc_rdma_sendto.c:180: warning: passing argument =
2 of =E2=80=98ib_dma_map_single=E2=80=99 makes pointer from integer wit=
hout a cast
> net/sunrpc/xprtrdma/svc_rdma_sendto.c: In function =E2=80=98send_repl=
y=E2=80=99:
> net/sunrpc/xprtrdma/svc_rdma_sendto.c:416: warning: passing argument =
2 of =E2=80=98ib_dma_map_single=E2=80=99 makes pointer from integer wit=
hout a cast
Erm, sorry, that should have been:
unrpc/xprtrdma/svc_rdma_sendto.c: In function =E2=80=98send_write=E2=80=
=99:
net/sunrpc/xprtrdma/svc_rdma_sendto.c:180: warning: cast to pointer fro=
m integer of different size
net/sunrpc/xprtrdma/svc_rdma_sendto.c: In function =E2=80=98send_reply=E2=
=80=99:
net/sunrpc/xprtrdma/svc_rdma_sendto.c:417: warning: cast to pointer fro=
m integer of different size
--b.
> And the types do seem weird:
>=20
> >=20
> > Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
> >=20
> > ---
> > net/sunrpc/xprtrdma/svc_rdma_sendto.c | 44 +++++++++++++++---=
------------
> > net/sunrpc/xprtrdma/svc_rdma_transport.c | 5 +++-
> > 2 files changed, 26 insertions(+), 23 deletions(-)
> >=20
> > diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xpr=
trdma/svc_rdma_sendto.c
> > index fb82b1b..85931c4 100644
> > --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> > +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
> > @@ -84,10 +84,7 @@ static struct ib_sge *xdr_to_sge(struct svcxprt_=
rdma *xprt,
> > sge_no =3D 1;
> > =20
> > /* Head SGE */
> > - sge[sge_no].addr =3D ib_dma_map_single(xprt->sc_cm_id->device,
> > - xdr->head[0].iov_base,
> > - xdr->head[0].iov_len,
> > - DMA_TO_DEVICE);
> > + sge[sge_no].addr =3D (unsigned long)xdr->head[0].iov_base;
>=20
> So before we called a function that took a void *, returned a u64; no=
w
> we're storing a void* directly into a u64.
>=20
> > sge_bytes =3D min_t(u32, byte_count, xdr->head[0].iov_len);
> > byte_count -=3D sge_bytes;
> > sge[sge_no].length =3D sge_bytes;
> > @@ -99,11 +96,9 @@ static struct ib_sge *xdr_to_sge(struct svcxprt_=
rdma *xprt,
> > page_bytes =3D xdr->page_len;
> > page_off =3D xdr->page_base;
> > while (byte_count && page_bytes) {
> > + sge[sge_no].addr =3D (unsigned long)
> > + page_address(xdr->pages[page_no]) + page_off;
> > sge_bytes =3D min_t(u32, byte_count, (PAGE_SIZE-page_off));
> > - sge[sge_no].addr =3D
> > - ib_dma_map_page(xprt->sc_cm_id->device,
> > - xdr->pages[page_no], page_off,
> > - sge_bytes, DMA_TO_DEVICE);
> > sge_bytes =3D min(sge_bytes, page_bytes);
> > byte_count -=3D sge_bytes;
> > page_bytes -=3D sge_bytes;
> > @@ -117,11 +112,7 @@ static struct ib_sge *xdr_to_sge(struct svcxpr=
t_rdma *xprt,
> > =20
> > /* Tail SGE */
> > if (byte_count && xdr->tail[0].iov_len) {
> > - sge[sge_no].addr =3D
> > - ib_dma_map_single(xprt->sc_cm_id->device,
> > - xdr->tail[0].iov_base,
> > - xdr->tail[0].iov_len,
> > - DMA_TO_DEVICE);
> > + sge[sge_no].addr =3D (unsigned long)xdr->tail[0].iov_base;
> > sge_bytes =3D min_t(u32, byte_count, xdr->tail[0].iov_len);
> > byte_count -=3D sge_bytes;
> > sge[sge_no].length =3D sge_bytes;
> > @@ -145,7 +136,6 @@ static int send_write(struct svcxprt_rdma *xprt=
, struct svc_rqst *rqstp,
> > u32 xdr_off, int write_len,
> > struct ib_sge *xdr_sge, int sge_count)
> > {
> > - struct svc_rdma_op_ctxt *tmp_sge_ctxt;
> > struct ib_send_wr write_wr;
> > struct ib_sge *sge;
> > int xdr_sge_no;
> > @@ -163,9 +153,8 @@ static int send_write(struct svcxprt_rdma *xprt=
, struct svc_rqst *rqstp,
> > write_len, xdr_sge, sge_count);
> > =20
> > ctxt =3D svc_rdma_get_context(xprt);
> > - ctxt->count =3D 0;
> > - tmp_sge_ctxt =3D svc_rdma_get_context(xprt);
> > - sge =3D tmp_sge_ctxt->sge;
> > + ctxt->direction =3D DMA_TO_DEVICE;
> > + sge =3D ctxt->sge;
> > =20
> > /* Find the SGE associated with xdr_off */
> > for (bc =3D xdr_off, xdr_sge_no =3D 1; bc && xdr_sge_no < sge_cou=
nt;
> > @@ -181,14 +170,20 @@ static int send_write(struct svcxprt_rdma *xp=
rt, struct svc_rqst *rqstp,
> > =20
> > /* Copy the remaining SGE */
> > while (bc !=3D 0 && xdr_sge_no < sge_count) {
> > - sge[sge_no].addr =3D xdr_sge[xdr_sge_no].addr + sge_off;
> > sge[sge_no].lkey =3D xdr_sge[xdr_sge_no].lkey;
> > sge_bytes =3D min((size_t)bc,
> > (size_t)(xdr_sge[xdr_sge_no].length-sge_off));
> > sge[sge_no].length =3D sge_bytes;
> > -
> > + sge[sge_no].addr =3D
> > + ib_dma_map_single(xprt->sc_cm_id->device,
> > + (void *)
> > + xdr_sge[xdr_sge_no].addr + sge_off,
> > + sge_bytes, DMA_TO_DEVICE);
> > + if (dma_mapping_error(sge[sge_no].addr))
> > + return -EINVAL;
>=20
> And then here we're casting the u64 back to a void *. Also, we're
> adding sge_off to the input, instead of to the result. Is it true th=
at
> that
>=20
> ib_dma_map_single(., x + sge_off, ., .)
>=20
> and
>=20
> ib_dma_map_single(., x, ., .) + sge_off
>=20
> always have the same result?
>=20
> --b.
>=20
> > sge_off =3D 0;
> > sge_no++;
> > + ctxt->count++;
> > xdr_sge_no++;
> > bc -=3D sge_bytes;
> > }
> > @@ -210,11 +205,10 @@ static int send_write(struct svcxprt_rdma *xp=
rt, struct svc_rqst *rqstp,
> > /* Post It */
> > atomic_inc(&rdma_stat_write);
> > if (svc_rdma_send(xprt, &write_wr)) {
> > - svc_rdma_put_context(ctxt, 1);
> > + svc_rdma_put_context(ctxt, 0);
> > /* Fatal error, close transport */
> > ret =3D -EIO;
> > }
> > - svc_rdma_put_context(tmp_sge_ctxt, 0);
> > return ret;
> > }
> > =20
> > @@ -417,6 +411,11 @@ static int send_reply(struct svcxprt_rdma *rdm=
a,
> > sge_bytes =3D min((size_t)ctxt->sge[sge_no].length,
> > (size_t)byte_count);
> > byte_count -=3D sge_bytes;
> > + ctxt->sge[sge_no].addr =3D
> > + ib_dma_map_single(rdma->sc_cm_id->device,
> > + (void *)
> > + ctxt->sge[sge_no].addr,
> > + sge_bytes, DMA_TO_DEVICE);
> > }
> > BUG_ON(byte_count !=3D 0);
> > =20
> > @@ -428,8 +427,9 @@ static int send_reply(struct svcxprt_rdma *rdma=
,
> > ctxt->pages[page_no+1] =3D rqstp->rq_respages[page_no];
> > ctxt->count++;
> > rqstp->rq_respages[page_no] =3D NULL;
> > + if (page_no+1 >=3D sge_no)
> > + ctxt->sge[page_no+1].length =3D 0;
> > }
> > -
> > BUG_ON(sge_no > rdma->sc_max_sge);
> > memset(&send_wr, 0, sizeof send_wr);
> > ctxt->wr_op =3D IB_WR_SEND;
> > diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/=
xprtrdma/svc_rdma_transport.c
> > index e132509..9e9e244 100644
> > --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
> > +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> > @@ -361,10 +361,13 @@ static void sq_cq_reap(struct svcxprt_rdma *x=
prt)
> > =20
> > switch (ctxt->wr_op) {
> > case IB_WR_SEND:
> > - case IB_WR_RDMA_WRITE:
> > svc_rdma_put_context(ctxt, 1);
> > break;
> > =20
> > + case IB_WR_RDMA_WRITE:
> > + svc_rdma_put_context(ctxt, 0);
> > + break;
> > +
> > case IB_WR_RDMA_READ:
> > if (test_bit(RDMACTXT_F_LAST_CTXT, &ctxt->flags)) {
> > struct svc_rdma_op_ctxt *read_hdr =3D ctxt->read_hdr;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" =
in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 02/05] svcrdma: Refactor RDMA_WRITE dma mapping logic
2008-05-21 2:52 ` J. Bruce Fields
@ 2008-05-21 10:33 ` Tom Tucker
0 siblings, 0 replies; 11+ messages in thread
From: Tom Tucker @ 2008-05-21 10:33 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: linux-nfs
On 5/20/08 9:52 PM, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Tue, May 20, 2008 at 08:46:08PM -0400, J. Bruce Fields wrote:
>> On Sun, May 18, 2008 at 07:13:18PM -0500, Tom Tucker wrote:
>>> In order to make the dma mapping code easier to understand and redu=
ce the
>>> number of I/O contexts required, move the DMA for RDMA_WRITE WR to =
the
>>> code that prepares the WR SGE.
>>=20
>> This one makes my (32-bit) compiler whine:
>>=20
>> CC net/sunrpc/xprtrdma/svc_rdma_sendto.o
>> net/sunrpc/xprtrdma/svc_rdma_sendto.c: In function =8Csend_write=B9:
>> net/sunrpc/xprtrdma/svc_rdma_sendto.c:180: warning: passing argument=
2 of
>> =8Cib_dma_map_single=B9 makes pointer from integer without a cast
>> net/sunrpc/xprtrdma/svc_rdma_sendto.c: In function =8Csend_reply=B9:
>> net/sunrpc/xprtrdma/svc_rdma_sendto.c:416: warning: passing argument=
2 of
>> =8Cib_dma_map_single=B9 makes pointer from integer without a cast
>=20
> Erm, sorry, that should have been:
>=20
> unrpc/xprtrdma/svc_rdma_sendto.c: In function =8Csend_write=B9:
> net/sunrpc/xprtrdma/svc_rdma_sendto.c:180: warning: cast to pointer f=
rom
> integer of different size
> net/sunrpc/xprtrdma/svc_rdma_sendto.c: In function =8Csend_reply=B9:
> net/sunrpc/xprtrdma/svc_rdma_sendto.c:417: warning: cast to pointer f=
rom
> integer of different size
>
I have to jump on a plane, but I'll take a look tonight. Basically, the
ib_sge is sometimes being used to save pointers and other times dma_add=
r_t.
I need my own type there I guess.
Tom
=20
> --b.
>=20
>> And the types do seem weird:
>>=20
>>>=20
>>> Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
>>>=20
>>> ---
>>> net/sunrpc/xprtrdma/svc_rdma_sendto.c | 44
>>> +++++++++++++++---------------
>>> net/sunrpc/xprtrdma/svc_rdma_transport.c | 5 +++-
>>> 2 files changed, 26 insertions(+), 23 deletions(-)
>>>=20
>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>>> b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>>> index fb82b1b..85931c4 100644
>>> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>>> @@ -84,10 +84,7 @@ static struct ib_sge *xdr_to_sge(struct svcxprt_=
rdma
>>> *xprt,
>>> sge_no =3D 1;
>>> =20
>>> /* Head SGE */
>>> - sge[sge_no].addr =3D ib_dma_map_single(xprt->sc_cm_id->device,
>>> - xdr->head[0].iov_base,
>>> - xdr->head[0].iov_len,
>>> - DMA_TO_DEVICE);
>>> + sge[sge_no].addr =3D (unsigned long)xdr->head[0].iov_base;
>>=20
>> So before we called a function that took a void *, returned a u64; n=
ow
>> we're storing a void* directly into a u64.
>>=20
>>> sge_bytes =3D min_t(u32, byte_count, xdr->head[0].iov_len);
>>> byte_count -=3D sge_bytes;
>>> sge[sge_no].length =3D sge_bytes;
>>> @@ -99,11 +96,9 @@ static struct ib_sge *xdr_to_sge(struct svcxprt_=
rdma
>>> *xprt,
>>> page_bytes =3D xdr->page_len;
>>> page_off =3D xdr->page_base;
>>> while (byte_count && page_bytes) {
>>> + sge[sge_no].addr =3D (unsigned long)
>>> + page_address(xdr->pages[page_no]) + page_off;
>>> sge_bytes =3D min_t(u32, byte_count, (PAGE_SIZE-page_off));
>>> - sge[sge_no].addr =3D
>>> - ib_dma_map_page(xprt->sc_cm_id->device,
>>> - xdr->pages[page_no], page_off,
>>> - sge_bytes, DMA_TO_DEVICE);
>>> sge_bytes =3D min(sge_bytes, page_bytes);
>>> byte_count -=3D sge_bytes;
>>> page_bytes -=3D sge_bytes;
>>> @@ -117,11 +112,7 @@ static struct ib_sge *xdr_to_sge(struct svcxpr=
t_rdma
>>> *xprt,
>>> =20
>>> /* Tail SGE */
>>> if (byte_count && xdr->tail[0].iov_len) {
>>> - sge[sge_no].addr =3D
>>> - ib_dma_map_single(xprt->sc_cm_id->device,
>>> - xdr->tail[0].iov_base,
>>> - xdr->tail[0].iov_len,
>>> - DMA_TO_DEVICE);
>>> + sge[sge_no].addr =3D (unsigned long)xdr->tail[0].iov_base;
>>> sge_bytes =3D min_t(u32, byte_count, xdr->tail[0].iov_len);
>>> byte_count -=3D sge_bytes;
>>> sge[sge_no].length =3D sge_bytes;
>>> @@ -145,7 +136,6 @@ static int send_write(struct svcxprt_rdma *xprt=
, struct
>>> svc_rqst *rqstp,
>>> u32 xdr_off, int write_len,
>>> struct ib_sge *xdr_sge, int sge_count)
>>> {
>>> - struct svc_rdma_op_ctxt *tmp_sge_ctxt;
>>> struct ib_send_wr write_wr;
>>> struct ib_sge *sge;
>>> int xdr_sge_no;
>>> @@ -163,9 +153,8 @@ static int send_write(struct svcxprt_rdma *xprt=
, struct
>>> svc_rqst *rqstp,
>>> write_len, xdr_sge, sge_count);
>>> =20
>>> ctxt =3D svc_rdma_get_context(xprt);
>>> - ctxt->count =3D 0;
>>> - tmp_sge_ctxt =3D svc_rdma_get_context(xprt);
>>> - sge =3D tmp_sge_ctxt->sge;
>>> + ctxt->direction =3D DMA_TO_DEVICE;
>>> + sge =3D ctxt->sge;
>>> =20
>>> /* Find the SGE associated with xdr_off */
>>> for (bc =3D xdr_off, xdr_sge_no =3D 1; bc && xdr_sge_no < sge_count=
;
>>> @@ -181,14 +170,20 @@ static int send_write(struct svcxprt_rdma *xp=
rt,
>>> struct svc_rqst *rqstp,
>>> =20
>>> /* Copy the remaining SGE */
>>> while (bc !=3D 0 && xdr_sge_no < sge_count) {
>>> - sge[sge_no].addr =3D xdr_sge[xdr_sge_no].addr + sge_off;
>>> sge[sge_no].lkey =3D xdr_sge[xdr_sge_no].lkey;
>>> sge_bytes =3D min((size_t)bc,
>>> (size_t)(xdr_sge[xdr_sge_no].length-sge_off));
>>> sge[sge_no].length =3D sge_bytes;
>>> -
>>> + sge[sge_no].addr =3D
>>> + ib_dma_map_single(xprt->sc_cm_id->device,
>>> + (void *)
>>> + xdr_sge[xdr_sge_no].addr + sge_off,
>>> + sge_bytes, DMA_TO_DEVICE);
>>> + if (dma_mapping_error(sge[sge_no].addr))
>>> + return -EINVAL;
>>=20
>> And then here we're casting the u64 back to a void *. Also, we're
>> adding sge_off to the input, instead of to the result. Is it true t=
hat
>> that
>>=20
>> ib_dma_map_single(., x + sge_off, ., .)
>>=20
>> and
>>=20
>> ib_dma_map_single(., x, ., .) + sge_off
>>=20
>> always have the same result?
>>=20
>> --b.
>>=20
>>> sge_off =3D 0;
>>> sge_no++;
>>> + ctxt->count++;
>>> xdr_sge_no++;
>>> bc -=3D sge_bytes;
>>> }
>>> @@ -210,11 +205,10 @@ static int send_write(struct svcxprt_rdma *xp=
rt,
>>> struct svc_rqst *rqstp,
>>> /* Post It */
>>> atomic_inc(&rdma_stat_write);
>>> if (svc_rdma_send(xprt, &write_wr)) {
>>> - svc_rdma_put_context(ctxt, 1);
>>> + svc_rdma_put_context(ctxt, 0);
>>> /* Fatal error, close transport */
>>> ret =3D -EIO;
>>> }
>>> - svc_rdma_put_context(tmp_sge_ctxt, 0);
>>> return ret;
>>> }
>>> =20
>>> @@ -417,6 +411,11 @@ static int send_reply(struct svcxprt_rdma *rdm=
a,
>>> sge_bytes =3D min((size_t)ctxt->sge[sge_no].length,
>>> (size_t)byte_count);
>>> byte_count -=3D sge_bytes;
>>> + ctxt->sge[sge_no].addr =3D
>>> + ib_dma_map_single(rdma->sc_cm_id->device,
>>> + (void *)
>>> + ctxt->sge[sge_no].addr,
>>> + sge_bytes, DMA_TO_DEVICE);
>>> }
>>> BUG_ON(byte_count !=3D 0);
>>> =20
>>> @@ -428,8 +427,9 @@ static int send_reply(struct svcxprt_rdma *rdma=
,
>>> ctxt->pages[page_no+1] =3D rqstp->rq_respages[page_no];
>>> ctxt->count++;
>>> rqstp->rq_respages[page_no] =3D NULL;
>>> + if (page_no+1 >=3D sge_no)
>>> + ctxt->sge[page_no+1].length =3D 0;
>>> }
>>> -
>>> BUG_ON(sge_no > rdma->sc_max_sge);
>>> memset(&send_wr, 0, sizeof send_wr);
>>> ctxt->wr_op =3D IB_WR_SEND;
>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c
>>> b/net/sunrpc/xprtrdma/svc_rdma_transport.c
>>> index e132509..9e9e244 100644
>>> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
>>> @@ -361,10 +361,13 @@ static void sq_cq_reap(struct svcxprt_rdma *x=
prt)
>>> =20
>>> switch (ctxt->wr_op) {
>>> case IB_WR_SEND:
>>> - case IB_WR_RDMA_WRITE:
>>> svc_rdma_put_context(ctxt, 1);
>>> break;
>>> =20
>>> + case IB_WR_RDMA_WRITE:
>>> + svc_rdma_put_context(ctxt, 0);
>>> + break;
>>> +
>>> case IB_WR_RDMA_READ:
>>> if (test_bit(RDMACTXT_F_LAST_CTXT, &ctxt->flags)) {
>>> struct svc_rdma_op_ctxt *read_hdr =3D ctxt->read_hdr;
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs"=
in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 02/05] svcrdma: Refactor RDMA_WRITE dma mapping logic
2008-05-21 0:46 ` [PATCH 02/05] svcrdma: Refactor RDMA_WRITE dma mapping logic J. Bruce Fields
2008-05-21 2:52 ` J. Bruce Fields
@ 2008-05-25 19:05 ` J. Bruce Fields
1 sibling, 0 replies; 11+ messages in thread
From: J. Bruce Fields @ 2008-05-25 19:05 UTC (permalink / raw)
To: Tom Tucker; +Cc: linux-nfs
I'm also still curious about this:
On Tue, May 20, 2008 at 08:46:08PM -0400, bfields wrote:
> On Sun, May 18, 2008 at 07:13:18PM -0500, Tom Tucker wrote:
> > @@ -181,14 +170,20 @@ static int send_write(struct svcxprt_rdma *xprt, struct svc_rqst *rqstp,
> >
> > /* Copy the remaining SGE */
> > while (bc != 0 && xdr_sge_no < sge_count) {
> > - sge[sge_no].addr = xdr_sge[xdr_sge_no].addr + sge_off;
> > sge[sge_no].lkey = xdr_sge[xdr_sge_no].lkey;
> > sge_bytes = min((size_t)bc,
> > (size_t)(xdr_sge[xdr_sge_no].length-sge_off));
> > sge[sge_no].length = sge_bytes;
> > -
> > + sge[sge_no].addr =
> > + ib_dma_map_single(xprt->sc_cm_id->device,
> > + (void *)
> > + xdr_sge[xdr_sge_no].addr + sge_off,
> > + sge_bytes, DMA_TO_DEVICE);
> > + if (dma_mapping_error(sge[sge_no].addr))
> > + return -EINVAL;
>
> And then here we're casting the u64 back to a void *. Also, we're
> adding sge_off to the input, instead of to the result. Is it true that
> that
>
> ib_dma_map_single(., x + sge_off, ., .)
>
> and
>
> ib_dma_map_single(., x, ., .) + sge_off
>
> always have the same result?
>
> --b.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2008-05-25 19:05 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <12111560011694-git-send-email-tom@opengridcomputing.com>
[not found] ` <12111560022506-git-send-email-tom@opengridcomputing.com>
2008-05-19 18:20 ` [PATCH 01/05] svcrdma: Verify read-list fits within RPCSVC_MAXPAGES J. Bruce Fields
2008-05-20 1:07 ` Tom Tucker
[not found] ` <1211245672.31725.111.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
2008-05-20 13:27 ` Talpey, Thomas
[not found] ` <RTPCLUEXC1-PRDh133t00000127-rtwIt2gI0FxT+ZUat5FNkAK/GNPrWCqfQQ4Iyu8u01E@public.gmane.org>
2008-05-20 13:56 ` J. Bruce Fields
2008-05-20 14:14 ` Talpey, Thomas
[not found] ` <1211156002624-git-send-email-tom@opengridcomputing.com>
[not found] ` <12111560022695-git-send-email-tom@opengridcomputing.com>
[not found] ` <12111560022073-git-send-email-tom@opengridcomputing.com>
[not found] ` <12111560023250-git-send-email-tom@opengridcomputing.com>
2008-05-19 19:18 ` [PATCH 05/05] svcrdma: Add dma map count and WARN_ON J. Bruce Fields
2008-05-19 19:27 ` Tom Tucker
2008-05-21 0:46 ` [PATCH 02/05] svcrdma: Refactor RDMA_WRITE dma mapping logic J. Bruce Fields
2008-05-21 2:52 ` J. Bruce Fields
2008-05-21 10:33 ` Tom Tucker
2008-05-25 19:05 ` J. Bruce Fields
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.