public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Devesh Sharma
	<Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org>,
	Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	"linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Cc: "Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org"
	<Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH V3 01/17] xprtrdma: mind the device's max fast register page	list depth
Date: Fri, 16 May 2014 09:29:03 -0500	[thread overview]
Message-ID: <537620AF.3010307@opengridcomputing.com> (raw)
In-Reply-To: <53761D28.3070704-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>

Looks like ocrdma does this.  See ocrdma_query_device().  It advertises 
IB_DEVICE_MEM_MGT_EXTENSIONS but sets max_fast_reg_page_list_len to 0.  
The Verbs spec sez if you advertise the mem extensions, then you need to 
support all of them.   Is this just a bug in the driver?  Or does it 
really not support FRMRs?

Steve.


On 5/16/2014 9:14 AM, Steve Wise wrote:
> By the way, Devesh:  Is the device advertising FRMR support, yet 
> setting the max page list len to zero?  That's a driver bug...
>
>
>
> On 5/16/2014 9:10 AM, Steve Wise wrote:
>> I guess the client code doesn't verify that the device supports the 
>> chosen memreg mode.  That's not good.   Lemme fix this and respin 
>> this patch.
>>
>>
>> On 5/16/2014 2:08 AM, Devesh Sharma wrote:
>>> Chuck
>>>
>>> This patch is causing a CPU soft-lockup if underlying vendor reports 
>>> devattr.max_fast_reg_pagr_list_len = 0 and ia->ri_memreg_strategy = 
>>> FRMR (Default option).
>>> I think there is need to refer to device capability flags. If 
>>> strategy = FRMR is forced and devattr.max_fast_reg_pagr_list_len=0 
>>> then flash an error and fail RPC with -EIO.
>>>
>>> See inline:
>>>
>>>> -----Original Message-----
>>>> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
>>>> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Chuck Lever
>>>> Sent: Thursday, May 01, 2014 1:00 AM
>>>> To: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>>> Cc: Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org
>>>> Subject: [PATCH V3 01/17] xprtrdma: mind the device's max fast 
>>>> register
>>>> page list depth
>>>>
>>>> From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
>>>>
>>>> Some rdma devices don't support a fast register page list depth of 
>>>> at least
>>>> RPCRDMA_MAX_DATA_SEGS.  So xprtrdma needs to chunk its fast register
>>>> regions according to the minimum of the device max supported depth or
>>>> RPCRDMA_MAX_DATA_SEGS.
>>>>
>>>> Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
>>>> Reviewed-by: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>>>> ---
>>>>
>>>>   net/sunrpc/xprtrdma/rpc_rdma.c  |    4 ---
>>>>   net/sunrpc/xprtrdma/verbs.c     |   47 
>>>> +++++++++++++++++++++++++++++-
>>>> ---------
>>>>   net/sunrpc/xprtrdma/xprt_rdma.h |    1 +
>>>>   3 files changed, 36 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> b/net/sunrpc/xprtrdma/rpc_rdma.c index 96ead52..400aa1b 100644
>>>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> @@ -248,10 +248,6 @@ rpcrdma_create_chunks(struct rpc_rqst *rqst, 
>>>> struct
>>>> xdr_buf *target,
>>>>       /* success. all failures return above */
>>>>       req->rl_nchunks = nchunks;
>>>>
>>>> -    BUG_ON(nchunks == 0);
>>>> -    BUG_ON((r_xprt->rx_ia.ri_memreg_strategy == RPCRDMA_FRMR)
>>>> -           && (nchunks > 3));
>>>> -
>>>>       /*
>>>>        * finish off header. If write, marshal discrim and nchunks.
>>>>        */
>>>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
>>>> index 9372656..55fb09a 100644
>>>> --- a/net/sunrpc/xprtrdma/verbs.c
>>>> +++ b/net/sunrpc/xprtrdma/verbs.c
>>>> @@ -539,6 +539,11 @@ rpcrdma_ia_open(struct rpcrdma_xprt *xprt, struct
>>>> sockaddr *addr, int memreg)
>>>>                   __func__);
>>>>               memreg = RPCRDMA_REGISTER;
>>>>   #endif
>>>> +        } else {
>>>> +            /* Mind the ia limit on FRMR page list depth */
>>>> +            ia->ri_max_frmr_depth = min_t(unsigned int,
>>>> +                RPCRDMA_MAX_DATA_SEGS,
>>>> +                devattr.max_fast_reg_page_list_len);
>>>>           }
>>>>           break;
>>>>       }
>>>> @@ -659,24 +664,42 @@ rpcrdma_ep_create(struct rpcrdma_ep *ep, struct
>>>> rpcrdma_ia *ia,
>>>>       ep->rep_attr.srq = NULL;
>>>>       ep->rep_attr.cap.max_send_wr = cdata->max_requests;
>>>>       switch (ia->ri_memreg_strategy) {
>>>> -    case RPCRDMA_FRMR:
>>>> +    case RPCRDMA_FRMR: {
>>>> +        int depth = 7;
>>>> +
>>>>           /* Add room for frmr register and invalidate WRs.
>>>>            * 1. FRMR reg WR for head
>>>>            * 2. FRMR invalidate WR for head
>>>> -         * 3. FRMR reg WR for pagelist
>>>> -         * 4. FRMR invalidate WR for pagelist
>>>> +         * 3. N FRMR reg WRs for pagelist
>>>> +         * 4. N FRMR invalidate WRs for pagelist
>>>>            * 5. FRMR reg WR for tail
>>>>            * 6. FRMR invalidate WR for tail
>>>>            * 7. The RDMA_SEND WR
>>>>            */
>>>> -        ep->rep_attr.cap.max_send_wr *= 7;
>>>> +
>>>> +        /* Calculate N if the device max FRMR depth is smaller than
>>>> +         * RPCRDMA_MAX_DATA_SEGS.
>>>> +         */
>>>> +        if (ia->ri_max_frmr_depth < RPCRDMA_MAX_DATA_SEGS) {
>>>> +            int delta = RPCRDMA_MAX_DATA_SEGS -
>>>> +                    ia->ri_max_frmr_depth;
>>>> +
>>>> +            do {
>>>> +                depth += 2; /* FRMR reg + invalidate */
>>>> +                delta -= ia->ri_max_frmr_depth;
>>> If ia->ri_max_frmr_depth is = 0. This loop becomes infinite loop.
>>>
>>>> +            } while (delta > 0);
>>>> +
>>>> +        }
>>>> +        ep->rep_attr.cap.max_send_wr *= depth;
>>>>           if (ep->rep_attr.cap.max_send_wr > devattr.max_qp_wr) {
>>>> -            cdata->max_requests = devattr.max_qp_wr / 7;
>>>> +            cdata->max_requests = devattr.max_qp_wr / depth;
>>>>               if (!cdata->max_requests)
>>>>                   return -EINVAL;
>>>> -            ep->rep_attr.cap.max_send_wr = cdata-
>>>>> max_requests * 7;
>>>> +            ep->rep_attr.cap.max_send_wr = cdata-
>>>>> max_requests *
>>>> +                               depth;
>>>>           }
>>>>           break;
>>>> +    }
>>>>       case RPCRDMA_MEMWINDOWS_ASYNC:
>>>>       case RPCRDMA_MEMWINDOWS:
>>>>           /* Add room for mw_binds+unbinds - overkill! */ @@ -
>>>> 1043,16 +1066,16 @@ rpcrdma_buffer_create(struct rpcrdma_buffer *buf,
>>>> struct rpcrdma_ep *ep,
>>>>       case RPCRDMA_FRMR:
>>>>           for (i = buf->rb_max_requests * RPCRDMA_MAX_SEGS; i; i--
>>>> ) {
>>>>               r->r.frmr.fr_mr = ib_alloc_fast_reg_mr(ia->ri_pd,
>>>> -
>>>> RPCRDMA_MAX_SEGS);
>>>> +                        ia->ri_max_frmr_depth);
>>>>               if (IS_ERR(r->r.frmr.fr_mr)) {
>>>>                   rc = PTR_ERR(r->r.frmr.fr_mr);
>>>>                   dprintk("RPC:       %s: ib_alloc_fast_reg_mr"
>>>>                       " failed %i\n", __func__, rc);
>>>>                   goto out;
>>>>               }
>>>> -            r->r.frmr.fr_pgl =
>>>> -                ib_alloc_fast_reg_page_list(ia->ri_id-
>>>>> device,
>>>> -
>>>> RPCRDMA_MAX_SEGS);
>>>> +            r->r.frmr.fr_pgl = ib_alloc_fast_reg_page_list(
>>>> +                        ia->ri_id->device,
>>>> +                        ia->ri_max_frmr_depth);
>>>>               if (IS_ERR(r->r.frmr.fr_pgl)) {
>>>>                   rc = PTR_ERR(r->r.frmr.fr_pgl);
>>>>                   dprintk("RPC:       %s: "
>>>> @@ -1498,8 +1521,8 @@ rpcrdma_register_frmr_external(struct
>>>> rpcrdma_mr_seg *seg,
>>>>       seg1->mr_offset -= pageoff;    /* start of page */
>>>>       seg1->mr_len += pageoff;
>>>>       len = -pageoff;
>>>> -    if (*nsegs > RPCRDMA_MAX_DATA_SEGS)
>>>> -        *nsegs = RPCRDMA_MAX_DATA_SEGS;
>>>> +    if (*nsegs > ia->ri_max_frmr_depth)
>>>> +        *nsegs = ia->ri_max_frmr_depth;
>>>>       for (page_no = i = 0; i < *nsegs;) {
>>>>           rpcrdma_map_one(ia, seg, writing);
>>>>           pa = seg->mr_dma;
>>>> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h
>>>> b/net/sunrpc/xprtrdma/xprt_rdma.h index cc1445d..98340a3 100644
>>>> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
>>>> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
>>>> @@ -66,6 +66,7 @@ struct rpcrdma_ia {
>>>>       struct completion    ri_done;
>>>>       int            ri_async_rc;
>>>>       enum rpcrdma_memreg    ri_memreg_strategy;
>>>> +    unsigned int        ri_max_frmr_depth;
>>>>   };
>>>>
>>>>   /*
>>>>
>>>> -- 
>>>> To unsubscribe from this list: send the line "unsubscribe 
>>>> linux-rdma" in the
>>>> body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at
>>>> http://vger.kernel.org/majordomo-info.html
>>> N�����r��y���b�X��ǧv�^�)޺{.n�+����{���"��^n�r���z�\x1a 
>>> ��h����&��\x1e �G���h�\x03(�階�ݢj"��\x1a�^[m�����z�ޖ���f���h���~�mml==
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2014-05-16 14:29 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-30 19:29 [PATCH V3 00/17] NFS/RDMA client-side patches Chuck Lever
     [not found] ` <20140430191433.5663.16217.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2014-04-30 19:29   ` [PATCH V3 01/17] xprtrdma: mind the device's max fast register page list depth Chuck Lever
     [not found]     ` <20140430192936.5663.66537.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2014-05-16  7:08       ` Devesh Sharma
     [not found]         ` <EE7902D3F51F404C82415C4803930ACD3FDFBDA9-DWYeeINJQrxExQ8dmkPuX0M9+F4ksjoh@public.gmane.org>
2014-05-16 14:10           ` Steve Wise
     [not found]             ` <53761C63.4050908-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2014-05-16 14:14               ` Steve Wise
     [not found]                 ` <53761D28.3070704-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2014-05-16 14:29                   ` Steve Wise [this message]
     [not found]                     ` <537620AF.3010307-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2014-05-17  8:23                       ` Devesh Sharma
2014-04-30 19:29   ` [PATCH V3 02/17] nfs-rdma: Fix for FMR leaks Chuck Lever
2014-04-30 19:29   ` [PATCH V3 03/17] xprtrdma: RPC/RDMA must invoke xprt_wake_pending_tasks() in process context Chuck Lever
2014-04-30 19:30   ` [PATCH V3 04/17] xprtrdma: Remove BOUNCEBUFFERS memory registration mode Chuck Lever
2014-04-30 19:30   ` [PATCH V3 05/17] xprtrdma: Remove MEMWINDOWS registration modes Chuck Lever
2014-04-30 19:30   ` [PATCH V3 06/17] xprtrdma: Remove REGISTER memory registration mode Chuck Lever
2014-04-30 19:30   ` [PATCH V3 07/17] xprtrdma: Fall back to MTHCAFMR when FRMR is not supported Chuck Lever
2014-04-30 19:30   ` [PATCH V3 08/17] xprtrdma: mount reports "Invalid mount option" if memreg mode " Chuck Lever
2014-04-30 19:30   ` [PATCH V3 09/17] xprtrdma: Simplify rpcrdma_deregister_external() synopsis Chuck Lever
2014-04-30 19:30   ` [PATCH V3 10/17] xprtrdma: Make rpcrdma_ep_destroy() return void Chuck Lever
2014-04-30 19:31   ` [PATCH V3 11/17] xprtrdma: Split the completion queue Chuck Lever
2014-04-30 19:31   ` [PATCH V3 12/17] xprtrmda: Reduce lock contention in completion handlers Chuck Lever
2014-04-30 19:31   ` [PATCH V3 13/17] xprtrmda: Reduce calls to ib_poll_cq() " Chuck Lever
2014-04-30 19:31   ` [PATCH V3 14/17] xprtrdma: Limit work done by completion handler Chuck Lever
2014-04-30 19:31   ` [PATCH V3 15/17] xprtrdma: Reduce the number of hardway buffer allocations Chuck Lever
2014-04-30 19:31   ` [PATCH V3 16/17] xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting Chuck Lever
2014-04-30 19:31   ` [PATCH V3 17/17] xprtrdma: Remove Tavor MTU setting Chuck Lever
     [not found]     ` <20140430193155.5663.86148.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2014-05-01  7:36       ` Hal Rosenstock
2014-05-02 19:27   ` [PATCH V3 00/17] NFS/RDMA client-side patches Doug Ledford
2014-05-02 19:27   ` Doug Ledford
     [not found] ` <5363f223.e39f420a.4af6.6fc9SMTPIN_ADDED_BROKEN@mx.google.com>
     [not found]   ` <5363f223.e39f420a.4af6.6fc9SMTPIN_ADDED_BROKEN-ATjtLOhZ0NVl57MIdRCFDg@public.gmane.org>
2014-05-02 20:20     ` Chuck Lever
     [not found]       ` <45067B04-660C-4971-B12F-AEC9F7D32785-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-05-02 22:34         ` Doug Ledford
2014-05-02 22:34         ` Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=537620AF.3010307@opengridcomputing.com \
    --to=swise-7bpotxp6k4+p2yhjcf5u+vpxobypeauw@public.gmane.org \
    --cc=Anna.Schumaker-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org \
    --cc=Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org \
    --cc=chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox