Re: [PATCH 7/8] xprtrdma: Split the completion queue

public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed

From: Sagi Grimberg <sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
To: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Steve Wise
	<swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>,
	Linux NFS Mailing List
	<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH 7/8] xprtrdma: Split the completion queue
Date: Sun, 20 Apr 2014 15:42:55 +0300	[thread overview]
Message-ID: <5353C0CF.9080806@dev.mellanox.co.il> (raw)
In-Reply-To: <593D9BFA-714E-417F-ACA0-05594290C4D1-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

On 4/19/2014 7:31 PM, Chuck Lever wrote:
> Hi Sagi-
>
> On Apr 17, 2014, at 3:11 PM, Sagi Grimberg <sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>
>> On 4/17/2014 5:34 PM, Steve Wise wrote:
>>
>> <SNIP>
>>> You could use a small array combined with a loop and a budget count.  So the code would
>>> grab, say, 4 at a time, and keep looping polling up to 4 until the CQ is empty or the
>>> desired budget is reached...
>> Bingo... couldn't agree more.
>>
>> Poll Arrays are a nice optimization,
> Typically, a provider's poll_cq implementation takes the CQ lock
> using spin_lock_irqsave().  My goal of using a poll array is to
> reduce the number of times the completion handler invokes
> spin_lock_irqsave / spin_unlock_irqsave pairs when draining a
> large queue.

Yes, hence the optimization.

>
>> but large arrays will just burden the stack (and might even make things worse in high workloads...)
>
> My prototype moves the poll array off the stack and into allocated
> storage.  Making that array as large as a single page would be
> sufficient for 50 or more ib_wc structures on a platform with 4KB
> pages and 64-bit addresses.

You assume here the worst-case workload. In the sparse case you are 
carrying around a redundant storage space...
I would recommend using say 16 wc array or so.

> The xprtrdma completion handler polls twice:
>
>    1.  Drain the CQ completely
>
>    2.  Re-arm
>
>    3.  Drain the CQ completely again
>
> So between steps 1. and 3. a single notification could handle over
> 100 WCs, if we were to budget by draining just a single array's worth
> during each step. (Btw, I'm not opposed to looping while polling
> arrays. This is just an example for discussion).
>
> As for budgeting itself, I wonder if there is a possibility of losing
> notifications.  The purpose of re-arming and then draining again is to
> ensure that any items queued after step 1. and before step 2. are
> captured, as by themselves they would never generate an upcall
> notification, IIUC.

I don't think there is a possibility for implicit loss of completions. 
HCAs that may miss completions
should respond to ib_req_notify flag IB_CQ_REPORT_MISSED_EVENTS.
/**
  * ib_req_notify_cq - Request completion notification on a CQ.
  * @cq: The CQ to generate an event for.
  * @flags:
  *   Must contain exactly one of %IB_CQ_SOLICITED or %IB_CQ_NEXT_COMP
  *   to request an event on the next solicited event or next work
  *   completion at any type, respectively. %IB_CQ_REPORT_MISSED_EVENTS
  *   may also be |ed in to request a hint about missed events, as
  *   described below.
  *
  * Return Value:
  *    < 0 means an error occurred while requesting notification
  *   == 0 means notification was requested successfully, and if
  *        IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events
  *        were missed and it is safe to wait for another event.  In
  *        this case is it guaranteed that any work completions added
  *        to the CQ since the last CQ poll will trigger a completion
  *        notification event.
  *    > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed
  *        in.  It means that the consumer must poll the CQ again to
  *        make sure it is empty to avoid missing an event because of a
  *        race between requesting notification and an entry being
  *        added to the CQ.  This return value means it is possible
  *        (but not guaranteed) that a work completion has been added
  *        to the CQ since the last poll without triggering a
  *        completion notification event.
  */

Other than that, if one stops polling and requests notify - he should be 
invoked again from the
correct producer index (i.e. no missed events).

Hope this helps,

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

next prev parent reply	other threads:[~2014-04-20 12:42 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-14 22:22 [PATCH 0/8] NFS/RDMA patches for review Chuck Lever
     [not found] ` <20140414220041.20646.63991.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2014-04-14 22:22   ` [PATCH 1/8] xprtrdma: RPC/RDMA must invoke xprt_wake_pending_tasks() in process context Chuck Lever
2014-04-14 22:22   ` [PATCH 2/8] xprtrdma: Remove BOUNCEBUFFERS memory registration mode Chuck Lever
2014-04-14 22:22   ` [PATCH 3/8] xprtrdma: Disable ALLPHYSICAL mode by default Chuck Lever
2014-04-14 22:22   ` [PATCH 4/8] xprtrdma: Remove support for MEMWINDOWS registration mode Chuck Lever
2014-04-14 22:23   ` [PATCH 5/8] xprtrdma: Simplify rpcrdma_deregister_external() synopsis Chuck Lever
2014-04-14 22:23   ` [PATCH 6/8] xprtrdma: Make rpcrdma_ep_destroy() return void Chuck Lever
2014-04-14 22:23   ` [PATCH 7/8] xprtrdma: Split the completion queue Chuck Lever
     [not found]     ` <20140414222323.20646.66946.stgit-FYjufvaPoItvLzlybtyyYzGyq/o6K9yX@public.gmane.org>
2014-04-16 12:48       ` Sagi Grimberg
     [not found]         ` <534E7C1C.5070407-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-16 13:30           ` Steve Wise
     [not found]             ` <534E8608.8030801-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2014-04-16 14:12               ` Sagi Grimberg
     [not found]                 ` <534E8FCE.909-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-16 14:25                   ` Steve Wise
2014-04-16 14:35                     ` Sagi Grimberg
     [not found]                       ` <534E9534.9020004-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-16 14:43                         ` Steve Wise
2014-04-16 15:18                           ` Sagi Grimberg
     [not found]                             ` <534E9F40.8000905-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-16 15:46                               ` Steve Wise
2014-04-16 15:08               ` Chuck Lever
     [not found]                 ` <E9B601B5-1984-40D8-914A-DDD1380E8183-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-16 15:23                   ` Sagi Grimberg
     [not found]                     ` <534EA06A.7090200-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-16 18:21                       ` Chuck Lever
     [not found]                         ` <FC61F219-ACA2-4CE8-BF02-1B8EE2464639-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-17  7:06                           ` Sagi Grimberg
     [not found]                             ` <534F7D5F.1090908-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-17 13:55                               ` Chuck Lever
     [not found]                                 ` <A7E4B101-BAF0-480C-95AE-26BB845EE2C3-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-17 14:34                                   ` Steve Wise
2014-04-17 19:11                                     ` Sagi Grimberg
     [not found]                                       ` <5350277C.20608-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-19 16:31                                         ` Chuck Lever
     [not found]                                           ` <593D9BFA-714E-417F-ACA0-05594290C4D1-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-20 12:42                                             ` Sagi Grimberg [this message]
2014-04-17 19:08                                   ` Sagi Grimberg
2014-04-14 22:23   ` [PATCH 8/8] xprtrdma: Reduce the number of hardway buffer allocations Chuck Lever
2014-04-15 20:15   ` [PATCH 0/8] NFS/RDMA patches for review Steve Wise
     [not found]     ` <534D9373.3050406-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2014-04-15 20:20       ` Steve Wise

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5353C0CF.9080806@dev.mellanox.co.il \
    --to=sagig-ldsdmyg8hgv8yrgs2mwiifqbs+8scbdb@public.gmane.org \
    --cc=chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox