From: Chuck Lever <chuck.lever@oracle.com>
To: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: Steve Wise <swise@opengridcomputing.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
linux-rdma@vger.kernel.org
Subject: Re: [PATCH 7/8] xprtrdma: Split the completion queue
Date: Wed, 16 Apr 2014 14:21:33 -0400 [thread overview]
Message-ID: <FC61F219-ACA2-4CE8-BF02-1B8EE2464639@oracle.com> (raw)
In-Reply-To: <534EA06A.7090200@dev.mellanox.co.il>
On Apr 16, 2014, at 11:23 AM, Sagi Grimberg <sagig@dev.mellanox.co.il> wrote:
> On 4/16/2014 6:08 PM, Chuck Lever wrote:
>> Hi Sagi-
>>
>> Thanks for the review! In-line replies below.
> <SNIP>
>>
>>>> 2. If I understood correctly, I see that the CQs are loop-polled in an interrupt context.
>>>> This may become problematic in stress workload where the CQ simply never drains (imagine
>>>> some studios task keeps posting WRs as soon as IOs complete). This will cause CQ poll loop
>>>> to go on forever. This situation may cause starvation of other CQs that share the same EQ (CPU
>>>> is hogged by the endless poller).
>>>> This may be solved in 2 ways:
>>>> - Use blk-iopoll to maintain fairness - the downside will be moving from interrupt context to soft-irq (slower).
>>>> - Set some configurable budget to CQ poller - after finishing the budget, notify the CQ and bail.
>>>> If there is another CQ waiting to get in - it's only fair that it will be given with a chance, else another interrupt
>>>> on that CQ will immediately come.
>> I think it would be a reasonable change to pass an array of WC’s to
>> ib_poll_cq() just once in rpcrdma_{send,recv}cq_poll(), instead of
>> looping. Would that be enough?
>>
>> To be clear, my patch merely cleans up the completion handler logic,
>> which already did loop-polling. Should we consider this improvement
>> for another patch?
>
> Well, I wasn't suggesting passing an array, carrying it around (or on the stack for that manner) might be annoying...
> I was suggesting a budget (poll loops or a time bound - jiffy is usually well behaved).
Passing a small array to ip_poll_cq() is actually easy to do, and is
exactly equivalent to a poll budget. The struct ib_wc should be taken
off the stack anyway, IMO.
The only other example I see in 3.15 right now is IPoIB, which seems
to do exactly this.
I’m testing a patch now. I’d like to start simple and make it more
complex only if we need to.
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
next prev parent reply other threads:[~2014-04-16 18:21 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-14 22:22 [PATCH 0/8] NFS/RDMA patches for review Chuck Lever
2014-04-14 22:22 ` [PATCH 1/8] xprtrdma: RPC/RDMA must invoke xprt_wake_pending_tasks() in process context Chuck Lever
2014-04-14 22:22 ` [PATCH 2/8] xprtrdma: Remove BOUNCEBUFFERS memory registration mode Chuck Lever
2014-04-14 22:22 ` [PATCH 3/8] xprtrdma: Disable ALLPHYSICAL mode by default Chuck Lever
2014-04-14 22:22 ` [PATCH 4/8] xprtrdma: Remove support for MEMWINDOWS registration mode Chuck Lever
2014-04-14 22:23 ` [PATCH 5/8] xprtrdma: Simplify rpcrdma_deregister_external() synopsis Chuck Lever
2014-04-14 22:23 ` [PATCH 6/8] xprtrdma: Make rpcrdma_ep_destroy() return void Chuck Lever
2014-04-14 22:23 ` [PATCH 7/8] xprtrdma: Split the completion queue Chuck Lever
2014-04-16 12:48 ` Sagi Grimberg
2014-04-16 13:30 ` Steve Wise
2014-04-16 14:12 ` Sagi Grimberg
2014-04-16 14:25 ` Steve Wise
2014-04-16 14:35 ` Sagi Grimberg
2014-04-16 14:43 ` Steve Wise
2014-04-16 15:18 ` Sagi Grimberg
2014-04-16 15:46 ` Steve Wise
2014-04-16 15:08 ` Chuck Lever
2014-04-16 15:23 ` Sagi Grimberg
2014-04-16 18:21 ` Chuck Lever [this message]
2014-04-17 7:06 ` Sagi Grimberg
2014-04-17 13:55 ` Chuck Lever
2014-04-17 14:34 ` Steve Wise
2014-04-17 19:11 ` Sagi Grimberg
2014-04-19 16:31 ` Chuck Lever
2014-04-20 12:42 ` Sagi Grimberg
2014-04-17 19:08 ` Sagi Grimberg
2014-04-14 22:23 ` [PATCH 8/8] xprtrdma: Reduce the number of hardway buffer allocations Chuck Lever
2014-04-15 20:15 ` [PATCH 0/8] NFS/RDMA patches for review Steve Wise
2014-04-15 20:20 ` Steve Wise
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=FC61F219-ACA2-4CE8-BF02-1B8EE2464639@oracle.com \
--to=chuck.lever@oracle.com \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=sagig@dev.mellanox.co.il \
--cc=swise@opengridcomputing.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).