From: Johannes Thumshirn <jthumshirn@suse.de>
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Hannes Reinecke <hare@suse.de>, Jens Axboe <axboe@kernel.dk>,
Christoph Hellwig <hch@infradead.org>,
Linux-scsi@vger.kernel.org, linux-nvme@lists.infradead.org,
linux-block@vger.kernel.org, Keith Busch <keith.busch@intel.com>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>
Subject: Re: [LSF/MM TOPIC][LSF/MM ATTEND] NAPI polling for block drivers
Date: Thu, 19 Jan 2017 10:13:08 +0100
Message-ID: <20170119091308.GL5054@linux-x5ow.site>
In-Reply-To: <97a3f3d3-3619-871d-55f3-75449b0c34cf@grimberg.me>
On Thu, Jan 19, 2017 at 10:12:17AM +0200, Sagi Grimberg wrote:
>
> >>>I think you missed:
> >>>http://git.infradead.org/nvme.git/commit/49c91e3e09dc3c9dd1718df85112a8cce3ab7007
> >>
> >>I indeed did, thanks.
> >>
> >But it doesn't help.
> >
> >We're still having to wait for the first interrupt, and if we're really
> >fast that's the only completion we have to process.
> >
> >Try this:
> >
> >
> >diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> >index b4b32e6..e2dd9e2 100644
> >--- a/drivers/nvme/host/pci.c
> >+++ b/drivers/nvme/host/pci.c
> >@@ -623,6 +623,8 @@ static int nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
> > }
> > __nvme_submit_cmd(nvmeq, &cmnd);
> > spin_unlock(&nvmeq->sq_lock);
> >+ disable_irq_nosync(nvmeq_irq(irq));
> >+ irq_poll_sched(&nvmeq->iop);
>
> a. This would trigger a condition that we disable irq twice which
> is wrong at least because it will generate a warning.
>
> b. This would trigger ksoftirqd far too often. For it to be
> effective, it needs to run only when it should, and optimally
> when the completion queue has a batch of completions waiting.
>
> After a deeper analysis, I agree with Bart that interrupt coalescing is
> needed for it to work. The problem with nvme coalescing as Jens said, is
> a death penalty of 100us granularity. Hannes, Johannes, how does it look
> like with the devices you are testing with?
I haven't had a look at AHCI's Command Completion Coalescing yet, but hopefully
I'll find the time today (+SSD testing!!!).
I don't know if Hannes did (but I _think_ not). The problem is we've already
maxed out our test HW w/o irq_poll, so the only change we're currently seeing
is an increase in wasted CPU cycles. Not what we wanted to have.
>
> Also, I think that adaptive moderation is needed in order for it to
> work well. I know that some networking drivers implemented adaptive
> moderation in SW before having HW support for it. It can be done by
> maintaining stats and having a periodic work that looks at it and
> changes the moderation parameters.
>
> Does anyone think that this is something we should consider?
Yes, we've been discussing this internally as well and it sounds good, but that's
still all pure theory; nothing has actually been implemented or tested.
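To make the idea concrete, here is a rough userspace sketch of what such
software adaptive moderation could look like: the completion path counts
completions, and a periodic worker looks at the count and nudges the
coalescing timeout up or down. Everything in it is an assumption for
illustration only: struct cq_moderation, mod_account(), mod_adapt() and all
thresholds are hypothetical names and values, not anything that exists in
the nvme driver, and nothing here has been tested against real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tuning values, chosen arbitrarily for this sketch. */
#define MOD_MAX_US     100u   /* upper bound for the coalescing timeout */
#define MOD_STEP_US     10u   /* adjustment applied per sampling window */
#define MOD_RATE_HIGH 1000u   /* completions/window considered "busy"   */
#define MOD_RATE_LOW   100u   /* completions/window considered "idle"   */

/* Hypothetical per-queue moderation state. */
struct cq_moderation {
	uint64_t completions;      /* completions seen in the current window */
	unsigned int coalesce_us;  /* current coalescing timeout             */
};

/* Called from the completion path: record how many entries were reaped. */
void mod_account(struct cq_moderation *m, unsigned int ncompleted)
{
	assert(m);
	m->completions += ncompleted;
}

/*
 * Called from periodic work: a busy queue gets more coalescing (fewer
 * interrupts), an idle queue gets less (lower latency). Resets the
 * statistics for the next window and returns the new timeout.
 */
unsigned int mod_adapt(struct cq_moderation *m)
{
	assert(m);
	if (m->completions >= MOD_RATE_HIGH) {
		m->coalesce_us += MOD_STEP_US;
		if (m->coalesce_us > MOD_MAX_US)
			m->coalesce_us = MOD_MAX_US;
	} else if (m->completions <= MOD_RATE_LOW) {
		if (m->coalesce_us > MOD_STEP_US)
			m->coalesce_us -= MOD_STEP_US;
		else
			m->coalesce_us = 0;
	}
	m->completions = 0;
	return m->coalesce_us;
}
```

In a driver, the value returned by mod_adapt() would then have to be applied
to whatever moderation knob the device or the polling code offers; that part
is deliberately left out here.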
Byte,
Johannes
--
Johannes Thumshirn Storage
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850