From: Jens Axboe <jens.axboe@oracle.com>
To: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
Eric.Moore@lsi.com, jeff@garzik.org
Subject: Re: [PATCH 1/3] block: add blk-iopoll, a NAPI like approach for block devices
Date: Fri, 7 Aug 2009 08:37:45 +0200
Message-ID: <20090807063745.GQ12579@kernel.dk>
In-Reply-To: <20090806223257.0c33cf15@lxorguk.ukuu.org.uk>
On Thu, Aug 06 2009, Alan Cox wrote:
> > doing the command completion when the irq occurs, schedule a dedicated
> > softirq in the hopes that we will complete more IO when the iopoll
> > handler is invoked. Devices have a budget of commands assigned, and will
> > stay in polled mode as long as they continue to consume their budget
> > from the iopoll softirq handler. If they do not, the device is set back
> > to interrupt completion mode.
>
> This seems a little odd for pure ATA except for NCQ commands. Normal ATA
> is notoriously completion/reissue latency sensitive [to the point I
> suspect we should be dequeuing 2 commands from SCSI and loading the next
> in the completion handler as soon as we recover the result task file and
> see no error rather than going up and down the stack]

Yes, certainly, it's only for devices that do queuing. If they don't,
then we will always have just the one command to complete. So not much
to poll! As to pre-prep for extra latency sensitive devices, have you
tried experimenting with just pretending that non-NCQ devices in libata
have a queue depth of 2? That should ensure that the next command is
already prepped when the existing command completes.
Not sure how much time that would save; I would hope that our prep phase
isn't too slow to begin with (or that would be the place to fix :-)

> What do the numbers look like ?
On a slow box (with many cores), the benefits are quite huge:

blocksize   blk-iopoll   IOPS    IRQ/sec   Commands/IRQ
--------------------------------------------------------
512b        0            25168   ~19500    1.3
512b        1            30355   ~750      40
4096b       0            25612   ~21500    1.2
4096b       1            30231   ~1200     25

I suspect there's some cache interaction going on here too, but the
numbers do look very good. On a faster box (and different architecture),
on a test that does 50k IOPS, they perform identically but the iopoll
approach uses less CPU. The interrupt rate drops from 55k ints/sec to
39-40k ints/sec for that case.

These are all synthetic IO-only benchmarks; I hope to have some numbers
for some mixed benchmarks soon too.
> > This patch holds the core bits for blk-iopoll, device driver support
> > sold separately.
>
> You've been at Oracle too long ;) You'll be telling me its not a
> supported configuration next.
;-)
--
Jens Axboe