From: shli@kernel.org (Shaohua Li)
Subject: RFC: Allow block drivers to poll for I/O instead of sleeping
Date: Thu, 4 Jul 2013 09:13:01 +0800
Message-ID: <20130704011301.GA16906@kernel.org>
In-Reply-To: <20130620201713.GV8211@linux.intel.com>
On Thu, Jun 20, 2013 at 04:17:13PM -0400, Matthew Wilcox wrote:
>
> A paper at FAST2012
> (http://static.usenix.org/events/fast12/tech/full_papers/Yang.pdf) pointed
> out the performance overhead of taking interrupts for low-latency block
> I/Os. The solution the author investigated was to spin waiting for each
> I/O to complete. This is inefficient as Linux submits many I/Os which
> are not latency-sensitive, and even when we do submit latency-sensitive
> I/Os (eg swap-in), we frequently submit several I/Os before waiting.
>
> This RFC takes a different approach, only spinning when we would
> otherwise sleep. To implement this, I add an 'io_poll' function pointer
> to backing_dev_info. I include a sample implementation for the NVMe
> driver. Next, I add an io_wait() function which will call io_poll()
> if it is set. It falls back to calling io_schedule() if anything goes
> wrong with io_poll() or the task exceeds its timeslice. Finally, all
> that is left is to judiciously replace calls to io_schedule() with
> calls to io_wait(). I think I've covered the main contenders with
> sleep_on_page(), sleep_on_buffer() and the DIO path.
>
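(For anyone skimming the thread: the io_wait() shape described above is roughly
the following. This is only a sketch reconstructed from the description, not
the actual patch; the ->io_poll hook, its argument, and its return convention
are assumptions.)

/*
 * Sketch of io_wait() as described above: spin on the driver's poll
 * callback while we still have timeslice left, otherwise sleep as
 * before.  ->io_poll is assumed to return the number of completions
 * it reaped, or a negative error.
 */
static void io_wait(struct backing_dev_info *bdi)
{
	if (bdi && bdi->io_poll) {
		while (!need_resched()) {
			int ret = bdi->io_poll(bdi);

			if (ret > 0)
				return;		/* completions reaped, no sleep */
			if (ret < 0)
				break;		/* poll failed, sleep instead */
			cpu_relax();
		}
	}
	io_schedule();				/* fall back to the old behaviour */
}
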
> I've measured the performance benefits of this with a Chatham NVMe
> prototype device and a simple
> # dd if=/dev/nvme0n1 of=/dev/null iflag=direct bs=512 count=1000000
> The latency of each I/O reduces by about 2.5us (from around 8.0us to
> around 5.5us). This matches up quite well with the performance numbers
> shown in the FAST2012 paper (which used a similar device).
Hi Matthew,
I'm wondering where the 2.5us latency cut comes from. I did a simple test: on
my 3.4GHz Xeon, one CPU can do about 2M application context switches per
second. Switching to the idle task should be even faster, so going idle and
back ought to take less than 1us. Does the 2.5us cut mostly come from deep
idle state exit latency? If so, setting a lower pm_qos value, or a better
idle governor that keeps the CPU out of deep idle states, might help too.
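For reference, a context-switch rate in that ballpark can be measured with a
simple pipe ping-pong between two processes, something along these lines
(illustrative only, not necessarily the exact test above):

/* Pipe ping-pong: each round trip is two context switches. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/wait.h>

#define ROUNDS 1000000

int main(void)
{
	int p1[2], p2[2];
	char c = 0;
	struct timeval t0, t1;
	long i;

	if (pipe(p1) || pipe(p2))
		exit(1);

	if (fork() == 0) {		/* child: echo every byte back */
		for (i = 0; i < ROUNDS; i++) {
			if (read(p1[0], &c, 1) != 1 || write(p2[1], &c, 1) != 1)
				exit(1);
		}
		exit(0);
	}

	gettimeofday(&t0, NULL);
	for (i = 0; i < ROUNDS; i++) {	/* parent: ping, wait for pong */
		if (write(p1[1], &c, 1) != 1 || read(p2[0], &c, 1) != 1)
			exit(1);
	}
	gettimeofday(&t1, NULL);
	wait(NULL);

	double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("%.0f context switches/sec\n", 2.0 * ROUNDS / secs);
	return 0;
}

Pin both halves to one CPU (e.g. taskset -c 0) so that each round trip really
is two switches on the same core.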
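And holding a low pm_qos target for the duration of a run is just a matter of
keeping /dev/cpu_dma_latency open with the requested latency written to it,
roughly:

/*
 * Request a 0us CPU/DMA latency target.  The constraint holds while
 * the file descriptor stays open and is dropped automatically when
 * the process exits.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int32_t lat = 0;		/* target latency in microseconds */
	int fd = open("/dev/cpu_dma_latency", O_WRONLY);

	if (fd < 0 || write(fd, &lat, sizeof(lat)) != sizeof(lat)) {
		perror("cpu_dma_latency");
		return 1;
	}
	pause();			/* keep the fd open; ^C to release */
	return 0;
}
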
Thanks,
Shaohua
Thread overview: 21+ messages
2013-06-20 20:17 RFC: Allow block drivers to poll for I/O instead of sleeping Matthew Wilcox
2013-06-23 10:09 ` Ingo Molnar
2013-06-23 18:29 ` Linus Torvalds
2013-06-24 7:17 ` Jens Axboe
2013-06-25 0:11 ` Steven Rostedt
2013-06-25 3:07 ` Matthew Wilcox
2013-06-25 13:57 ` Steven Rostedt
2013-06-25 14:57 ` Jens Axboe
2013-06-24 8:07 ` Ingo Molnar
2013-06-25 3:18 ` Matthew Wilcox
2013-06-25 7:07 ` Bart Van Assche
2013-06-25 15:00 ` Jens Axboe
2013-06-27 18:10 ` Rik van Riel
2013-06-23 22:14 ` David Ahern
2013-06-24 8:21 ` Ingo Molnar
2013-06-24 7:15 ` Jens Axboe
2013-06-24 8:18 ` Ingo Molnar
2013-06-25 3:01 ` Matthew Wilcox
2013-06-25 14:55 ` Jens Axboe
2013-06-27 18:42 ` Rik van Riel
2013-07-04 1:13 ` Shaohua Li [this message]