linux-kernel.vger.kernel.org archive mirror
From: Jens Axboe <axboe@kernel.dk>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: linux kernel mailing list <linux-kernel@vger.kernel.org>,
	Moyer Jeff Moyer <jmoyer@redhat.com>
Subject: Re: [RFC PATCH] block: Change default IO scheduler to deadline except SATA
Date: Tue, 10 Apr 2012 20:41:08 +0200
Message-ID: <4F847EC4.7040604@kernel.dk>
In-Reply-To: <20120410151042.GH21801@redhat.com>

On 2012-04-10 17:10, Vivek Goyal wrote:
> On Tue, Apr 10, 2012 at 10:21:48AM -0400, Vivek Goyal wrote:
>> On Tue, Apr 10, 2012 at 03:56:39PM +0200, Jens Axboe wrote:
>>> On 2012-04-10 15:37, Vivek Goyal wrote:
>>>> Hi,
>>>>
>>>> I am wondering if CFQ as default scheduler is still the right choice. CFQ
>>>> generally works well on slow rotational media (SATA?). But often
>>>> underperforms on faster storage (storage arrays, PCIE SSDs, virtualized
>>>> disk in linux guests etc). People often put logic in user space to tune their
>>>> systems and change IO scheduler to deadline to get better performance on
>>>> faster storage.
>>>>
>>>> Though there is not one good answer for all kinds of storage and for all
>>>> kinds of workloads, I am wondering if we can provide a better default:
>>>> change the default IO scheduler to "deadline" except on SATA.
>>>>
>>>> One can argue that some SAS disks can be slow too and benefit from CFQ. Yes,
>>>> but default IO scheduler choice is not perfect anyway. It just tries to
>>>> cater to a wide variety of use cases out of the box.
>>>>
>>>> So I am throwing this patch out to see if it flies. Personally, I think it
>>>> might turn out to be a more reasonable default.
>>>
>>> I think it'd be a lot more sane to just use CFQ on rotational single
>>> devices, and default to deadline on raid or non-rotational devices. This
>>> still isn't perfect, since less worthy SSDs still benefit from the
>>> read/write separation, and some multi device configs will be faster as
>>> well. But it's better.
>>
>> Hi Jens,
>>
>> Thanks. Taking a decision based on the rotational flag makes sense. I am
>> not sure how one gets the information that a block device is a single
>> device or not, especially with HBAs, SCSI LUNs over Fiber, iSCSI LUNs etc.
>> I have a few SCSI LUNs exported to me backed by a storage array. Everything
>> runs CFQ by default. And though the disks in the array are rotational, they
>> are RAIDed and AFAIK, this information is not available to the driver.
>>
>> I am not sure if there is an easy way to get similar info for dm/md devices.
> 
> Thinking more about it, even if we have a way to define a request queue
> flag for multi devices (QUEUE_FLAG_MULTI_DEVICE), when can the block layer
> take a decision to change the IO scheduler? At queue alloc and init time
> the driver might not have even called add_disk() or set all the
> flags/properties of the queue. So doing it at queue alloc/init time might
> not be best.
> 
> And later we get control only when actual IO happens on the queue and
> doing one more check or trying to change elevator in IO path is not a
> good idea.
> 
> Maybe when the driver tries to set the ROTATIONAL or MULTI_DEVICE flag, we
> can check and change the elevator then.
> 
> So we are back to the question of whether SCSI devices can find out if a LUN
> is backed by a single disk or multiple disks.

The cleanest would be to have the driver signal these attributes at
probe time. You could even adjust CFQ properties based on this, driving
the queue depth harder etc. Realistically, going forward, most fast
flash devices will be driven by a noop-like scheduler on multiqueue. So
CPU cost of the IO scheduler can mostly be ignored, since CFQ cost on
even big RAIDs isn't an issue due to the low IOPS rates.
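
For illustration only, the policy discussed above (CFQ on single rotational
devices, deadline on RAID or non-rotational devices) could be sketched from
user space roughly as follows. This is not the proposed kernel patch. The
function names pick_scheduler() and probe() are hypothetical, and treating
"more than one entry under slaves/" as "multi device" is an assumption that
only covers dm/md devices. An array-backed SCSI LUN would still look like a
single rotational disk, which is exactly the gap raised earlier in the thread.

```python
from pathlib import Path
from typing import Tuple


def pick_scheduler(rotational: bool, multi_device: bool) -> str:
    """Policy from the thread: CFQ only for a single rotational device."""
    if rotational and not multi_device:
        return "cfq"
    return "deadline"


def probe(dev: str, sysfs_root: str = "/sys/block") -> Tuple[bool, bool]:
    """Return (rotational, multi_device) for a block device such as 'sda'."""
    base = Path(sysfs_root) / dev
    # queue/rotational holds "1" for rotational media and "0" otherwise.
    rotational = (base / "queue" / "rotational").read_text().strip() == "1"
    # Assumption: several entries under slaves/ mean a dm/md device built
    # from multiple members. A LUN exported by a storage array has no such
    # entries, so this heuristic cannot see the RAID behind it.
    slaves = base / "slaves"
    multi_device = slaves.is_dir() and len(list(slaves.iterdir())) > 1
    return rotational, multi_device
```

Calling pick_scheduler(*probe("sda")) returns the scheduler name this policy
would choose. Applying it would mean writing that name to
/sys/block/sda/queue/scheduler as root, which the sketch deliberately does
not do.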

-- 
Jens Axboe

