public inbox for linux-kernel@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Naveen Gupta <ngupta@google.com>
Cc: linux-kernel@vger.kernel.org, jens.axboe@oracle.com,
	akpm@linux-foundation.org, s-uchida@ap.jp.nec.com
Subject: Re: [PATCH] Priorities in Anticipatory I/O scheduler
Date: Thu, 30 Oct 2008 08:33:24 +1100
Message-ID: <20081029213324.GG17077@disturbed>
In-Reply-To: <2846be6b0810290149j1330b084sf98cf8913d5640e0@mail.gmail.com>

On Wed, Oct 29, 2008 at 01:49:49AM -0700, Naveen Gupta wrote:
> 2008/10/28 Dave Chinner <david@fromorbit.com>:
> >> Now the initial feedback was since this *implementation* is different
> >> from anything we have in CFQ which is our current *standard* way of
> >> thinking and comparing (that is the only thing that exists) why not
> >> make them into a new class :).
> >
> > Because it makes it impossible to optimise application code as the
> > class that needs to be used is entirely dependent on the
> > configuration of the machine that it is running on. Application
> > writers are not going to probe the I/O scheduler the block device
> > is using to determine if they should be using RT or LATENCY class
> > prioritisation. From a user POV they do *exactly the same thing*,
> > so they should use the same behavioural classes defined by the API.
> 
> I agree with you that we need an API which is valid across schedulers.
> But one has to agree that this sort of thing has its own limitations.
> We are assuming that every scheduler which implements any kind of
> priority has a valid implementation of RT, BE, Idle class, which in
> this case we don't have. What happens tomorrow once we have a scheduler
> which decides that it needs to divide bandwidth? Which class would one
> map it to?

Throttling does not belong in the elevator. It can be successfully
done generically above the elevator in DM. See the dm-ioband
patches, for example.  The elevator is for optimising scheduling of
issued I/O, not controlling every aspect of the I/O path.

> As I understand what you are asking for is: filesystem i/o can use BE
> 0 across all schedulers for journal updates. And you still have RT
> levels to take care of any higher priority i/o which need not wait for
> journal updates.

No, I wanted to use the very highest priority available for the
journal updates. The folk using the real-time priority class didn't
like that, and suggested that the highest BE priority would be
better so journal I/O didn't preempt their RT data I/O. So what I'm
saying is based on feedback from ppl actually using the RT class for
their RT applications...

This is what I've been trying to tell you, and I have so far been
unsuccessful at getting through to you - there are ppl using
this API because it's exposed to userspace, so we can't just change it
whenever someone feels like it.
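
For readers who have not used the API in question, here is a minimal
sketch of how a userspace I/O priority value is packed from a class and
a per-class level. The constants follow the kernel's ioprio encoding
(class in the bits above a 13-bit shift); the helper names are
illustrative only and are not part of any library.

```python
# Constants mirror the kernel's ioprio encoding (include/linux/ioprio.h).
IOPRIO_CLASS_SHIFT = 13
IOPRIO_CLASS_NONE = 0
IOPRIO_CLASS_RT = 1
IOPRIO_CLASS_BE = 2
IOPRIO_CLASS_IDLE = 3


def ioprio_value(io_class, level):
    """Pack a class and a per-class level (0 = highest, 7 = lowest)."""
    if not 0 <= level <= 7:
        raise ValueError("level must be in 0..7")
    return (io_class << IOPRIO_CLASS_SHIFT) | level


def ioprio_class(value):
    """Extract the class from a packed priority value."""
    return value >> IOPRIO_CLASS_SHIFT


def ioprio_level(value):
    """Extract the per-class level from a packed priority value."""
    return value & ((1 << IOPRIO_CLASS_SHIFT) - 1)


# The two cases discussed in this thread: journal I/O at the highest
# best-effort level, and application data I/O in the real-time class.
journal = ioprio_value(IOPRIO_CLASS_BE, 0)
rt_data = ioprio_value(IOPRIO_CLASS_RT, 4)
```

An application passes such a value to the ioprio_set(2) system call, so
the class numbers and the level range are part of what existing users
depend on.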

> Here is what we can do:
> 1. Add 17 levels. top 8 RT, next 8 BE and last 1 idle. Though we know
> they all are similar in implementation. It's just that RT > BE > idle
> in importance.

Yes, just like CPU scheduling. We had an RT class there long before
we could really do RT scheduling. Also, nobody suggested introducing
a new "latency class" to the CPU scheduler to fix problems with the
RT scheduling - they fixed the scheduler instead and the API did not
change. We should be following the exact same model for I/O
scheduling priorities.
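
As a rough illustration of the 17-level layout quoted above, the
existing (class, level) pairs can be ordered on a single scale without
changing the API. This is only a sketch: the function name rank_17 and
the idea of a linear rank are assumptions made for illustration, not
code from the patch.

```python
# Class numbers as used by the ioprio API.
RT, BE, IDLE = 1, 2, 3


def rank_17(io_class, level=0):
    """Hypothetical linear rank: 0 is most urgent, 16 is least urgent."""
    if io_class == IDLE:
        return 16  # the idle class has a single slot at the bottom
    if io_class not in (RT, BE):
        raise ValueError("unknown I/O priority class")
    if not 0 <= level <= 7:
        raise ValueError("level must be in 0..7")
    base = 0 if io_class == RT else 8  # RT occupies 0..7, BE occupies 8..15
    return base + level


# Every RT level outranks every BE level, and idle comes last.
ordering = sorted(
    [(BE, 0), (RT, 7), (IDLE, 0), (RT, 0), (BE, 7)],
    key=lambda pair: rank_17(*pair),
)
```

With this layout the lowest RT level still sorts ahead of BE 0, which is
the RT > BE > idle ordering the quoted proposal describes.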

> And if the LATENCY camp is still active, add another
> class LATENCY which in context of AS is same as RT. So you get to keep
> RT > BE and they get Latency.

Just drop the whole "latency" idea altogether - it's just
another way of saying "use an rt-like priority mechanism", which
we already have a class for.

> 2. Add 10 levels instead of current 8. top 1 level maps all 8 RT
> levels. next 8 are BE and last 1 maps to idle. This also gives you
> access to BE 0, while all RT levels are higher priority than BE. It
> discourages people from using different RT levels unless we find a new
> meaning for it in context of AS.

That doesn't seem like a very good idea to me - RT is there, ppl are
using it, so not supporting it means that the ppl who really care
about I/O latency will continue to avoid using the AS scheduler...
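
To make the objection concrete, here is a small sketch of the 10-level
layout from the quoted proposal, in which all eight RT levels share one
slot. The function name rank_10 is hypothetical and the code is only an
illustration of the mapping as described in the quote.

```python
# Class numbers as used by the ioprio API.
RT, BE, IDLE = 1, 2, 3


def rank_10(io_class, level=0):
    """Hypothetical rank for the 10-level layout: 0 is most urgent."""
    if not 0 <= level <= 7:
        raise ValueError("level must be in 0..7")
    if io_class == RT:
        return 0  # all eight RT levels collapse into one slot
    if io_class == BE:
        return 1 + level  # BE 0..7 map to ranks 1..8
    if io_class == IDLE:
        return 9
    raise ValueError("unknown I/O priority class")


# Two RT requests that the application deliberately ranked differently
# become indistinguishable to the scheduler.
rt_levels_collapsed = rank_10(RT, 0) == rank_10(RT, 7)
```

RT still sorts ahead of BE 0 here, but any ordering an application
expressed between its own RT requests is lost.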

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
