From: Vivek Goyal <vgoyal@redhat.com>
To: Chad Talbott <ctalbott@google.com>
Cc: jaxboe@fusionio.com, guijianfeng@cn.fujitsu.com,
	mrubin@google.com, teravest@google.com, jmoyer@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Avoid preferential treatment of groups that aren't backlogged
Date: Fri, 11 Feb 2011 13:15:33 -0500
Message-ID: <20110211181533.GG8773@redhat.com>
In-Reply-To: <AANLkTimdOO2qvafZ9f6csjqhh1e5w1iOEybN+1-3G3ep@mail.gmail.com>

On Thu, Feb 10, 2011 at 04:36:25PM -0800, Chad Talbott wrote:
> On Thu, Feb 10, 2011 at 10:57 AM, Chad Talbott <ctalbott@google.com> wrote:
> > On Wed, Feb 9, 2011 at 7:57 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >> If you ran different random readers in different groups of different
> >> weight with group_isolation=1, then there is a case of having service
> >> differentiation. In that case we will idle for 8ms on each group before
> >> we expire the group. So in these test cases are low weight groups not
> >> submitting IO within 8ms? Putting a random reader in separate group
> >> with think time > 8, I think is going to hurt a lot because for every
> >> single IO dispatched group is going to wait for 8ms before it is
> >> expired.
> >
> > You're right about the behavior of group_idle.  We have more
> > experience with earlier kernels (before group_idle).  With this patch
> > we are able to achieve isolation without group_idle even with these
> > large ratios.  (Without group_idle the random reader workloads will
> > get marked seeky, and idling is disabled.  Without group_idle, we have
> > to remember the vdisktime to get isolation.)
> >
> >> Can you run blktrace and verify what's happening?
> >
> > I can run a blktrace, and I think it will show what you expect.
> 
> So, I ran the following two tests and took a blktrace.
> 
> 950 rdrand, 50 rdrand.delay10
> weight 950 random reader with low think time vs weight 50 random
> reader with 10ms think time
> 
> 950 rdrand, 50 rdrand.delay50 # 50ms think time
> weight 950 random reader with low think time vs weight 50 random
> reader with 50ms think time
> 
> I find that we are still idling for these random readers, even the one
> with 50ms think time.  group_idle is 0 according to blktrace.
> 
> With this patch, both of these cases have correct isolation.  Without
> this patch, the small weight reader is able to get more than its
> share.
> 
> I think that idling for a random reader with a 50ms think time is
> likely a bug, but a separate issue.

Thanks for checking this out. I agree that for a low weight random
reader/writer with high think time, we need to remember the vdisktime;
otherwise it will show up as a fresh new candidate and get more than
its share.

Having said that, one can argue that a random reader/writer doing a small
amount of IO should be able to get its job done really fast, and the ones
that hog the disk for a long time should get a higher vdisktime.

So with this scheme, a random reader/writer will have to be of higher
weight to get its job done fast. A low weight reader/writer will still
get a higher vdisktime and a smaller share. I think that is reasonable.
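
To illustrate the difference, here is a rough toy model (not CFQ code) of
a weight 950 group that is always backlogged against a weight 50 group
that thinks between IOs. Everything in it is an assumption made for
illustration: the simulate() helper, one IO per tick, the charge of
1000/weight per IO, and a think time of 3 ticks.

```python
def simulate(remember_vdisktime, ticks=20000, think=3):
    """Toy model: one IO per tick. Returns the light group's share of IOs."""
    weights = {"heavy": 950, "light": 50}
    vdisktime = {"heavy": 0.0, "light": 0.0}
    served = {"heavy": 0, "light": 0}
    light_ready_at = 0     # tick at which the light group next has IO queued
    light_on_tree = False  # is the light group currently backlogged?

    for now in range(ticks):
        # The light group rejoins the service tree after its think time.
        if not light_on_tree and now >= light_ready_at:
            light_on_tree = True
            # The heavy group is always backlogged, so it defines the minimum.
            min_vdisktime = vdisktime["heavy"]
            if remember_vdisktime:
                # Keep the old vdisktime, never falling behind the minimum.
                vdisktime["light"] = max(vdisktime["light"], min_vdisktime)
            else:
                # Forget history: the group shows up as a fresh candidate.
                vdisktime["light"] = min_vdisktime

        # Serve the backlogged group with the smallest vdisktime.
        candidates = ["heavy"] + (["light"] if light_on_tree else [])
        group = min(candidates, key=lambda g: vdisktime[g])
        vdisktime[group] += 1000 / weights[group]  # charge scaled by weight
        served[group] += 1

        if group == "light":
            # The light group leaves the tree and thinks before its next IO.
            light_on_tree = False
            light_ready_at = now + 1 + think

    return served["light"] / ticks


if __name__ == "__main__":
    print("forget vdisktime:   light share = %.3f" % simulate(False))
    print("remember vdisktime: light share = %.3f" % simulate(True))
```

In this model the forgetting variant lets the light group be served almost
as soon as it returns, so its share is set by its think time rather than
its weight. The remembering variant holds it close to the 50/1000 share.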

And yes, if we are idling on a random reader with a 50ms think time even
with group_idle=0, that sounds like a bug.

Thanks
Vivek
