All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vivek Goyal <vgoyal@redhat.com>
To: Chad Talbott <ctalbott@google.com>
Cc: jaxboe@fusionio.com, guijianfeng@cn.fujitsu.com,
	mrubin@google.com, teravest@google.com, jmoyer@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Avoid preferential treatment of groups that aren't backlogged
Date: Wed, 9 Feb 2011 22:57:38 -0500	[thread overview]
Message-ID: <20110210035738.GC27040@redhat.com> (raw)
In-Reply-To: <AANLkTi=dZDK6VosLX8JH4gJyRQ0KXGKMA5vhJ-7D93s7@mail.gmail.com>

On Wed, Feb 09, 2011 at 06:45:25PM -0800, Chad Talbott wrote:
> On Wed, Feb 9, 2011 at 6:09 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > In upstream code once a group gets backlogged we put it at the end
> > and not at the beginning of the tree. (I am wondering are you looking
> > at the google internal code :-))
> >
> > So I don't think that issue of a low weight group getting more disk
> > time than its fair share is present in upstream kernels.
> 
> You've caught me re-using a commit description.  :)
> 
> Here's an example of the kind of tests that fail without this patch
> (run via the test that Justin and Akshay have posted):
> 
> 15:35:35 INFO ----- Running experiment 14: 950 rdrand, 50 rdrand.delay10
> 15:35:55 INFO Experiment completed in 20.4 seconds
> 15:35:55 INFO experiment 14 achieved DTFs: 886, 113
> 15:35:55 INFO experiment 14 FAILED: max observed error is 64, allowed is 50
> 
> 15:35:55 INFO ----- Running experiment 15: 950 rdrand, 50 rdrand.delay50
> 15:36:16 INFO Experiment completed in 20.5 seconds
> 15:36:16 INFO experiment 15 achieved DTFs: 891, 108
> 15:36:16 INFO experiment 15 FAILED: max observed error is 59, allowed is 50
> 
> Since this is Jens' unmodified tree, I've had to change
> BLKIO_WEIGHT_MIN to 10 to allow this test to proceed.  We typically
> run many jobs with small weights, and achieve the requested isolation:
> see below results with this patch:
> 
> 14:59:17 INFO ----- Running experiment 14: 950 rdrand, 50 rdrand.delay10
> 14:59:36 INFO Experiment completed in 19.0 seconds
> 14:59:36 INFO experiment 14 achieved DTFs: 947, 52
> 14:59:36 INFO experiment 14 PASSED: max observed error is 3, allowed is 50
> 
> 14:59:36 INFO ----- Running experiment 15: 950 rdrand, 50 rdrand.delay50
> 14:59:55 INFO Experiment completed in 18.5 seconds
> 14:59:55 INFO experiment 15 achieved DTFs: 944, 55
> 14:59:55 INFO experiment 15 PASSED: max observed error is 6, allowed is 50
> 
> As you can see, it's with seeky workloads that come and go from the
> service tree where this patch is required.

I have not look into or run the tests posted by Justin and Akshay. Can you
give more details about these tests.

Are you running with group_isolation=0 or 1. These tests seem to be random
read and if group_isolation=0 (default), then all the random read queues
should go in root group and there will be no service differentiation.

If you ran different random readers in different groups of differnet
weight with group_isolation=1, then there is a case of having service
differentiation. In that case we will idle for 8ms on each group before
we expire the group. So in these test cases are low weight groups not
submitting IO with-in 8ms? Putting a random reader in separate group
with think time > 8, I think is going to hurt a lot because for every
single IO dispatched group is going to weight for 8ms before it is
expired.

So the only case which comes to my mind where this patch can help is 
when there are lots of groups doing IO with different weights. These
groups have think time greater than 8ms and hence get deleted from
service tree. When next time a low weight group has IO, instead of being
put at the end of service tree, it might be put even farther allowing
a higher weight group to get backlogged ahead of it.

Can you run blktrace and verify what's happenig?

Thanks
Vivek

  reply	other threads:[~2011-02-10  3:57 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-10  1:32 [PATCH] Avoid preferential treatment of groups that aren't backlogged Chad Talbott
2011-02-10  2:09 ` Vivek Goyal
2011-02-10  2:45   ` Chad Talbott
2011-02-10  3:57     ` Vivek Goyal [this message]
2011-02-10 18:57       ` Chad Talbott
2011-02-11  0:36         ` Chad Talbott
2011-02-11 18:15           ` Vivek Goyal
2011-02-18 19:54             ` Vivek Goyal
2011-02-10  4:02 ` Vivek Goyal
2011-02-10 19:06   ` Chad Talbott
2011-02-11 18:30     ` Vivek Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110210035738.GC27040@redhat.com \
    --to=vgoyal@redhat.com \
    --cc=ctalbott@google.com \
    --cc=guijianfeng@cn.fujitsu.com \
    --cc=jaxboe@fusionio.com \
    --cc=jmoyer@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mrubin@google.com \
    --cc=teravest@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.