From: Vivek Goyal <vgoyal@redhat.com>
To: Hirokazu Takahashi <taka@valinux.co.jp>
Cc: ryov@valinux.co.jp, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, containers@lists.linux-foundation.org,
	virtualization@lists.linux-foundation.org,
	xen-devel@lists.xensource.com, fernando@oss.ntt.co.jp,
	balbir@linux.vnet.ibm.com, xemul@openvz.org, agk@sourceware.org,
	righi.andrea@gmail.com, jens.axboe@oracle.com
Subject: Re: dm-ioband + bio-cgroup benchmarks
Date: Fri, 19 Sep 2008 09:10:19 -0400
Message-ID: <20080919131019.GA3606@redhat.com>
In-Reply-To: <20080919.202031.86647893.taka@valinux.co.jp>

On Fri, Sep 19, 2008 at 08:20:31PM +0900, Hirokazu Takahashi wrote:
> Hi,
> 
> > > Hi All,
> > > 
> > > I have got excellent results with dm-ioband, which controls the disk
> > > I/O bandwidth even when it accepts delayed write requests.
> > > 
> > > This time, I ran some benchmarks on high-end storage. The reason
> > > was to avoid a performance bottleneck due to mechanical factors
> > > such as seek time.
> > > 
> > > You can see the details of the benchmarks at:
> > > http://people.valinux.co.jp/~ryov/dm-ioband/hps/
> 
>   (snip)
> 
> > Secondly, why do we have to create an additional dm-ioband device for
> > every device we want to control using rules? This looks a little odd,
> > at least to me. Can't we keep it in line with the rest of the controllers,
> > where task grouping takes place using cgroups and rules are specified in
> > the cgroup itself (the way Andrea Righi does for the io-throttling patches)?
> 
> It isn't essential that dm-band be implemented as a device-mapper target.
> I've also been considering that this algorithm itself could be implemented
> in the block layer directly.
> 
> However, the current implementation has merits. It is flexible.
>   - Dm-ioband can be placed anywhere you like, which may be right before
>     the I/O schedulers or on top of LVM devices.

Hi,

An rb-tree per request queue should also be able to give us this
flexibility. Because the logic is implemented per request queue, rules can
be placed at any layer: either at the bottom-most layer, where requests are
passed to the elevator, or at a higher layer, where requests are passed to
lower-level block devices in the stack. We would just have to modify some
of the higher-level dm/md drivers so that they queue cgroup requests and
release them to the lower layers.


>   - It supports partition-based bandwidth control, which can work without
>     cgroups and is quite easy to use.

>   - It is independent of any I/O schedulers, including ones which will
>     be introduced in the future.

This scheme should also be independent of any of the IO schedulers. We
might have to make small changes to the IO schedulers to decouple them a
bit from __make_request(), so that the rb-tree can be inserted between
__make_request() and the IO scheduler. Otherwise, fundamentally, this
approach should not require any major modifications to the IO schedulers.
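
To make the idea a bit more concrete, here is a toy user-space sketch in
Python (not kernel code) of a per-queue staging area that releases bios to
the next layer in proportion to cgroup weights. Everything in it is an
assumption made for illustration: the names StagingQueue, submit and
dispatch are hypothetical, a binary heap stands in for the rb-tree, and
every bio is charged as one unit regardless of its size.

```python
import heapq
from collections import deque
from fractions import Fraction


class StagingQueue:
    """Toy per-request-queue staging area (hypothetical, for illustration).

    Groups are ordered by a virtual time that advances by 1/weight per
    dispatched bio, so a group with weight 3 is served three times as
    often as a group with weight 1 while both have bios pending.
    """

    def __init__(self, weights):
        self.weights = dict(weights)                  # cgroup -> weight
        self.pending = {g: deque() for g in self.weights}
        self.vtime = {g: Fraction(0) for g in self.weights}
        self.heap = []                                # (vtime, cgroup)

    def submit(self, group, bio):
        """Called where __make_request() would hand the bio over."""
        queue = self.pending[group]
        queue.append(bio)
        if len(queue) == 1:                           # group became busy
            heapq.heappush(self.heap, (self.vtime[group], group))

    def dispatch(self):
        """Release one bio towards the elevator, or None if idle."""
        if not self.heap:
            return None
        _, group = heapq.heappop(self.heap)
        bio = self.pending[group].popleft()
        self.vtime[group] += Fraction(1, self.weights[group])
        if self.pending[group]:                       # still busy: requeue
            heapq.heappush(self.heap, (self.vtime[group], group))
        return group, bio
```

With weights 3 and 1 and both groups kept backlogged, 40 dispatches split
30/10. The sketch ignores what a real implementation would have to handle,
such as a group that returns from idle with a stale virtual time.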

> 
> I also understand it will be hard to set up without some tools
> such as the lvm commands.
> 

That's something I wish to avoid. If we can keep it simple by doing the
grouping with cgroups and allowing one-line rules in the cgroup itself,
that would be nice.

> > To avoid stacking another device (dm-ioband) on top of every
> > device we want to subject to rules, I was thinking of maintaining an
> > rb-tree per request queue. Requests will first go into this rb-tree upon
> > __make_request() and then filter down to the elevator associated with the
> > queue (if there is one). This will give us control over releasing
> > bios to the elevator based on policies (proportional weight, max bandwidth
> > etc.), with no need to stack an additional block device.
> 
> I think it's a bit late to control I/O requests there, since a process
> may be blocked in get_request_wait when the I/O load is high.
> Please imagine the situation where cgroups with low bandwidths are
> consuming most of the "struct request"s while another cgroup with a high
> bandwidth is blocked and can't get enough "struct request"s.
> 
> It means cgroups that issue a lot of I/O requests can win the game.
> 

OK, this is a good point. The number of struct requests is limited, and
they seem to be allocated on a first-come, first-served basis, so a cgroup
that generates a lot of IO might win.

But dm-ioband will face the same issue. Essentially it is also a request
queue, and it will have a limited number of request descriptors. Have you
modified the logic somewhere so that request descriptors are allocated to
the waiting processes based on their weights? If yes, the same logic can
probably be implemented here too.
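
As a rough illustration of what weight-aware descriptor allocation could
look like, here is a toy Python sketch, not kernel code and not a
description of what dm-ioband does. It assumes the simplest possible
policy, a static per-cgroup quota carved out of the descriptor pool in
proportion to weight; RequestPool, get_request and put_request are
hypothetical names.

```python
class RequestPool:
    """Toy pool of request descriptors with per-cgroup quotas.

    Assumption for illustration: each cgroup may hold at most
    nr_requests * weight / total_weight descriptors at once.
    """

    def __init__(self, nr_requests, weights):
        total = sum(weights.values())
        self.quota = {g: nr_requests * w // total
                      for g, w in weights.items()}
        self.in_use = {g: 0 for g in weights}

    def get_request(self, group):
        """Return True if a descriptor is granted.

        False stands for the case where the caller would have to
        sleep, as in get_request_wait().
        """
        if self.in_use[group] >= self.quota[group]:
            return False
        self.in_use[group] += 1
        return True

    def put_request(self, group):
        """Give a descriptor back once the request completes."""
        if self.in_use[group] > 0:
            self.in_use[group] -= 1
```

With 128 descriptors and weights 3:1, the quotas come out as 96 and 32, so
a low-weight cgroup issuing a flood of IO cannot starve the high-weight
one. A static quota like this is not work-conserving: descriptors reserved
for an idle cgroup sit unused, which a real scheme would want to avoid.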

Thanks
Vivek
