From: Andrea Righi <righi.andrea@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>, Ryo Tsuruta <ryov@valinux.co.jp>
Cc: xen-devel@lists.xensource.com,
containers@lists.linux-foundation.org, jens.axboe@oracle.com,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, dm-devel@redhat.com,
agk@sourceware.org, xemul@openvz.org, fernando@oss.ntt.co.jp,
balbir@linux.vnet.ibm.com
Subject: Re: dm-ioband + bio-cgroup benchmarks
Date: Thu, 18 Sep 2008 17:18:50 +0200
Message-ID: <48D2715A.6060002@gmail.com>
In-Reply-To: <20080918150634.GH20640@redhat.com>
Vivek Goyal wrote:
> On Thu, Sep 18, 2008 at 04:37:41PM +0200, Andrea Righi wrote:
>> Vivek Goyal wrote:
>>> On Thu, Sep 18, 2008 at 09:04:18PM +0900, Ryo Tsuruta wrote:
>>>> Hi All,
>>>>
>>>> I have got excellent results of dm-ioband, that controls the disk I/O
>>>> bandwidth even when it accepts delayed write requests.
>>>>
>>>> In this time, I ran some benchmarks with a high-end storage. The
>>>> reason was to avoid a performance bottleneck due to mechanical factors
>>>> such as seek time.
>>>>
>>>> You can see the details of the benchmarks at:
>>>> http://people.valinux.co.jp/~ryov/dm-ioband/hps/
>>>>
>>> Hi Ryo,
>>>
>>> I had a query about dm-ioband patches. IIUC, dm-ioband patches will break
>>> the notion of process priority in CFQ because now dm-ioband device will
>>> hold the bio and issue these to lower layers later based on which bio's
>>> become ready. Hence actual bio submitting context might be different and
>>> because cfq derives the io_context from current task, it will be broken.
>>>
>>> To mitigate that problem, we probably need to implement Fernando's
>>> suggestion of putting io_context pointer in bio.
>>>
>>> Have you already done something to solve this issue?
>>>
>>> Secondly, why do we have to create an additional dm-ioband device for
>>> every device we want to control using rules? This looks a little odd,
>>> at least to me. Can't we keep it in line with rest of the controllers
>>> where task grouping takes place using cgroup and rules are specified in
>>> cgroup itself (The way Andrea Righi does for io-throttling patches)?
>>>
>>> To avoid creation of stacking another device (dm-ioband) on top of every
>>> device we want to subject to rules, I was thinking of maintaining an
>>> rb-tree per request queue. Requests will first go into this rb-tree upon
>>> __make_request() and then will filter down to elevator associated with the
>>> queue (if there is one). This will provide us the control of releasing
>>> bio's to elevator based on policies (proportional weight, max bandwidth
>>> etc) and no need of stacking additional block device.
>>>
>>> I am working on some experimental proof of concept patches. It will take
>>> some time though.
>>>
>>> I was thinking of following.
>>>
>>> - Adopt the Andrea Righi's style of specifying rules for devices and
>>> group the tasks using cgroups.
>>>
>>> - To begin with, adopt dm-ioband's approach of proportional bandwidth
>>> controller. It makes sense to me limit the bandwidth usage only in
>>> case of contention. If there is really a need to limit max bandwidth,
>>> then probably we can do something to implement additional rules or
>>> implement some policy switcher where user can decide what kind of
>>> policies need to be implemented.
>>>
>>> - Get rid of dm-ioband and instead buffer requests on an rb-tree on every
>>> request queue which is controlled by some kind of cgroup rules.
>>>
>>> It would be good to discuss above approach now whether it makes sense or
>>> not. I think it is kind of fusion of io-throttling and dm-ioband patches
>>> with additional idea of doing io-control just above elevator on the request
>>> queue using an rb-tree.
>> Thanks Vivek. All sounds reasonable to me and I think this is the right way
>> to proceed.
>>
>> I'll try to design and implement your rb-tree per request-queue idea into my
>> io-throttle controller, maybe we can reuse it also for a more generic solution.
>> Feel free to send me your experimental proof of concept if you want, even if
>> it's not yet complete, I can review it, test and contribute.
>
> Currently I have taken code from bio-cgroup to implement cgroups and to
> provide functionality to associate a bio to a cgroup. I need this to be
> able to queue the bio's at right node in the rb-tree and then also to be
> able to take a decision when is the right time to release few requests.
>
> Right now in crude implementation, I am working on making system boot.
> Once patches are at least in little bit working shape, I will send it to you
> to have a look.
>
> Thanks
> Vivek

I wonder... wouldn't it be simpler to just use the memory controller
to retrieve this information starting from struct page?

I mean, following this path (in short, obviously using the appropriate
interfaces for locking and referencing the different objects):

    cgrp = page->page_cgroup->mem_cgroup->css.cgroup

Once you get the cgrp it's very easy to use the corresponding controller
structure.
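
To make the lookup path concrete, here is a minimal userspace sketch of it. The struct definitions are simplified stand-ins that only mirror the member names in the path above; they are assumptions for illustration, not the real kernel layouts. The helper name page_to_cgroup() is hypothetical, and the sketch deliberately leaves out the locking and reference counting that real code would need.

```c
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel structures named in the path above.
 * Only the members needed for the walk are modeled (assumption: the real
 * structures carry many more fields and need locking to access safely).
 */
struct cgroup { const char *name; };
struct cgroup_subsys_state { struct cgroup *cgroup; };
struct mem_cgroup { struct cgroup_subsys_state css; };
struct page_cgroup { struct mem_cgroup *mem_cgroup; };
struct page { struct page_cgroup *page_cgroup; };

/*
 * Hypothetical helper: walk page -> page_cgroup -> mem_cgroup -> css.cgroup.
 * Returns NULL when any link in the chain is missing, for example when the
 * page is not charged to a memory cgroup.
 */
struct cgroup *page_to_cgroup(const struct page *page)
{
    if (!page || !page->page_cgroup || !page->page_cgroup->mem_cgroup)
        return NULL;
    return page->page_cgroup->mem_cgroup->css.cgroup;
}
```

A caller holding a bio would apply such a helper to one of the bio's pages and then map the resulting cgroup to its own controller state. A page with no page_cgroup yields NULL here, so the caller has to decide how to account for that I/O.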

Actually, this is what I'm doing in cgroup-io-throttle to associate a bio
with a cgroup. What other functionality or advantages does bio-cgroup
provide in addition to that?

Thanks,
-Andrea