From: Andrea Righi <righi.andrea@gmail.com>
To: Hirokazu Takahashi <taka@valinux.co.jp>
Cc: baramsori72@gmail.com, balbir@linux.vnet.ibm.com,
xen-devel@lists.xensource.com,
Satoshi UCHIDA <s-uchida@ap.jp.nec.com>,
containers@lists.linux-foundation.org,
linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, dm-devel@redhat.com,
agk@sourceware.org, dave@linux.vnet.ibm.com, ngupta@google.com
Subject: Re: RFC: I/O bandwidth controller
Date: Tue, 12 Aug 2008 15:07:43 +0200 (MEST) [thread overview]
Message-ID: <48A18B1F.6080000@gmail.com> (raw)
In-Reply-To: <48A18854.9020000@gmail.com>
Andrea Righi wrote:
> Hirokazu Takahashi wrote:
>>>>>>> 3. & 4. & 5. - I/O bandwidth shaping & General design aspects
>>>>>>>
>>>>>>> The implementation of an I/O scheduling algorithm is to a certain extent
>>>>>>> influenced by what we are trying to achieve in terms of I/O bandwidth
>>>>>>> shaping, but, as discussed below, the required accuracy can determine
>>>>>>> the layer where the I/O controller has to reside. Off the top of my
>>>>>>> head, there are three basic operations we may want to perform:
>>>>>>> - I/O nice prioritization: ionice-like approach.
>>>>>>> - Proportional bandwidth scheduling: each process/group of processes
>>>>>>> has a weight that determines the share of bandwidth they receive.
>>>>>>> - I/O limiting: set an upper limit to the bandwidth a group of tasks
>>>>>>> can use.
>>>>>> Using a deadline-based IO scheduler could be an interesting path to
>>>>>> explore as well, IMHO, to try to guarantee per-cgroup minimum bandwidth
>>>>>> requirements.
>>>>> Please note that the only thing we can do is to guarantee minimum
>>>>> bandwidth requirement when there is contention for an IO resource, which
>>>>> is precisely what a proportional bandwidth scheduler does. Am I missing
>>>>> something?
>>>> Correct. Proportional bandwidth scheduling automatically guarantees
>>>> minimum requirements (unlike the IO limiting approach, which needs
>>>> additional mechanisms to achieve this).
>>>>
>>>> In any case there's no guarantee that a cgroup/application can sustain
>>>> e.g. 10MB/s on a certain device, but this is a hard problem anyway, and
>>>> the best we can do is to try to satisfy "soft" constraints.
>>> I think guaranteeing the minimum I/O bandwidth is very important. In
>>> business sites, especially in streaming service systems, administrators
>>> require this functionality to satisfy the QoS or performance goals of
>>> their services.
>>> Of course, IO throttling is important, but, personally, I think
>>> guaranteeing the minimum bandwidth is more important than limiting the
>>> maximum bandwidth to satisfy the requirements of real business sites.
>>> And I know Andrea's io-throttle patch supports the latter case well and
>>> it is very stable.
>>> But the first case (guaranteeing the minimum bandwidth) is not supported
>>> in any of the patches.
>>> Are there any plans to support it? And are there any problems in
>>> implementing it?
>>> I think if an IO controller can support guaranteeing the minimum
>>> bandwidth and a work-conserving mode simultaneously, it will more easily
>>> satisfy the requirements of business sites.
>>> Additionally, I didn't understand "Proportional bandwidth automatically
>>> allows to guarantee min requirements" and "soft constraints".
>>> Can you give me some advice about this?
>>> Thanks in advance.
>>>
>>> Dong-Jae Kang
>> I think this is what dm-ioband does.
>>
>> Let's say you make two groups share the same disk, and give them
>> 70% and 30%, respectively, of the bandwidth the disk physically has.
>> This means the former group is almost guaranteed to be able to use
>> 70% of the bandwidth even when the latter one is issuing quite
>> a lot of I/O requests.
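To make the 70%/30% sharing concrete, here is a user-space toy model of a stride-style weighted dispatcher (the group names and weights are illustrative, not taken from dm-ioband itself):

```python
# Toy model: weighted (stride-style) dispatch between two groups.
# Each group advances a virtual time by 1/weight per request served,
# and the group with the smallest virtual time is served next.

def dispatch(weights, total_requests):
    """Serve total_requests requests among groups in proportion to weights."""
    vtime = {g: 0.0 for g in weights}
    served = {g: 0 for g in weights}
    for _ in range(total_requests):
        g = min(vtime, key=vtime.get)   # group lagging most in virtual time
        served[g] += 1
        vtime[g] += 1.0 / weights[g]
    return served

# Roughly 70%/30% of the 1000 requests go to the respective groups.
print(dispatch({"grp_a": 70, "grp_b": 30}, 1000))
```

This only models who gets to dispatch, of course; it says nothing about seek costs, which is exactly the overhead mentioned below.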
>>
>> Yes, I know there are head-seek lags with traditional magnetic disks,
>> so it's important to improve the algorithm to reduce this overhead.
>>
>> And I think it is also possible to add a new scheduling policy to
>> guarantee the minimum bandwidth. It might be cool if some groups could
>> use guaranteed bandwidths while the others share the rest under a
>> proportional bandwidth policy.
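A rough sketch of such a mixed policy (all numbers and group names are invented): reserve a fixed rate for the guaranteed groups, then split whatever capacity remains among the other groups in proportion to their weights.

```python
# Toy model: "g_res" gets a reserved 20 MB/s; the remaining capacity is
# split among the other groups in proportion to their weights.

def allocate(total_mb_s, reserved, weights):
    """Return per-group bandwidth: reserved rates plus a proportional split."""
    remaining = total_mb_s - sum(reserved.values())
    wsum = sum(weights.values())
    alloc = dict(reserved)
    for g, w in weights.items():
        alloc[g] = remaining * w / wsum
    return alloc

# On a 100 MB/s device: g_res keeps its 20, g_a and g_b split the other 80.
print(allocate(100.0, {"g_res": 20.0}, {"g_a": 70, "g_b": 30}))
```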
>>
>> Thanks,
>> Hirokazu Takahashi.
>
> With the IO limiting approach, minimum requirements are supposed to be
> guaranteed if the user configures a generic block device so that the sum
> of the limits doesn't exceed the total IO bandwidth of that device. But,
> in principle, there's nothing in "throttling" that guarantees "fairness"
> among different cgroups doing IO on the same block device, which means
> there's nothing to guarantee minimum requirements (and this is the
> reason why I liked Satoshi's CFQ-cgroup approach together with
> io-throttle).
>
> A more complicated issue is how to evaluate the total IO bandwidth of a
> generic device. We can use some kind of averaging/prediction, but
> basically it would be inaccurate due to the mechanics of disks (head
> seeks, but also caching, buffering mechanisms implemented directly in
> the device, etc.). It's a hard problem. The same problem also exists
> for proportional bandwidth, in terms of IO rate predictability.
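Just to illustrate one simple form of such averaging (nothing like this is in any of the patches): an exponentially weighted moving average over observed throughput samples, which smooths out the noise caused by seeks and caching.

```python
# Toy model: estimate a device's "current" bandwidth with an exponentially
# weighted moving average of per-interval throughput samples.
# The smoothing factor and the sample values are made up for illustration.

def ewma_bandwidth(samples_mb_s, alpha=0.2):
    """Blend each new sample into the running estimate with weight alpha."""
    estimate = samples_mb_s[0]
    for s in samples_mb_s[1:]:
        estimate = alpha * s + (1 - alpha) * estimate
    return estimate

# Seeky vs. sequential intervals make the raw samples swing wildly;
# the EWMA settles somewhere in between.
print(round(ewma_bandwidth([60.0, 20.0, 55.0, 25.0, 58.0]), 2))
```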
BTW as I said in a previous email, an interesting path to explore, IMHO,
could be to think in terms of IO time. So, look at the time an IO
request is issued to the drive, look at the time the request is served,
evaluate the difference and charge the consumed IO time to the
appropriate cgroup. Then dispatch IO requests as a function of the
consumed IO time debts/credits, using for example a token-bucket
strategy. And probably the best place to implement the IO time
accounting is the elevator.
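A user-space toy model of this token-bucket idea (all names, rates and service times are hypothetical; the real accounting would live in the kernel elevator):

```python
# Toy model of the idea above: charge each group the IO *time* its requests
# consume, and gate dispatch with a token bucket refilled in IO-time units.

class IOTimeBucket:
    def __init__(self, rate, burst):
        self.rate = rate        # IO-time credit earned per unit of wall time
        self.burst = burst      # maximum credit that can be accumulated
        self.credit = burst
        self.last = 0.0

    def refill(self, now):
        self.credit = min(self.burst,
                          self.credit + (now - self.last) * self.rate)
        self.last = now

    def try_dispatch(self, now, service_time):
        """Dispatch only if enough IO-time credit is available; charge it."""
        self.refill(now)
        if self.credit >= service_time:
            self.credit -= service_time
            return True
        return False

bucket = IOTimeBucket(rate=0.5, burst=2.0)   # earn 0.5s of IO time per second
print(bucket.try_dispatch(now=0.0, service_time=1.5))  # True: burst credit
print(bucket.try_dispatch(now=0.0, service_time=1.0))  # False: only 0.5 left
print(bucket.try_dispatch(now=2.0, service_time=1.0))  # True: refilled
```

The nice property is that "expensive" requests (long seeks, cache misses) automatically consume more credit than cheap ones, without needing to predict the device's total bandwidth.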
-Andrea
>
> The only difference is that with proportional bandwidth you know that
> (taking the same example reported by Hirokazu), with e.g. 10 similar IO
> requests, 7 will be dispatched to the first cgroup and 3 to the other
> cgroup. So, you don't need anything to guarantee "fairness", but also in
> this case it's hard to evaluate the cost of the 7 IO requests with
> respect to the cost of the other 3 IO requests as seen by user
> applications, which is the cost the users care about.
>
> -Andrea
Thread overview: 79+ messages
2008-08-04 8:51 [PATCH 0/7] I/O bandwidth controller and BIO tracking Ryo Tsuruta
2008-08-04 8:52 ` [PATCH 1/7] dm-ioband: Patch of device-mapper driver Ryo Tsuruta
2008-08-04 8:52 ` [PATCH 2/7] dm-ioband: Documentation of design overview, installation, command reference and examples Ryo Tsuruta
2008-08-04 8:57 ` [PATCH 3/7] bio-cgroup: Introduction Ryo Tsuruta
2008-08-04 8:57 ` [PATCH 4/7] bio-cgroup: Split the cgroup memory subsystem into two parts Ryo Tsuruta
2008-08-04 8:59 ` [PATCH 5/7] bio-cgroup: Remove a lot of ifdefs Ryo Tsuruta
2008-08-04 9:00 ` [PATCH 6/7] bio-cgroup: Implement the bio-cgroup Ryo Tsuruta
2008-08-04 9:01 ` [PATCH 7/7] bio-cgroup: Add a cgroup support to dm-ioband Ryo Tsuruta
2008-08-08 7:10 ` [PATCH 6/7] bio-cgroup: Implement the bio-cgroup Takuya Yoshikawa
2008-08-08 8:30 ` Ryo Tsuruta
2008-08-08 9:42 ` Takuya Yoshikawa
2008-08-08 11:41 ` Ryo Tsuruta
2008-08-05 10:25 ` [PATCH 4/7] bio-cgroup: Split the cgroup memory subsystem into two parts Andrea Righi
2008-08-05 10:35 ` Hirokazu Takahashi
2008-08-06 7:54 ` KAMEZAWA Hiroyuki
2008-08-06 11:43 ` Hirokazu Takahashi
2008-08-06 13:45 ` kamezawa.hiroyu
2008-08-07 7:25 ` Hirokazu Takahashi
2008-08-07 8:21 ` KAMEZAWA Hiroyuki
2008-08-07 8:45 ` Hirokazu Takahashi
2008-08-04 17:20 ` Too many I/O controller patches Dave Hansen
2008-08-04 18:22 ` Andrea Righi
2008-08-04 19:02 ` Dave Hansen
2008-08-04 20:44 ` Andrea Righi
2008-08-04 20:50 ` Dave Hansen
2008-08-05 6:28 ` Hirokazu Takahashi
2008-08-05 5:55 ` Paul Menage
2008-08-05 6:03 ` Balbir Singh
2008-08-05 9:27 ` Andrea Righi
2008-08-05 16:25 ` Dave Hansen
2008-08-05 6:16 ` Hirokazu Takahashi
2008-08-05 9:31 ` Andrea Righi
2008-08-05 10:01 ` Hirokazu Takahashi
2008-08-05 2:50 ` Satoshi UCHIDA
2008-08-05 9:28 ` Andrea Righi
2008-08-05 13:17 ` Ryo Tsuruta
2008-08-05 16:20 ` Dave Hansen
2008-08-06 2:44 ` KAMEZAWA Hiroyuki
2008-08-06 3:30 ` Balbir Singh
2008-08-06 6:48 ` Hirokazu Takahashi
2008-08-05 12:01 ` Hirokazu Takahashi
2008-08-04 18:34 ` Balbir Singh
2008-08-04 20:42 ` Andrea Righi
2008-08-06 1:13 ` RFC: I/O bandwidth controller (was Re: Too many I/O controller patches) Fernando Luis Vázquez Cao
2008-08-06 6:18 ` RFC: I/O bandwidth controller Ryo Tsuruta
2008-08-06 6:41 ` Fernando Luis Vázquez Cao
2008-08-06 15:48 ` Dave Hansen
2008-08-07 4:38 ` Fernando Luis Vázquez Cao
2008-08-06 16:42 ` RFC: I/O bandwidth controller (was Re: Too many I/O controller patches) Balbir Singh
2008-08-06 18:00 ` Dave Hansen
2008-08-07 2:44 ` Fernando Luis Vázquez Cao
2008-08-07 3:01 ` Fernando Luis Vázquez Cao
2008-08-08 11:39 ` RFC: I/O bandwidth controller Hirokazu Takahashi
2008-08-12 5:35 ` Fernando Luis Vázquez Cao
2008-08-06 19:37 ` RFC: I/O bandwidth controller (was Re: Too many I/O controller patches) Naveen Gupta
2008-08-07 8:30 ` RFC: I/O bandwidth controller Hirokazu Takahashi
2008-08-07 13:17 ` RFC: I/O bandwidth controller (was Re: Too many I/O controller patches) Fernando Luis Vázquez Cao
2008-08-11 18:18 ` Naveen Gupta
2008-08-11 16:35 ` David Collier-Brown
2008-08-07 7:46 ` Andrea Righi
2008-08-07 13:59 ` Fernando Luis Vázquez Cao
2008-08-11 20:52 ` Andrea Righi
[not found] ` <loom.20080812T071504-212@post.gmane.org>
2008-08-12 11:10 ` RFC: I/O bandwidth controller Hirokazu Takahashi
2008-08-12 12:55 ` Andrea Righi
2008-08-12 13:07 ` Andrea Righi [this message]
2008-08-12 13:54 ` Fernando Luis Vázquez Cao
2008-08-12 15:03 ` James.Smart
2008-08-12 21:00 ` Andrea Righi
2008-08-12 20:44 ` Andrea Righi
2008-08-13 7:47 ` Dong-Jae Kang
2008-08-13 17:56 ` Andrea Righi
2008-08-14 11:18 ` David Collier-Brown
2008-08-12 13:15 ` Fernando Luis Vázquez Cao
2008-08-13 6:23 ` 강동재
2008-08-08 6:21 ` Hirokazu Takahashi
2008-08-08 7:20 ` Ryo Tsuruta
2008-08-08 8:10 ` Fernando Luis Vázquez Cao
2008-08-08 10:05 ` Ryo Tsuruta
2008-08-08 14:31 ` Hirokazu Takahashi