From: Andrea Righi <righi.andrea@gmail.com>
To: "Fernando Luis Vázquez Cao" <fernando@oss.ntt.co.jp>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>,
Ryo Tsuruta <ryov@valinux.co.jp>,
yoshikawa.takuya@oss.ntt.co.jp, taka@valinux.co.jp,
uchida@ap.jp.nec.com, ngupta@google.com,
linux-kernel@vger.kernel.org, dm-devel@redhat.com,
containers@lists.linux-foundation.org,
virtualization@lists.linux-foundation.org,
xen-devel@lists.xensource.com, agk@sourceware.org
Subject: Re: RFC: I/O bandwidth controller (was Re: Too many I/O controller patches)
Date: Mon, 11 Aug 2008 22:52:25 +0200
Message-ID: <48A0A689.40908@gmail.com>
In-Reply-To: <1218117578.11703.81.camel@sebastian.kern.oss.ntt.co.jp>
Fernando Luis Vázquez Cao wrote:
>>> This seems to be the easiest part, but the current cgroups
>>> infrastructure has some limitations when it comes to dealing with block
>>> devices: impossibility of creating/removing certain control structures
>>> dynamically and hardcoding of subsystems (i.e. resource controllers).
>>> This makes it difficult to handle block devices that can be hotplugged
>>> and go away at any time (this applies not only to usb storage but also
>>> to some SATA and SCSI devices). To cope with this situation properly we
>>> would need hotplug support in cgroups, but, as suggested before and
>>> discussed in the past (see (0) below), there are some limitations.
>>>
>>> Even in the non-hotplug case it would be nice if we could treat each
>>> block I/O device as an independent resource, which means we could do
>>> things like allocating I/O bandwidth on a per-device basis. As long as
>>> performance is not compromised too much, adding some kind of basic
>>> hotplug support to cgroups is probably worth it.
>>>
>>> (0) http://lkml.org/lkml/2008/5/21/12
>> What about using major,minor numbers to identify each device and account
>> IO statistics? If a device is unplugged we could reset IO statistics
>> and/or remove IO limitations for that device from userspace (i.e. by a
>> daemon), but plugging/unplugging the device would not be blocked/affected
>> in any case. Or am I oversimplifying the problem?
> If a resource we want to control (a block device in this case) is
> hot-plugged/unplugged the corresponding cgroup-related structures inside
> the kernel need to be allocated/freed dynamically, respectively. The
> problem is that this is not always possible. For example, with the
> current implementation of cgroups it is not possible to treat each block
> device as a different cgroup subsystem/resource controller, because
> subsystems are created at compile time.
The subsystem as a whole is created at compile time, but the controller
data structures are allocated dynamically (e.g. see struct mem_cgroup
for the memory controller). So identifying each device by a name, or by
a key such as major,minor, instead of a reference/pointer to a struct
could help to handle this in userspace. I mean, if a device is
unplugged, a userspace daemon can just handle the event and delete the
controller data structures allocated for that device asynchronously,
via a userspace->kernel interface, without the kernel holding a
reference to that particular block device. Anyway, implementing a
generic interface that allows defining hooks for hot-pluggable devices
(or similar events) in cgroups would be interesting.
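To illustrate the idea, here is a minimal userspace-style sketch in
Python (not kernel code). The class DeviceLimitTable, the helper
dev_key() and the cgroup names are hypothetical and only show the
bookkeeping: rules are stored under a (major, minor) key, so nothing
holds a reference to the device itself, and the rules for an unplugged
device can be dropped later by whoever handles the unplug event.

```python
import os


class DeviceLimitTable:
    """Per-cgroup I/O limits keyed by (major, minor), not by a device object."""

    def __init__(self):
        # (cgroup name, (major, minor)) -> limit in bytes per second
        self._limits = {}

    def set_limit(self, cgroup, dev, bytes_per_sec):
        self._limits[(cgroup, dev)] = bytes_per_sec

    def get_limit(self, cgroup, dev):
        # None means "no limit configured for this cgroup on this device"
        return self._limits.get((cgroup, dev))

    def forget_device(self, dev):
        """Drop every rule for an unplugged device; return how many were removed."""
        stale = [key for key in self._limits if key[1] == dev]
        for key in stale:
            del self._limits[key]
        return len(stale)


def dev_key(dev_number):
    """Split a dev_t-style device number into a (major, minor) key."""
    return (os.major(dev_number), os.minor(dev_number))


# Example values only: 8,0 and 8,16 are used as sample device numbers.
table = DeviceLimitTable()
sda = dev_key(os.makedev(8, 0))
sdb = dev_key(os.makedev(8, 16))
table.set_limit("grp_a", sda, 10 * 1024 * 1024)
table.set_limit("grp_b", sda, 5 * 1024 * 1024)
table.set_limit("grp_a", sdb, 1 * 1024 * 1024)

# Called asynchronously by the (hypothetical) daemon after an unplug event.
removed = table.forget_device(sda)
```

Because the key is plain data, the cleanup can happen at any time after
the device is gone; rules for other devices are left untouched.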
>>> 3. & 4. & 5. - I/O bandwidth shaping & General design aspects
>>>
>>> The implementation of an I/O scheduling algorithm is to a certain extent
>>> influenced by what we are trying to achieve in terms of I/O bandwidth
>>> shaping, but, as discussed below, the required accuracy can determine
>>> the layer where the I/O controller has to reside. Off the top of my
>>> head, there are three basic operations we may want to perform:
>>> - I/O nice prioritization: ionice-like approach.
>>> - Proportional bandwidth scheduling: each process/group of processes
>>> has a weight that determines the share of bandwidth they receive.
>>> - I/O limiting: set an upper limit to the bandwidth a group of tasks
>>> can use.
>> Using deadline-based IO scheduling could be an interesting path to be
>> explored as well, IMHO, to try to guarantee per-cgroup minimum bandwidth
>> requirements.
> Please note that the only thing we can do is to guarantee minimum
> bandwidth requirement when there is contention for an IO resource, which
> is precisely what a proportional bandwidth scheduler does. Am I missing
> something?
Correct. Proportional bandwidth scheduling automatically guarantees
minimum requirements (unlike the IO limiting approach, which needs
additional mechanisms to achieve this).
In any case there's no guarantee that a cgroup/application can sustain,
e.g., 10MB/s on a certain device, but this is a hard problem anyway, and
the best we can do is to try to satisfy "soft" constraints.
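A small numeric sketch of why proportional sharing implies a minimum
under contention (Python, for illustration only). The function names,
the weights and the 100 MB/s capacity are assumptions made up for the
example. In practice the achievable throughput of a device is not a
fixed number, which is why the resulting minimum is only a "soft"
constraint.

```python
def proportional_shares(weights, active, capacity):
    """Split capacity among the active groups in proportion to their weights."""
    if not active:
        return {}
    total = sum(weights[g] for g in active)
    return {g: capacity * weights[g] / total for g in active}


def guaranteed_minimum(weights, group, capacity):
    """Share a group gets in the worst case, when every group is contending."""
    return capacity * weights[group] / sum(weights.values())


# Assumed example values, not measurements.
weights = {"a": 50, "b": 30, "c": 20}
capacity = 100.0  # MB/s, assumed constant here for simplicity

all_busy = proportional_shares(weights, ["a", "b", "c"], capacity)
only_ab = proportional_shares(weights, ["a", "b"], capacity)
min_c = guaranteed_minimum(weights, "c", capacity)
```

With all three groups busy, each gets exactly its weight fraction, which
is its minimum. When "c" is idle, its share is redistributed to "a" and
"b", so no bandwidth is wasted. A pure upper limit gives neither
property by itself.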
-Andrea