public inbox for linux-kernel@vger.kernel.org
From: Andrea Righi <righi.andrea@gmail.com>
To: "Fernando Luis Vázquez Cao" <fernando@oss.ntt.co.jp>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>,
	Ryo Tsuruta <ryov@valinux.co.jp>,
	yoshikawa.takuya@oss.ntt.co.jp, taka@valinux.co.jp,
	uchida@ap.jp.nec.com, ngupta@google.com,
	linux-kernel@vger.kernel.org, dm-devel@redhat.com,
	containers@lists.linux-foundation.org,
	virtualization@lists.linux-foundation.org,
	xen-devel@lists.xensource.com, agk@sourceware.org
Subject: Re: RFC: I/O bandwidth controller (was Re: Too many I/O controller patches)
Date: Mon, 11 Aug 2008 22:52:25 +0200
Message-ID: <48A0A689.40908@gmail.com>
In-Reply-To: <1218117578.11703.81.camel@sebastian.kern.oss.ntt.co.jp>

Fernando Luis Vázquez Cao wrote:
>>> This seems to be the easiest part, but the current cgroups
>>> infrastructure has some limitations when it comes to dealing with block
>>> devices: impossibility of creating/removing certain control structures
>>> dynamically and hardcoding of subsystems (i.e. resource controllers).
>>> This makes it difficult to handle block devices that can be hotplugged
>>> and go away at any time (this applies not only to usb storage but also
>>> to some SATA and SCSI devices). To cope with this situation properly we
>>> would need hotplug support in cgroups, but, as suggested before and
>>> discussed in the past (see (0) below), there are some limitations.
>>>
>>> Even in the non-hotplug case it would be nice if we could treat each
>>> block I/O device as an independent resource, which means we could do
>>> things like allocating I/O bandwidth on a per-device basis. As long as
>>> performance is not compromised too much, adding some kind of basic
>>> hotplug support to cgroups is probably worth it.
>>>
>>> (0) http://lkml.org/lkml/2008/5/21/12
>> What about using major,minor numbers to identify each device and account
>> IO statistics? If a device is unplugged we could reset IO statistics
>> and/or remove IO limitations for that device from userspace (e.g. via a
>> daemon), but plugging/unplugging the device would not be blocked/affected
>> in any case. Or am I oversimplifying the problem?
> If a resource we want to control (a block device in this case) is
> hot-plugged/unplugged the corresponding cgroup-related structures inside
> the kernel need to be allocated/freed dynamically, respectively. The
> problem is that this is not always possible. For example, with the
> current implementation of cgroups it is not possible to treat each block
> device as a different cgroup subsystem/resource controller, because
> subsystems are created at compile time.

The subsystem as a whole is created at compile time, but the controller
data structures are allocated dynamically (see, for example, struct
mem_cgroup for the memory controller). So identifying each device by a
name, or by a key such as major,minor, instead of by a reference/pointer
to a struct could help to handle this from userspace. I mean, if a device
is unplugged, a userspace daemon can simply handle the event and delete
the controller data structures allocated for that device, asynchronously,
via a userspace->kernel interface, and without the kernel holding a
reference to that particular block device. Anyway, implementing a generic
interface that allows hooks to be defined for hot-pluggable devices (or
similar events) in cgroups would be interesting.
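
To make the idea concrete, here is a minimal sketch in Python (purely
illustrative, not kernel code). It assumes the per-device rules live in a
table keyed by (major, minor) rather than by a pointer to the device, so
a rule can outlive the device and be dropped later. The class and method
names are hypothetical and do not correspond to any existing cgroup
interface.

```python
class IoLimitTable:
    """Per-device I/O rules keyed by (major, minor), not by a device reference."""

    def __init__(self):
        # (major, minor) -> bandwidth limit in bytes per second
        self._rules = {}

    def set_limit(self, major, minor, bytes_per_sec):
        # Creating a rule does not require the device to be present.
        self._rules[(major, minor)] = bytes_per_sec

    def get_limit(self, major, minor):
        # None means "no rule configured for this device".
        return self._rules.get((major, minor))

    def on_device_removed(self, major, minor):
        # Hypothetical hook: a userspace daemon would call this some time
        # after it sees the unplug event. The unplug itself is never blocked,
        # because the table holds no reference to the device.
        return self._rules.pop((major, minor), None) is not None


table = IoLimitTable()
table.set_limit(8, 16, 10 * 1024 * 1024)   # e.g. limit device 8:16 to 10MB/s
removed = table.on_device_removed(8, 16)   # device unplugged, rule dropped
```

Because the key is just a pair of numbers, a stale rule for an unplugged
device is harmless until the daemon gets around to removing it.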

>>> 3. & 4. & 5. - I/O bandwidth shaping & General design aspects
>>>
>>> The implementation of an I/O scheduling algorithm is to a certain extent
>>> influenced by what we are trying to achieve in terms of I/O bandwidth
>>> shaping, but, as discussed below, the required accuracy can determine
>>> the layer where the I/O controller has to reside. Off the top of my
>>> head, there are three basic operations we may want to perform:
>>>   - I/O nice prioritization: ionice-like approach.
>>>   - Proportional bandwidth scheduling: each process/group of processes
>>> has a weight that determines the share of bandwidth they receive.
>>>   - I/O limiting: set an upper limit to the bandwidth a group of tasks
>>> can use.
>> Using deadline-based IO scheduling could be an interesting path to
>> explore as well, IMHO, to try to guarantee per-cgroup minimum bandwidth
>> requirements.
> Please note that the only thing we can do is to guarantee minimum
> bandwidth requirement when there is contention for an IO resource, which
> is precisely what a proportional bandwidth scheduler does. Am I missing
> something?

Correct. Proportional bandwidth automatically guarantees minimum
requirements (unlike the IO limiting approach, which needs additional
mechanisms to achieve this).
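
A small worked sketch of why this holds, in Python and under simplifying
assumptions: the device has a fixed, known capacity, each group has a
weight and a demand, and bandwidth a group does not use is handed to the
others. The function name and parameters are hypothetical; this is not
how any existing scheduler is implemented.

```python
def proportional_shares(capacity, weights, demands):
    """Split capacity by weight; unused share goes to groups that still want more."""
    alloc = {g: 0.0 for g in weights}
    active = {g for g in weights if demands[g] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[g] for g in active)
        satisfied = set()
        used = 0.0
        for g in active:
            fair = remaining * weights[g] / total_w
            need = demands[g] - alloc[g]
            if need <= fair:
                # This group wants no more than its share: give it what it needs.
                alloc[g] += need
                used += need
                satisfied.add(g)
        if not satisfied:
            # Full contention: every remaining group gets exactly its weighted share.
            for g in active:
                alloc[g] += remaining * weights[g] / total_w
            break
        remaining -= used
        active -= satisfied
    return alloc


# Both groups saturate the device: weights 1:3 give shares of 25 and 75.
contended = proportional_shares(100, {"a": 1, "b": 3}, {"a": 1000, "b": 1000})
# Group b only needs 30, so a receives the rest, more than its minimum of 25.
relaxed = proportional_shares(100, {"a": 1, "b": 3}, {"a": 1000, "b": 30})
```

Under contention a group never gets less than weight/total_weight of the
capacity, which is the minimum guarantee; without contention it can get
more.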

In any case there's no guarantee that a cgroup/application can sustain,
e.g., 10MB/s on a certain device, but this is a hard problem anyway, and
the best we can do is to try to satisfy "soft" constraints.

-Andrea

