From: Andrea Righi <righi.andrea@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
nauman@google.com, dpshah@google.com, lizf@cn.fujitsu.com,
mikew@google.com, fchecconi@gmail.com, paolo.valente@unimore.it,
jens.axboe@oracle.com, ryov@valinux.co.jp,
fernando@intellilink.co.jp, s-uchida@ap.jp.nec.com,
taka@valinux.co.jp, guijianfeng@cn.fujitsu.com,
arozansk@redhat.com, jmoyer@redhat.com, oz-kernel@redhat.com,
dhaval@linux.vnet.ibm.com, balbir@linux.vnet.ibm.com,
linux-kernel@vger.kernel.org,
containers@lists.linux-foundation.org, menage@google.com,
peterz@infradead.org
Subject: Re: [PATCH 01/10] Documentation
Date: Sun, 05 Apr 2009 17:15:35 +0200
Message-ID: <49D8CB17.7040501@gmail.com>
In-Reply-To: <20090312180126.GI10919@redhat.com>
On 2009-03-12 19:01, Vivek Goyal wrote:
> On Thu, Mar 12, 2009 at 12:11:46AM -0700, Andrew Morton wrote:
>> On Wed, 11 Mar 2009 21:56:46 -0400 Vivek Goyal <vgoyal@redhat.com> wrote:
[snip]
>> Also.. there are so many IO controller implementations that I've lost
>> track of who is doing what. I do have one private report here that
>> Andreas's controller "is incredibly productive for us and has allowed
>> us to put twice as many users per server with faster times for all
>> users". Which is pretty stunning, although it should be viewed as a
>> condemnation of the current code, I'm afraid.
>>
>
> I had looked briefly at Andrea's implementation in the past. I will look
> again. I had thought that this approach did not get much traction.
Hi Vivek, sorry for my late reply. I periodically upload the latest
versions of io-throttle here if you're still interested:
http://download.systemimager.org/~arighi/linux/patches/io-throttle/
There are no substantial changes with respect to the latest version I
posted to LKML, just rebases onto recent kernels.
>
> Some quick thoughts about this approach though.
>
> - It is not a proportional weight controller. It is more of limiting
> bandwidth in absolute numbers for each cgroup on each disk.
>
> So each cgroup will define a rule for each disk in the system mentioning
> at what maximum rate that cgroup can issue IO to that disk and throttle
> the IO from that cgroup if rate has exceeded.
Correct. Also, proportional weight control has been on the TODO list
since the early versions, but I never dedicated much effort to
implementing it. I can focus on this and try to write something if we
all think it is worth doing.
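
To make the "absolute limit per cgroup per disk" idea concrete, here is a
minimal userspace sketch of a token bucket, one common way to enforce a
maximum rate. It is only an illustration under stated assumptions, not the
io-throttle code: the class name, the method name and the (cgroup, device)
keys are all hypothetical.

```python
import time


class TokenBucket:
    """Limit throughput to `rate` bytes/s, allowing bursts of up to `burst` bytes.

    Hypothetical helper for illustration only; not taken from io-throttle.
    """

    def __init__(self, rate, burst, now=None):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def delay_for(self, nbytes, now=None):
        """Charge an IO of `nbytes` and return how long the issuer should sleep."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill according to the elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # The bucket is in debt: sleep until the debt would be refilled.
        return -self.tokens / self.rate


# One bucket per (cgroup, device) rule; the names are made up.
limits = {
    ("grp_a", "/dev/sda"): TokenBucket(rate=10 * 1024 * 1024, burst=1024 * 1024),
}
```

A caller would look up the bucket for the issuing cgroup and target device,
call delay_for() with the IO size, and sleep for the returned time before
submitting. Unused tokens beyond the burst size are simply lost, which is
why spare bandwidth cannot be handed to other groups with this scheme.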
>
> Above requirement can create configuration problems.
>
> - If there are large number of disks in system, per cgroup one shall
> have to create rules for each disk. Until and unless admin knows
> what applications are in which cgroup and strictly what disk
> these applications do IO to and create rules for only those
> disks.
I don't think this is a huge problem anyway. IMHO a userspace tool, e.g.
a script, could efficiently create/modify the rules by parsing
user-defined rules in some human-readable form (config files, etc.),
even in the presence of hundreds of disks. I think the same holds for
dm-ioband.
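
As a rough sketch of such a userspace tool, the script below expands a few
human-readable lines into one rule per (cgroup, disk) pair. The config
syntax ("<cgroup> <device-glob> <limit>"), the function names and the
device list are assumptions made up for this example; how the resulting
rules get written to the controller is deliberately left out.

```python
import fnmatch


def parse_size(text):
    """Convert '512K', '10M' or '1G' (or a plain byte count) to bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip().upper()
    if text and text[-1] in units:
        return int(float(text[:-1]) * units[text[-1]])
    return int(text)


def expand_rules(config_text, devices):
    """Expand '<cgroup> <device-glob> <limit>' lines into per-device rules.

    Returns a dict mapping (cgroup, device) to a limit in bytes/s.
    A later line overrides an earlier one for the same pair.
    """
    rules = {}
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        cgroup, pattern, limit = line.split()
        for dev in devices:
            if fnmatch.fnmatchcase(dev, pattern):
                rules[(cgroup, dev)] = parse_size(limit)
    return rules


if __name__ == "__main__":
    config = """
    # cgroup  devices       max rate (bytes/s)
    web       /dev/sd[ab]   10M
    batch     /dev/sd*      512K
    """
    disks = ["/dev/sda", "/dev/sdb", "/dev/sdc"]
    for (cgroup, dev), limit in sorted(expand_rules(config, disks).items()):
        print(cgroup, dev, limit)
```

With glob patterns, two config lines cover every disk on the box, so the
number of lines the admin maintains does not grow with the number of disks.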
>
> - I think problem gets compounded if there is a hierarchy of
> logical devices. I think in that case one shall have to create
> rules for logical devices and not actual physical devices.
By logical devices do you mean device-mapper devices (i.e. LVM, software
RAID, etc.)? Or do you mean that we need to introduce the concept of a
"logical device" to easily (quickly) configure IO requirements and then
map those logical devices to the actual physical devices? In the latter
case I think this can be addressed in userspace. Or maybe I'm totally
missing the point here.
>
> - Because it is not proportional weight distribution, if some
> cgroup is not using its planned BW, other group sharing the
> disk can not make use of spare BW.
>
Right.
> - I think one should know in advance the throughput rate of underlying media
> and also know competing applications so that one can statically define
> the BW assigned to each cgroup on each disk.
>
> This will be difficult. Effective BW extracted out of a rotational media
> is dependent on the seek pattern so one shall have to either try to make
> some conservative estimates and try to divide BW (we will not utilize disk
> fully) or take some peak numbers and divide BW (cgroup might not get the
> maximum rate configured).
Correct. I think the proportional weight approach is the only way to
efficiently use the whole BW. OTOH absolute limiting rules offer better
control over QoS, because you can completely remove the performance
bursts/peaks that could break QoS requirements for short periods of
time. So my "ideal" IO controller should allow defining both kinds of
rules: absolute and proportional limits.
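
To illustrate how the two kinds of rules could be combined, here is a small
sketch that splits a disk's bandwidth among the active groups in proportion
to their weights, while never giving a group more than its absolute cap.
This is just arithmetic on made-up numbers, not a description of any
existing controller; the function name and the group names are
hypothetical, and it assumes the total achievable bandwidth is known, which
as noted above is itself hard to estimate on rotational media.

```python
def share_bandwidth(total, weights, caps):
    """Split `total` among groups by weight, honouring absolute caps.

    weights: group -> proportional weight (only groups currently doing IO)
    caps:    group -> absolute limit, or missing/None for "no limit"
    Returns group -> allocated bandwidth.
    """
    alloc = {}
    active = dict(weights)
    remaining = float(total)
    while active:
        total_weight = sum(active.values())
        # Groups whose proportional share would reach or exceed their cap.
        capped = [
            g for g, w in active.items()
            if caps.get(g) is not None and remaining * w / total_weight >= caps[g]
        ]
        if not capped:
            # Nobody hits a cap: hand out the rest purely by weight.
            for g, w in active.items():
                alloc[g] = remaining * w / total_weight
            break
        # Pin capped groups to their limit and redistribute what is left.
        for g in capped:
            alloc[g] = float(caps[g])
            remaining -= caps[g]
            del active[g]
    return alloc


if __name__ == "__main__":
    print(share_bandwidth(100, {"a": 1, "b": 1, "c": 2}, {"a": 10}))
```

In the example, group "a" would get 25 by weight alone but is held to its
cap of 10, and the 15 it cannot use is shared between "b" and "c" by
weight. A group that is idle is simply left out of `weights`, so its share
goes to the others, which is exactly what pure absolute limits cannot do.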
I still have to look closely at your patchset anyway. I will do so and
give feedback.
-Andrea