From: Vivek Goyal <vgoyal@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Glauber Costa <glommer@parallels.com>,
linux-kernel@vger.kernel.org, Michal Hocko <mhocko@suse.cz>,
Li Zefan <lizf@cn.fujitsu.com>,
Peter Zijlstra <peterz@infradead.org>,
Paul Turner <pjt@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Thomas Graf <tgraf@suug.ch>,
"Serge E. Hallyn" <serue@us.ibm.com>,
Paul Mackerras <paulus@samba.org>, Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
Neil Horman <nhorman@tuxdriver.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Subject: Block IO controller hierarchy suppport (Was: Re: [PATCH RFC cgroup/for-3.7] cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them)
Date: Thu, 13 Sep 2012 10:53:41 -0400
Message-ID: <20120913145340.GI4396@redhat.com>
In-Reply-To: <20120912170933.GO7677@google.com>
On Wed, Sep 12, 2012 at 10:09:33AM -0700, Tejun Heo wrote:
[..]
> Yeah, it's mostly that cfq was already a hairy monster before blkcg
> was added to it and unfortunately we didn't make it any cleaner in the
> process and blkcg itself has a lot of other issues including being
> completely broken w.r.t. writeback writes. In addition there are two
> sub-controllers - the cfq one and blk-throttle. So, it's just that
> there are too many scary things to do and not enough man power or
> maybe interest. I hope we could just declare cgroup isn't supported
> on block devices but that doesn't seem feasible at this point either.
>
> I might / probably work on it and am hoping to coerce Vivek into it
> too. If you wanna jump in, please be my guest.
The biggest problem with the blkcg CFQ implementation is idling on cgroups.
If we don't idle on a cgroup, we don't get service differentiation for
most workloads, and if we do idle, performance starts to suck very soon
(the moment a few cgroups are created). And hierarchy will just exacerbate
this problem, because then one will try to idle at each group in the
hierarchy.
This problem is similar to CFQ's idling on sequential queues and
iopriority. Because we never idled on random IO queues, ioprios never
worked on random IO queues. The same is true for buffered write queues.
Similarly, if you don't idle on groups, then for most workloads service
differentiation is not visible. Only for workloads that are highly
sequential in nature can one see service differentiation.
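
To make the trade-off above concrete, here is a toy sketch, not CFQ's
actual algorithm. It assumes a single disk that serves one request at a
time, one synchronous reader per group (the next request is issued only
`think` ms after the previous one completes), and weighted-fair selection
by per-group virtual time. The function name `simulate` and all of its
parameters are made up for illustration, and the idling mode waits without
any bound, whereas CFQ idles only for a bounded window.

```python
def simulate(weights, service=4.0, think=1.0, idle=True, horizon=1000.0):
    """Toy single-disk model with one synchronous reader per group.

    Returns (requests served per group, total time the disk sat idle).
    All times are in milliseconds; every value here is an assumption.
    """
    n = len(weights)
    ready_at = [0.0] * n   # when each group's next request becomes available
    vtime = [0.0] * n      # weighted service received so far, per group
    served = [0] * n
    now = 0.0
    idle_time = 0.0
    while now < horizon:
        if idle:
            # Idling: commit to the group that is furthest behind its share,
            # even if the disk has to wait for that group's next request.
            g = min(range(n), key=lambda i: vtime[i])
        else:
            # No idling: only groups with a request pending right now compete.
            backlogged = [i for i in range(n) if ready_at[i] <= now]
            if backlogged:
                g = min(backlogged, key=lambda i: vtime[i])
            else:
                # Nothing pending at all: take whichever request arrives first.
                g = min(range(n), key=lambda i: ready_at[i])
        if ready_at[g] > now:
            idle_time += ready_at[g] - now
            now = ready_at[g]
        now += service
        vtime[g] += service / weights[g]
        served[g] += 1
        ready_at[g] = now + think
    return served, idle_time
```

With weights `[3, 1]`, calling `simulate([3, 1], idle=False)` serves both
groups equally, because the higher-weight group is never backlogged at the
moment the disk becomes free. With `idle=True` the split follows the weights
much more closely, but the disk spends part of the time waiting and fewer
requests complete overall. The sketch does not model how the cost grows with
the number of groups or with hierarchy.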
That's one fundamental problem for which we need a good answer before
we try to do more work on blkcg. We can write as much code as we like,
but at the end of the day it might still not be useful because of the
issue described above.
And that's the reason I think blkcg is primarily useful when you keep the
number of cgroups very small, move offending/problem-creating workloads
into such a cgroup, and keep everything else running in the root cgroup.
That way you get less idling because there are fewer cgroups, and at the
same time you have provided more isolation from the offending workloads.
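
As a rough illustration of that setup, the sketch below creates a single
blkio cgroup, gives it a low weight, and moves the offending PIDs into it,
leaving everything else in the root cgroup. It assumes a cgroup v1 blkio
hierarchy with the `blkio.weight` and `tasks` files; the helper name
`isolate_offender`, the group name, and the mount point are hypothetical
and will differ between systems.

```python
import os


def isolate_offender(blkio_root, group, weight, pids):
    """Create one blkio cgroup, set its weight, and move PIDs into it.

    blkio_root is wherever the blkio controller is mounted (an assumption,
    for example /sys/fs/cgroup/blkio). Needs root on a real system.
    """
    if not 10 <= weight <= 1000:
        raise ValueError("blkio.weight is expected to be in 10..1000")
    path = os.path.join(blkio_root, group)
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "blkio.weight"), "w") as f:
        f.write(str(weight))
    for pid in pids:
        # Write one PID per open/write, like `echo PID > tasks`.
        with open(os.path.join(path, "tasks"), "w") as f:
            f.write(str(pid))
    return path
```

A call such as `isolate_offender("/sys/fs/cgroup/blkio", "offenders", 100, [pid])`
would then confine one misbehaving process while all other tasks stay in
the root group.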
So if anybody has ideas on how to address the above issue, I am all ears.
Thanks
Vivek