From: Tao Ma <tm@tao.ma>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Shaohua Li <shli@kernel.org>, Tejun Heo <tj@kernel.org>,
axboe@kernel.dk, ctalbott@google.com, rni@google.com,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
containers@lists.linux-foundation.org
Subject: Re: IOPS based scheduler (Was: Re: [PATCH 18/21] blkcg: move blkio_group_conf->weight to cfq)
Date: Thu, 05 Apr 2012 00:45:05 +0800
Message-ID: <4F7C7A91.8040707@tao.ma>
In-Reply-To: <20120404133705.GB12676@redhat.com>
On 04/04/2012 09:37 PM, Vivek Goyal wrote:
> On Wed, Apr 04, 2012 at 05:35:49AM -0700, Shaohua Li wrote:
>
> [..]
>>>> How are iops_weight and switching different from CFQ's group scheduling
>>>> logic? I think Shaohua was talking about using similar logic. What would
>>>> you do fundamentally differently so that, without idling, you still get
>>>> service differentiation?
>>> I am thinking of differentiating groups by iops, so if there
>>> are 3 groups (with weights 100, 200, and 300) we can let them submit 1 io,
>>> 2 ios and 3 ios in a round-robin way. With an Intel SSD, every io can be
>>> finished within 100us. So the maximum latency for one io is about 600us,
>>> still less than 1ms. But with cfq, if all the cgroups are busy, we have
>>> to switch between these groups in ms, which means the maximum latency will
>>> be 6ms. That is terrible for some applications, since they use SSDs now.
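As a rough illustration of the arithmetic in the round-robin scheme quoted above, here is a minimal sketch. It is not code from fiops or cfq: the function names and the base weight of 100 (one io per 100 units of weight) are assumptions made only for this example, and it assumes every request completes in a fixed time.

```python
def weighted_round_robin(weights, base=100):
    """Build one dispatch round: each group gets weight // base slots.

    `base` is a hypothetical normalization (100 weight units = 1 io per round).
    """
    order = []
    for name, weight in weights.items():
        order.extend([name] * (weight // base))
    return order


def round_duration_us(weights, io_latency_us, base=100):
    """Duration of one full round if every io takes io_latency_us."""
    return len(weighted_round_robin(weights, base)) * io_latency_us


groups = {"A": 100, "B": 200, "C": 300}
round_order = weighted_round_robin(groups)
print(round_order)                                   # ['A', 'B', 'B', 'C', 'C', 'C']
print(round_duration_us(groups, io_latency_us=100))  # 600
```

With weights 100, 200 and 300, one round holds 1 + 2 + 3 = 6 requests, so at 100us per io a request waits at most about 600us for its group's turn, which matches the figure in the quoted text.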
>> Yes, with iops-based scheduling, we do queue switching for every request.
>> Doing the same thing between groups is quite straightforward. The only issue
>> I found is that this introduces more process context switches. That isn't
>> a big issue for io-bound applications, but it depends. It cuts latency a
>> lot, which I guess is more important for web 2.0 applications.
>
> In iops_mode(), expire each cfqq after dispatch of one or a bunch of requests
> and you should get the same behavior (with slice_idle=0 and group_idle=0).
> So why write a new scheduler?
Really? How could we configure cfq to work like this? Or do you mean we can
change the code to do it?
>
> The only thing is that, with the above, the current code will provide iops
> fairness only for groups. We should be able to tweak queue scheduling to
> support iops fairness also.
OK. As I said in another e-mail, my other concern is complexity: this
would make cfq far too complicated. I just checked the source code of
Shaohua's original patch; the fiops scheduler is only ~700 lines, so with
cgroup support added it would be ~1000 lines, I guess. Currently
cfq-iosched.c is ~4000 lines, even after Tejun's cleanup of io context...
Thanks
Tao
>
> Anyway, we will end up doing that at some point. Supporting two
> scheduling algorithms for queues and groups is not sustainable. There are
> already calls to make CFQ hierarchical, and in that case both queues and
> groups need to be on a single service tree, which means they need to follow
> the same scheduling algorithm.
>
> Thanks
> Vivek