From: Jerome Marchand <jmarchan@redhat.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: linux-kernel@vger.kernel.org, jens.axboe@oracle.com,
containers@lists.linux-foundation.org, dm-devel@redhat.com,
nauman@google.com, dpshah@google.com, lizf@cn.fujitsu.com,
mikew@google.com, fchecconi@gmail.com, paolo.valente@unimore.it,
ryov@valinux.co.jp, fernando@oss.ntt.co.jp,
s-uchida@ap.jp.nec.com, taka@valinux.co.jp,
guijianfeng@cn.fujitsu.com, jmoyer@redhat.com,
dhaval@linux.vnet.ibm.com, balbir@linux.vnet.ibm.com,
righi.andrea@gmail.com, m-ikeda@ds.jp.nec.com, agk@redhat.com,
akpm@linux-foundation.org, peterz@infradead.org,
torvalds@linux-foundation.org, mingo@elte.hu, riel@redhat.com
Subject: Re: [RFC] IO scheduler based IO controller V9
Date: Fri, 11 Sep 2009 16:44:37 +0200
Message-ID: <4AAA6255.70807@redhat.com>
In-Reply-To: <20090911143040.GB6758@redhat.com>
Vivek Goyal wrote:
> On Fri, Sep 11, 2009 at 03:16:23PM +0200, Jerome Marchand wrote:
>> Vivek Goyal wrote:
>>> On Thu, Sep 10, 2009 at 04:52:27PM -0400, Vivek Goyal wrote:
>>>> On Thu, Sep 10, 2009 at 05:18:25PM +0200, Jerome Marchand wrote:
>>>>> Vivek Goyal wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Here is the V9 of the IO controller patches generated on top of 2.6.31-rc7.
>>>>>
>>>>> Hi Vivek,
>>>>>
>>>>> I've run some postgresql benchmarks for io-controller. Tests have been
>>>>> made with 2.6.31-rc6 kernel, without io-controller patches (when
>>>>> relevant) and with io-controller v8 and v9 patches.
>>>>> I set up two instances of the TPC-H database, each running in its
>>>>> own io-cgroup. I ran two clients against these databases and timed
>>>>> this simple request on each:
>>>>> $ select count(*) from LINEITEM;
>>>>> where LINEITEM is the biggest table of TPC-H (6001215 entries,
>>>>> 720MB). That request generates a steady stream of IOs.
>>>>>
>>>>> Time is measured by psql (\timing switched on). Each test is run twice,
>>>>> or more if there is any significant difference between the first two
>>>>> runs. Before each run, the cache is flushed:
>>>>> $ echo 3 > /proc/sys/vm/drop_caches
>>>>>
>>>>>
>>>>> Results with 2 groups of same io policy (BE) and same io weight (1000):
>>>>>
>>>>>            w/o io-scheduler   io-scheduler v8    io-scheduler v9
>>>>>            first    second    first    second    first    second
>>>>>            DB       DB        DB       DB        DB       DB
>>>>>
>>>>> CFQ        48.4s    48.4s     48.2s    48.2s     48.1s    48.5s
>>>>> Noop       138.0s   138.0s    48.3s    48.4s     48.5s    48.8s
>>>>> AS         46.3s    47.0s     48.5s    48.7s     48.3s    48.5s
>>>>> Deadl.     137.1s   137.1s    48.2s    48.3s     48.3s    48.5s
>>>>>
>>>>> As you can see, there is no significant difference for CFQ
>>>>> scheduler.
>>>> Thanks Jerome.
>>>>
>>>>> There is a big improvement for the noop and deadline schedulers
>>>>> (why is that happening?).
>>>> I think it is because related IO is now in a single queue which gets to
>>>> run for 100ms or so (like CFQ). Previously, IO from both instances
>>>> went into a single queue, which should lead to more seeks as requests
>>>> from the two groups get interleaved.
>>>>
>>>> With the io controller, both groups have separate queues, so requests
>>>> from the two database instances do not get interleaved. (This becomes
>>>> almost like CFQ, where there are separate queues for each io context
>>>> and, for a sequential reader, one io context gets to run nicely for a
>>>> certain number of ms based on its priority.)
>>>>
>>>>> The performance with anticipatory scheduler
>>>>> is a bit lower (~4%).
>>>>>
>>> Hi Jerome,
>>>
>>> Can you also run the AS test with the io controller patches and both
>>> databases in the root group (basically, don't put them into separate
>>> groups)? I suspect that this regression might come from the fact that we
>>> now have to switch between queues, and in AS we wait for requests from
>>> the previous queue to finish before the next queue is scheduled in, which
>>> is probably slowing things down a bit. Just a wild guess.
>>>
>> Hi Vivek,
>>
>> I guess that's not the reason. I got 46.6s for both DBs in the root group
>> with the io-controller v9 patches. I also reran the test with the DBs in
>> different groups and found about the same result as above (48.3s and 48.6s).
>>
>
> Hi Jerome,
>
> Ok, so when both DBs are in the root group (with the io-controller V9
> patches), you get 46.6 seconds for both DBs. That means there
> is no regression in this case. Here there is only one queue, that of the
> root group, and AS is running timed read/write batches on this queue.
>
> But when the DBs are put in separate groups, you get 48.3 and
> 48.6 seconds respectively, and we see a regression. In this case there are
> two queues, one per group. The elevator layer takes care of switching
> between the group queues, and AS runs timed read/write batches on these
> queues.
>
> If that is correct, then it does not exclude the possibility that the
> cause is queue switching overhead between groups?
Yes it's correct. I misunderstood you.
Jerome
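
For reference, the relative slowdown implied by the AS timings quoted
above can be worked out directly. This is only a sketch of the arithmetic:
the three timings are the ones reported in this thread, while the helper
name slowdown_pct is made up for illustration.

```python
# AS timings quoted in this thread (seconds), io-controller v9 patches.
ROOT_GROUP = 46.6               # both DBs in the root group
SEPARATE_GROUPS = (48.3, 48.6)  # one DB per group


def slowdown_pct(baseline, measured):
    """Relative slowdown of `measured` against `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# slowdown_pct is a hypothetical helper, not part of any tool used above.
slowdowns = [slowdown_pct(ROOT_GROUP, t) for t in SEPARATE_GROUPS]
for t, pct in zip(SEPARATE_GROUPS, slowdowns):
    print(f"{t:.1f}s vs {ROOT_GROUP:.1f}s: +{pct:.1f}%")
```

Both values land close to the "~4%" figure mentioned earlier for AS.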
>
> Thanks
> Vivek
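
The explanation quoted earlier for the noop/deadline improvement (a shared
queue interleaves requests from both readers and causes extra seeks, while
per-group queues let each reader run for a slice) can be illustrated with
a toy model. Everything in it is an assumption: positions are abstract
sector numbers, the gap between the two regions, the request count and the
batch size are invented, and total head travel is only a crude stand-in
for seek cost. It does not reflect how the elevator code is implemented.

```python
def head_travel(positions, start=0):
    """Total distance the head moves when servicing `positions` in order."""
    travel, head = 0, start
    for pos in positions:
        travel += abs(pos - head)
        head = pos
    return travel


def interleaved(a, b):
    """Single shared FIFO: requests from the two readers alternate."""
    return [p for pair in zip(a, b) for p in pair]


def time_sliced(a, b, batch):
    """Per-group queues: each group dispatches `batch` requests per turn."""
    order = []
    for i in range(0, max(len(a), len(b)), batch):
        order += a[i:i + batch]
        order += b[i:i + batch]
    return order


# Two sequential readers on regions far apart (all numbers are made up).
N, GAP = 1000, 1_000_000
reader_a = list(range(0, N))
reader_b = list(range(GAP, GAP + N))

shared = head_travel(interleaved(reader_a, reader_b))
sliced = head_travel(time_sliced(reader_a, reader_b, batch=100))
print("shared queue head travel:   ", shared)
print("per-group queue head travel:", sliced)
```

In this model the shared queue crosses the gap on almost every request,
whereas the sliced order crosses it only when switching groups, so a larger
batch trades fewer long seeks against longer waits for the other group.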