public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] cfq-iosched: Implement group idle V2
@ 2010-07-19 17:14 Vivek Goyal
  2010-07-19 17:14 ` [PATCH 1/3] cfq-iosched: Improve time slice charging logic Vivek Goyal
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Vivek Goyal @ 2010-07-19 17:14 UTC (permalink / raw)
  To: linux-kernel, jens.axboe
  Cc: nauman, dpshah, guijianfeng, jmoyer, czoccolo, vgoyal


Hi,

This is V2 of the group_idle implementation patchset. I have done some more
testing since V1 and fixed a couple of bugs.

What's the problem
------------------
On high-end storage (tested on an HP EVA storage array with 12 SATA disks in
RAID 5), CFQ's model of dispatching requests from a single queue at a
time (sequential readers, sync writers, etc.) becomes a bottleneck.
Often we don't drive enough request queue depth to keep all the disks busy,
and overall throughput suffers a lot.

These problems primarily originate from two things: idling on a per-cfq-queue
basis, and the quantum (dispatching only a limited number of requests from a
single queue, and not allowing dispatch from other queues until then). Once
you set slice_idle=0 and raise the quantum, most of CFQ's problems on
high-end storage disappear.

This problem also becomes visible with the IO controller, where one creates
multiple groups and gets fairness, but overall throughput is lower. In
the following table, I am running an increasing number of sequential readers
(1, 2, 4, 8) in 8 groups with weights 100 to 800.
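For reference, the 8-group setup can be sketched roughly as below. This is
a sketch assuming the cgroup v1 blkio controller; the mount point and group
names are illustrative, not the exact script behind the numbers in the table.

```shell
# Create 8 blkio cgroups with weights 100..800, one per test group.
mount -t cgroup -o blkio none /cgroup/blkio
for i in 1 2 3 4 5 6 7 8; do
    mkdir -p /cgroup/blkio/test$i
    # blkio.weight is the v1 proportional-weight knob (range 100-1000).
    echo $((i * 100)) > /cgroup/blkio/test$i/blkio.weight
done
```

Each fio reader job is then run with its tasks placed in the corresponding
group's tasks file.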

Kernel=2.6.35-rc5-gi-sl-accounting+
GROUPMODE=1          NRGRP=8             
DIR=/mnt/iostestmnt/fio        DEV=/dev/dm-4                 
Workload=bsr      iosched=cfq     Filesz=512M bs=4K   
group_isolation=1 slice_idle=8    group_idle=8    quantum=8    
=========================================================================
AVERAGE[bsr]    [bw in KB/s]    
------- 
job     Set NR  test1  test2  test3  test4  test5  test6  test7  test8  total  
---     --- --  ---------------------------------------------------------------
bsr     1   1   6245   12776  16591  23471  28746  36799  43031  49778  217437 
bsr     1   2   5100   11063  17216  23136  23738  30391  35910  40874  187428 
bsr     1   4   4623   9718   14746  18356  22875  30407  33215  38073  172013 
bsr     1   8   4720   10143  13499  19115  22157  29126  31688  30784  161232 

Notice that overall throughput is just around 160MB/s with 8 sequential readers
in each group.

With this patch set, I set slice_idle=0 and re-ran the same test.

Kernel=2.6.35-rc5-gi-sl-accounting+
GROUPMODE=1          NRGRP=8             
DIR=/mnt/iostestmnt/fio        DEV=/dev/dm-4                 
Workload=bsr      iosched=cfq     Filesz=512M bs=4K   
group_isolation=1 slice_idle=0    group_idle=8    quantum=8    
=========================================================================
AVERAGE[bsr]    [bw in KB/s]    
------- 
job     Set NR  test1  test2  test3  test4  test5  test6  test7  test8  total  
---     --- --  ---------------------------------------------------------------
bsr     1   1   6789   12764  17174  23111  28528  36807  42753  48826  216752 
bsr     1   2   9845   20617  30521  39061  45514  51607  63683  63770  324618 
bsr     1   4   14835  24241  42444  55207  45914  51318  54661  60318  348938 
bsr     1   8   12022  24478  36732  48651  54333  60974  64856  72930  374976 

Notice how overall throughput has shot up to 374MB/s while retaining the
ability to do IO control.

So this is not the default mode. The new tunable, group_idle, allows one to
set slice_idle=0 to disable some of CFQ's features and primarily use the
group service differentiation feature.
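The tunables involved are per-device sysfs files; switching a disk into this
mode would look roughly like the sketch below, assuming the device uses CFQ
with this patchset applied (the device name sda is illustrative).

```shell
# CFQ per-device tunables live under /sys/block/<dev>/queue/iosched/.
# slice_idle=0 disables per-queue idling; group_idle (added by this
# patchset) keeps idling at the group level so service differentiation
# between groups is preserved.
echo 0 > /sys/block/sda/queue/iosched/slice_idle
echo 8 > /sys/block/sda/queue/iosched/group_idle
echo 8 > /sys/block/sda/queue/iosched/quantum
```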

If you have thoughts on other ways of solving the problem, I am all ears.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 9+ messages in thread
* [RFC PATCH] cfq-iosched: Implement group idle V2
@ 2010-07-19 17:20 Vivek Goyal
  2010-07-19 17:20 ` [PATCH 2/3] cfq-iosched: Implement a new tunable group_idle Vivek Goyal
  0 siblings, 1 reply; 9+ messages in thread
From: Vivek Goyal @ 2010-07-19 17:20 UTC (permalink / raw)
  To: linux-kernel, axboe; +Cc: nauman, dpshah, guijianfeng, jmoyer, czoccolo, vgoyal


[ Got Jens's mail id wrong in last post hence reposting. Sorry for cluttering
your mailboxes.]


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-07-19 21:04 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-19 17:14 [RFC PATCH] cfq-iosched: Implement group idle V2 Vivek Goyal
2010-07-19 17:14 ` [PATCH 1/3] cfq-iosched: Improve time slice charging logic Vivek Goyal
2010-07-19 17:14 ` [PATCH 2/3] cfq-iosched: Implement a new tunable group_idle Vivek Goyal
2010-07-19 20:54   ` Divyesh Shah
2010-07-19 21:04     ` Vivek Goyal
2010-07-19 17:14 ` [PATCH 3/3] cfq-iosched: Print per slice sectors dispatched in blktrace Vivek Goyal
  -- strict thread matches above, loose matches on Subject: below --
2010-07-19 17:20 [RFC PATCH] cfq-iosched: Implement group idle V2 Vivek Goyal
2010-07-19 17:20 ` [PATCH 2/3] cfq-iosched: Implement a new tunable group_idle Vivek Goyal
2010-07-19 18:58   ` Jeff Moyer
2010-07-19 20:20     ` Vivek Goyal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox