public inbox for linux-kernel@vger.kernel.org
* [PATCH] cfq-iosched: Implement group idling and IOPS accounting for groups V4
@ 2010-08-11 22:44 Vivek Goyal
  2010-08-11 22:44 ` [PATCH 1/5] cfq-iosched: Do not idle if slice_idle=0 Vivek Goyal
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Vivek Goyal @ 2010-08-11 22:44 UTC (permalink / raw)
  To: linux-kernel, jaxboe; +Cc: vgoyal


Hi,

This is V4 of the patch series implementing group_idle and CFQ group charge
accounting in terms of IOPS. Not much has changed since V3; just more testing
and a rebase on top of the for-2.6.36 branch of the block tree.

What's the problem
------------------
On high-end storage (I tested on an HP EVA storage array with 12 SATA disks in
RAID 5), CFQ's model of dispatching requests from a single queue at a time
(sequential readers, sync writers, etc.) becomes a bottleneck. Often we don't
drive enough request queue depth to keep all the disks busy, and we suffer a
lot in terms of overall throughput.

All these problems primarily originate from two things: idling on a per-cfq-queue
basis, and the quantum (dispatching only a limited number of requests from a
single queue while not allowing dispatch from other queues). Once you set
slice_idle=0 and raise the quantum, most of CFQ's problems on high-end
storage disappear.
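As a rough illustration (the device name sda is a placeholder, and the sysfs paths assume CFQ is the active scheduler for that device), both tunables live under the per-device iosched directory:

```shell
# Placeholder device; substitute your real block device.
DEV=sda
Q=/sys/block/$DEV/queue/iosched
# Write only if the CFQ tunables exist (i.e. CFQ is the active scheduler).
if [ -w "$Q/slice_idle" ]; then
    echo 0  > "$Q/slice_idle"   # do not idle waiting for more IO from one queue
    echo 16 > "$Q/quantum"      # allow more requests per queue per dispatch round
fi
echo "requested slice_idle=0 quantum=16 for $DEV"
```

The value 16 for quantum is just an example of "a higher value"; the right number depends on how deep a queue the storage can usefully absorb.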

This problem also becomes visible with the IO controller, where one creates
multiple groups and gets fairness, but overall throughput suffers. In the
following table, I am running an increasing number of sequential readers
(1, 2, 4, 8) in 8 groups of weights 100 to 800.

Kernel=2.6.35-blktree-group_idle+
GROUPMODE=1          NRGRP=8      DEV=/dev/dm-3                 
Workload=bsr      iosched=cfq     Filesz=512M bs=4K   
gi=1  slice_idle=8    group_idle=8    quantum=8
=========================================================================
AVERAGE[bsr]    [bw in KB/s]    
------- 
job     Set NR  cgrp1  cgrp2  cgrp3  cgrp4  cgrp5  cgrp6  cgrp7  cgrp8  total  
---     --- --  ---------------------------------------------------------------
bsr     1   1   6519   12742  16801  23109  28694  35988  43175  49272  216300 
bsr     1   2   5522   10922  17174  22554  24151  30488  36572  42021  189404 
bsr     1   4   4593   9620   13120  21405  25827  28097  33029  37335  173026 
bsr     1   8   3622   8277   12557  18296  21775  26022  30760  35713  157022 


Notice that overall throughput is just around 160MB/s with 8 sequential readers
in each group.
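For reference, the eight weighted groups used in these runs can be set up along these lines (a sketch assuming a cgroup-v1 blkio hierarchy mounted at /sys/fs/cgroup/blkio; the group names test1..test8 are hypothetical):

```shell
# cgroup-v1 blkio mount point; group names test1..test8 are hypothetical.
CG=/sys/fs/cgroup/blkio
for i in 1 2 3 4 5 6 7 8; do
    W=$((i * 100))                      # weights 100, 200, ..., 800
    if [ -w "$CG" ]; then
        mkdir -p "$CG/test$i"
        echo "$W" > "$CG/test$i/blkio.weight"
    fi
    echo "group test$i weight=$W"
done
```

Tasks doing the reads would then be moved into each group's tasks file, so cgrp1..cgrp8 in the table receive service in proportion to weights 100..800.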

With this patch set applied, I set slice_idle=0 and re-ran the same test.

Kernel=2.6.35-blktree-group_idle+
GROUPMODE=1          NRGRP=8         DEV=/dev/dm-3                 
Workload=bsr      iosched=cfq     Filesz=512M bs=4K   
gi=1  slice_idle=0    group_idle=8    quantum=8
=========================================================================
AVERAGE[bsr]    [bw in KB/s]    
------- 
job     Set NR  cgrp1  cgrp2  cgrp3  cgrp4  cgrp5  cgrp6  cgrp7  cgrp8  total  
---     --- --  ---------------------------------------------------------------
bsr     1   1   6652   12341  17335  23856  28740  36059  42833  48487  216303 
bsr     1   2   10168  20292  29827  38363  45746  52842  60071  63957  321266 
bsr     1   4   11176  21763  32713  42970  53222  58613  63598  69296  353351 
bsr     1   8   11750  23718  34102  47144  56975  63613  69000  69666  375968 

Notice how overall throughput has shot up to 350-375MB/s while retaining the
ability to do IO control.

So this is not the default mode. The new tunable, group_idle, allows one to
set slice_idle=0 to disable some of CFQ's features and primarily use the
group service differentiation feature.
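Putting the two tunables together, the new mode described above amounts to something like the following (a sketch; sda is a placeholder device, and the writes are guarded so this is harmless where CFQ is not active):

```shell
# Sketch of the new operating mode: per-queue idling off, group-level idling on.
DEV=sda
Q=/sys/block/$DEV/queue/iosched
# group_idle only exists with this patch set applied; guard the writes.
if [ -w "$Q/group_idle" ]; then
    echo 0 > "$Q/slice_idle"    # stop idling on individual cfq queues
    echo 8 > "$Q/group_idle"    # idle at the group level instead, keeping fairness
fi
echo "mode requested on $DEV: slice_idle=0 group_idle=8"
```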

By default nothing should change for CFQ and this change should be fairly
low risk.

Thanks
Vivek



Thread overview: 13+ messages
2010-08-11 22:44 [PATCH] cfq-iosched: Implement group idling and IOPS accounting for groups V4 Vivek Goyal
2010-08-11 22:44 ` [PATCH 1/5] cfq-iosched: Do not idle if slice_idle=0 Vivek Goyal
2010-08-16 18:37   ` Jeff Moyer
2010-08-11 22:44 ` [PATCH 2/5] cfq-iosched: Do group share accounting in IOPS when slice_idle=0 Vivek Goyal
2010-08-16 18:45   ` Jeff Moyer
2010-08-11 22:44 ` [PATCH 3/5] cfq-iosched: Implement tunable group_idle Vivek Goyal
2010-08-16 18:53   ` Jeff Moyer
2010-08-11 22:44 ` [PATCH 4/5] cfq-iosched: blktrace print per slice sector stats Vivek Goyal
2010-08-11 22:44 ` [PATCH 5/5] cfq-iosched: Documentation help for new tunables Vivek Goyal
2010-08-16 19:00   ` Jeff Moyer
2010-08-17 12:50     ` Vivek Goyal
2010-08-17 13:51       ` Jeff Moyer
2010-08-23 10:26 ` [PATCH] cfq-iosched: Implement group idling and IOPS accounting for groups V4 Jens Axboe
