From: Vivek Goyal <vgoyal@redhat.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: linux-kernel@vger.kernel.org, jens.axboe@oracle.com,
nauman@google.com, guijianfeng@cn.fujitsu.com
Subject: Re: [RFC] Improve CFQ fairness
Date: Fri, 4 Sep 2009 13:36:42 -0400
Message-ID: <20090904173642.GA10880@redhat.com>
In-Reply-To: <x491vmox8bn.fsf@segfault.boston.devel.redhat.com>
On Thu, Sep 03, 2009 at 01:10:52PM -0400, Jeff Moyer wrote:
> Vivek Goyal <vgoyal@redhat.com> writes:
>
> > Hi,
> >
> > Sometimes fairness and throughput are orthogonal to each other. CFQ provides
> > fair access to disk to different processes in terms of disk time used by the
> > process.
> >
> > Currently, the above notion of fairness seems to hold only for sync
> > queues whose think time is within the slice_idle limit (8 ms by default).
> >
> > To boost throughput, CFQ also disables idling based on seek patterns. So
> > even if a sync queue's think time is within the slice_idle limit, if the
> > queue is seeky, CFQ will disable idling on hardware supporting NCQ.
> >
> > The above is fine from a throughput perspective but not necessarily from
> > a fairness perspective. In general, CFQ seems inclined to favor
> > throughput over fairness.
> >
> > How about introducing a CFQ ioscheduler tunable, "fairness", which, if
> > set, tells CFQ that the user wants fairness first and disables some of
> > the hooks geared towards throughput?
> >
> > The two patches in this series introduce the "fairness" tunable and,
> > when it is set, no longer disable idling based on seek patterns.
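If merged, the tunable would presumably sit alongside the other CFQ knobs under the queue's iosched directory; a hypothetical session (the exact sysfs path for the new "fairness" file is an assumption, based on where slice_idle and friends live):

```shell
# Device under test (sdb here is just an example).
DEV=/sys/block/sdb/queue

# CFQ must be the active elevator for its tunables to appear.
cat "$DEV"/scheduler          # e.g. "noop anticipatory deadline [cfq]"

# Existing CFQ tunables live here; these patches would add "fairness".
ls "$DEV"/iosched/            # slice_idle, slice_sync, quantum, ...

# Favor fairness over throughput (hypothetical knob from these patches).
echo 1 > "$DEV"/iosched/fairness
```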
> >
> > I ran four "dd" prio 0 BE class sequential readers on SATA disk.
> >
> > # Test script
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile1
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile3
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile4
>
> > Normally one would expect these processes to finish at about the same
> > time, but following are the results of one run (results vary between runs).
>
> Actually, what you've written above would run each dd in sequence. I
> get the idea, though.
>
> > 234179072 bytes (234 MB) copied, 6.0338 s, 38.8 MB/s
> > 234179072 bytes (234 MB) copied, 6.34077 s, 36.9 MB/s
> > 234179072 bytes (234 MB) copied, 8.4014 s, 27.9 MB/s
> > 234179072 bytes (234 MB) copied, 10.8469 s, 21.6 MB/s
> >
> > The difference between the first and last process finishing is almost 5
> > seconds (out of a total duration of about 10 seconds). That seems too
> > big a variance.
> >
> > I ran blktrace to find out what is happening, and it seems we are very
> > quick to disable idling based on mean seek distance. Somehow the initial
> > 7-10 reads
>
> I submitted a patch to fix that, so maybe this isn't a problem anymore?
> Here are my results, with fairness=0:
Hi Jeff,

I still seem to be getting the same behavior. I am using 2.6.31-rc7 on a
SATA drive that supports command queuing with a queue depth of 31.
Following are the results of three runs.
234179072 bytes (234 MB) copied, 5.98348 s, 39.1 MB/s
234179072 bytes (234 MB) copied, 8.24508 s, 28.4 MB/s
234179072 bytes (234 MB) copied, 8.54762 s, 27.4 MB/s
234179072 bytes (234 MB) copied, 11.005 s, 21.3 MB/s

234179072 bytes (234 MB) copied, 5.51245 s, 42.5 MB/s
234179072 bytes (234 MB) copied, 5.62906 s, 41.6 MB/s
234179072 bytes (234 MB) copied, 9.44299 s, 24.8 MB/s
234179072 bytes (234 MB) copied, 10.9674 s, 21.4 MB/s

234179072 bytes (234 MB) copied, 5.50074 s, 42.6 MB/s
234179072 bytes (234 MB) copied, 5.62541 s, 41.6 MB/s
234179072 bytes (234 MB) copied, 8.63945 s, 27.1 MB/s
234179072 bytes (234 MB) copied, 10.9058 s, 21.5 MB/s
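For what it's worth, the fastest-to-slowest spread within each run can be computed directly from the completion times above; a quick sketch:

```python
# Completion times (seconds) of the four concurrent dd readers,
# one list per run, taken from the results above.
runs = [
    [5.98348, 8.24508, 8.54762, 11.005],
    [5.51245, 5.62906, 9.44299, 10.9674],
    [5.50074, 5.62541, 8.63945, 10.9058],
]

for i, times in enumerate(runs, 1):
    spread = max(times) - min(times)
    print(f"run {i}: spread = {spread:.2f}s "
          f"({spread / max(times):.0%} of the slowest reader's runtime)")
```

All three runs show a spread of over 5 seconds, i.e. the last reader takes roughly twice as long as the first.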
Thanks
Vivek
>
> # cat test.sh
> #!/bin/bash
>
> ionice -c 2 -n 0 dd if=/mnt/test/testfile1 of=/dev/null count=524288 &
> ionice -c 2 -n 0 dd if=/mnt/test/testfile2 of=/dev/null count=524288 &
> ionice -c 2 -n 0 dd if=/mnt/test/testfile3 of=/dev/null count=524288 &
> ionice -c 2 -n 0 dd if=/mnt/test/testfile4 of=/dev/null count=524288 &
>
> wait
>
> # bash test.sh
> 524288+0 records in
> 524288+0 records out
> 268435456 bytes (268 MB) copied, 10.3071 s, 26.0 MB/s
> 524288+0 records in
> 524288+0 records out
> 268435456 bytes (268 MB) copied, 10.3591 s, 25.9 MB/s
> 524288+0 records in
> 524288+0 records out
> 268435456 bytes (268 MB) copied, 10.4217 s, 25.8 MB/s
> 524288+0 records in
> 524288+0 records out
> 268435456 bytes (268 MB) copied, 10.4649 s, 25.7 MB/s
>
> That looks pretty good to me.
>
> Running a couple of fio workloads doesn't really show a difference
> between a vanilla kernel and a patched cfq with fairness set to 1:
>
> Vanilla:
>
> total priority: 800
> total data transferred: 887264
> class prio ideal xferred %diff
> be 4 110908 124404 12
> be 4 110908 123380 11
> be 4 110908 118004 6
> be 4 110908 113396 2
> be 4 110908 107252 -4
> be 4 110908 98356 -12
> be 4 110908 96244 -14
> be 4 110908 106228 -5
>
> Patched, with fairness set to 1:
>
> total priority: 800
> total data transferred: 953312
> class prio ideal xferred %diff
> be 4 119164 127028 6
> be 4 119164 128244 7
> be 4 119164 120564 1
> be 4 119164 127476 6
> be 4 119164 119284 0
> be 4 119164 116724 -3
> be 4 119164 103668 -14
> be 4 119164 110324 -8
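The %diff column in these tables appears to be floor((xferred - ideal) / ideal * 100), with "ideal" being an equal 1/8 share of the total transfer; a small sketch that reproduces the vanilla table under that assumption:

```python
import math

# Bytes (in KiB) transferred per BE/prio-4 job in the vanilla run, from the table above.
xferred = [124404, 123380, 118004, 113396, 107252, 98356, 96244, 106228]

total = sum(xferred)            # 887264, matching "total data transferred"
ideal = total // len(xferred)   # equal share per job: 110908

for x in xferred:
    diff = math.floor((x - ideal) / ideal * 100)
    print(f"be   4   {ideal:>8} {x:>8} {diff:>5}")
```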
>
> So, can you still reproduce this on your setup? I was just using a
> boring SATA disk.
>
> Cheers,
> out
Thread overview: 7+ messages
2009-07-12 18:57 [RFC] Improve CFQ fairness Vivek Goyal
2009-07-12 18:57 ` [PATCH 1/2] cfq-iosched: Introduce an ioscheduler tunable "fairness" Vivek Goyal
2009-07-12 18:57 ` [PATCH 2/2] cfq-iosched: Do not disable idling for seeky processes if "fairness" is set Vivek Goyal
2009-07-13 21:19 ` [RFC] Improve CFQ fairness Divyesh Shah
2009-07-13 21:33 ` Vivek Goyal
2009-09-03 17:10 ` Jeff Moyer
2009-09-04 17:36 ` Vivek Goyal [this message]