From: Vivek Goyal <vgoyal@redhat.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: linux-kernel@vger.kernel.org, jens.axboe@oracle.com,
nauman@google.com, guijianfeng@cn.fujitsu.com
Subject: Re: [RFC] Improve CFQ fairness
Date: Fri, 4 Sep 2009 13:36:42 -0400
Message-ID: <20090904173642.GA10880@redhat.com>
In-Reply-To: <x491vmox8bn.fsf@segfault.boston.devel.redhat.com>
On Thu, Sep 03, 2009 at 01:10:52PM -0400, Jeff Moyer wrote:
> Vivek Goyal <vgoyal@redhat.com> writes:
>
> > Hi,
> >
> > Sometimes fairness and throughput are orthogonal to each other. CFQ provides
> > fair access to disk to different processes in terms of disk time used by the
> > process.
> >
> > Currently above notion of fairness seems to be valid only for sync queues
> > whose think time is within slice_idle (8ms by default) limit.
> >
> > To boost throughput, CFQ also disables idling based on seek patterns. So even
> > if a sync queue's think time is within the slice_idle limit, but this sync queue
> > is seeky, then CFQ will disable idling on hardware supporting NCQ.
> >
> > Above is fine from throughput perspective but not necessarily from fairness
> > perspective. In general CFQ seems to be inclined to favor throughput over
> > fairness.
> >
> > How about introducing a CFQ ioscheduler tunable "fairness" which if set, will
> > help CFQ to determine that user is interested in getting fairness right
> > and will disable some of the hooks geared towards throughput.
> >
> > Two patches in this series introduce the tunable "fairness" and also do not
> > disable the idling based on seek patterns if "fairness" is set.
> >
> > I ran four "dd" prio 0 BE class sequential readers on SATA disk.
> >
> > # Test script
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile1
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile3
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile4
>
> > Normally one would expect that these processes should finish in almost similar
> > time but following are the results of one of the runs (results vary between runs).
>
> Actually, what you've written above would run each dd in sequence. I
> get the idea, though.
>
> > 234179072 bytes (234 MB) copied, 6.0338 s, 38.8 MB/s
> > 234179072 bytes (234 MB) copied, 6.34077 s, 36.9 MB/s
> > 234179072 bytes (234 MB) copied, 8.4014 s, 27.9 MB/s
> > 234179072 bytes (234 MB) copied, 10.8469 s, 21.6 MB/s
> >
> > The difference between the first and last process finishing is almost 5 seconds
> > (out of a total duration of 10 seconds). This seems to be too big a variance.
> >
> > I ran blktrace to find out what is happening, and it seems we are very
> > quick to disable idling based on mean seek distance. Somehow initial 7-10 reads
>
> I submitted a patch to fix that, so maybe this isn't a problem anymore?
> Here are my results, with fairness=0:
Hi Jeff,

I still seem to be getting the same behavior. I am using 2.6.31-rc7, and I
have a SATA drive which supports command queuing with a depth of 31.

Following are the results of three runs.

234179072 bytes (234 MB) copied, 5.98348 s, 39.1 MB/s
234179072 bytes (234 MB) copied, 8.24508 s, 28.4 MB/s
234179072 bytes (234 MB) copied, 8.54762 s, 27.4 MB/s
234179072 bytes (234 MB) copied, 11.005 s, 21.3 MB/s

234179072 bytes (234 MB) copied, 5.51245 s, 42.5 MB/s
234179072 bytes (234 MB) copied, 5.62906 s, 41.6 MB/s
234179072 bytes (234 MB) copied, 9.44299 s, 24.8 MB/s
234179072 bytes (234 MB) copied, 10.9674 s, 21.4 MB/s

234179072 bytes (234 MB) copied, 5.50074 s, 42.6 MB/s
234179072 bytes (234 MB) copied, 5.62541 s, 41.6 MB/s
234179072 bytes (234 MB) copied, 8.63945 s, 27.1 MB/s
234179072 bytes (234 MB) copied, 10.9058 s, 21.5 MB/s
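
The gap between the first and last reader to finish can be read straight off the dd summary lines above. The following is only a rough sketch in Python, assuming the four lines of one run are passed in together; the helper name `finish_spread` and the regular expression are illustrative and not part of any existing tool.

```python
import re

# Captures the elapsed time from a dd summary line such as
# "234179072 bytes (234 MB) copied, 5.98348 s, 39.1 MB/s".
DD_TIME = re.compile(r"copied, ([0-9.]+) s,")


def finish_spread(lines):
    """Return (fastest, slowest, spread) in seconds for one run's dd lines.

    Hypothetical helper: assumes every reader of the run is present in `lines`.
    """
    times = [float(t) for t in DD_TIME.findall("\n".join(lines))]
    if not times:
        raise ValueError("no dd summary lines found")
    return min(times), max(times), max(times) - min(times)


# First of the three runs reported above.
run1 = [
    "234179072 bytes (234 MB) copied, 5.98348 s, 39.1 MB/s",
    "234179072 bytes (234 MB) copied, 8.24508 s, 28.4 MB/s",
    "234179072 bytes (234 MB) copied, 8.54762 s, 27.4 MB/s",
    "234179072 bytes (234 MB) copied, 11.005 s, 21.3 MB/s",
]

fastest, slowest, spread = finish_spread(run1)
print(f"first done after {fastest} s, last after {slowest} s, spread {spread:.3f} s")
```

For the first run this gives a spread of roughly 5 seconds out of an 11 second run, in line with the variance described in the original posting.
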
Thanks
Vivek
>
> # cat test.sh
> #!/bin/bash
>
> ionice -c 2 -n 0 dd if=/mnt/test/testfile1 of=/dev/null count=524288 &
> ionice -c 2 -n 0 dd if=/mnt/test/testfile2 of=/dev/null count=524288 &
> ionice -c 2 -n 0 dd if=/mnt/test/testfile3 of=/dev/null count=524288 &
> ionice -c 2 -n 0 dd if=/mnt/test/testfile4 of=/dev/null count=524288 &
>
> wait
>
> # bash test.sh
> 524288+0 records in
> 524288+0 records out
> 268435456 bytes (268 MB) copied, 10.3071 s, 26.0 MB/s
> 524288+0 records in
> 524288+0 records out
> 268435456 bytes (268 MB) copied, 10.3591 s, 25.9 MB/s
> 524288+0 records in
> 524288+0 records out
> 268435456 bytes (268 MB) copied, 10.4217 s, 25.8 MB/s
> 524288+0 records in
> 524288+0 records out
> 268435456 bytes (268 MB) copied, 10.4649 s, 25.7 MB/s
>
> That looks pretty good to me.
>
> Running a couple of fio workloads doesn't really show a difference
> between a vanilla kernel and a patched cfq with fairness set to 1:
>
> Vanilla:
>
> total priority: 800
> total data transferred: 887264
> class  prio  ideal   xferred  %diff
> be     4     110908  124404    12
> be     4     110908  123380    11
> be     4     110908  118004     6
> be     4     110908  113396     2
> be     4     110908  107252    -4
> be     4     110908   98356   -12
> be     4     110908   96244   -14
> be     4     110908  106228    -5
>
> Patched, with fairness set to 1:
>
> total priority: 800
> total data transferred: 953312
> class  prio  ideal   xferred  %diff
> be     4     119164  127028     6
> be     4     119164  128244     7
> be     4     119164  120564     1
> be     4     119164  127476     6
> be     4     119164  119284     0
> be     4     119164  116724    -3
> be     4     119164  103668   -14
> be     4     119164  110324    -8
>
> So, can you still reproduce this on your setup? I was just using a
> boring SATA disk.
>
> Cheers,
> out
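
The "ideal" and "%diff" columns in the fio tables quoted above can be reproduced from the "xferred" column alone. The sketch below is an assumption about how those columns were derived, not the script that produced them: it takes all jobs to have equal priority, so the ideal share is the total divided by the number of jobs, and it rounds the percentage deviation down, which happens to match every quoted figure. The function name `fairness_table` is hypothetical.

```python
def fairness_table(xferred):
    """Return (ideal, xferred, pct_diff) rows for equal-priority jobs.

    Assumption: every job has the same priority, so the ideal share is
    total / number of jobs. pct_diff is the deviation from ideal in
    percent, rounded toward negative infinity.
    """
    total = sum(xferred)
    ideal = total // len(xferred)
    return [(ideal, x, (x - ideal) * 100 // ideal) for x in xferred]


# "xferred" columns copied from the two tables above.
vanilla = [124404, 123380, 118004, 113396, 107252, 98356, 96244, 106228]
patched = [127028, 128244, 120564, 127476, 119284, 116724, 103668, 110324]

for name, data in (("vanilla", vanilla), ("patched", patched)):
    print(name)
    for ideal, x, diff in fairness_table(data):
        print(f"  ideal {ideal:>7}  xferred {x:>7}  %diff {diff:>4}")
```

Both tables span roughly the same range of deviations, with a worst case of -14 in each, which is consistent with the observation that the fio workloads show little difference between the two kernels.
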