From: Vivek Goyal <vgoyal@redhat.com>
To: Divyesh Shah <dpshah@google.com>
Cc: linux-kernel@vger.kernel.org, jens.axboe@oracle.com,
	nauman@google.com, guijianfeng@cn.fujitsu.com, jmoyer@redhat.com
Subject: Re: [RFC] Improve CFQ fairness
Date: Mon, 13 Jul 2009 17:33:51 -0400
Message-ID: <20090713213351.GD3714@redhat.com>
In-Reply-To: <af41c7c40907131419t4dcca78fmba16c99bbf4ebc4c@mail.gmail.com>

On Mon, Jul 13, 2009 at 02:19:32PM -0700, Divyesh Shah wrote:
> Hi Vivek,
> I saw a similar issue when running some tests with parallel sync
> workloads. Looking at the blktrace output and staring at the
> idle_window and seek detection code I realized that the think time
> samples were taken for all consecutive IOs from a given cfqq. I think
> doing so is not entirely correct as it also includes very long ttime
> values for consecutive IOs which are separated by timeslices for other
> sync queues too. To get a good estimate of the arrival pattern for a
> cfqq we should only consider samples where the process was allowed to
> send consecutive IOs down to the disk.
> I have a patch that fixes this which I will rebase and post soon.
> This might help you avoid the idle window disabling.
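
The sampling change Divyesh describes above can be illustrated with a small sketch. It is not his patch (which is not posted in this thread) and not kernel code: the function name, the tuple layout and the sample values are all made up for illustration, and a plain mean stands in for whatever averaging CFQ really uses.

```python
def mean_think_time_ms(samples, skip_interrupted):
    """Average gap between one IO completing and the next one arriving.

    samples: list of (completion_ms, next_arrival_ms, other_queue_ran).
    other_queue_ran is True when other sync queues got timeslices in
    between, so the gap says little about the process's own behaviour.
    """
    gaps = [
        arrival - completion
        for completion, arrival, other_queue_ran in samples
        if not (skip_interrupted and other_queue_ran)
    ]
    return sum(gaps) / len(gaps) if gaps else None


# Made-up samples: three back-to-back IOs and one gap that spans
# timeslices given to other queues.
samples = [
    (0.0, 1.0, False),
    (5.0, 6.0, False),
    (10.0, 110.0, True),
    (115.0, 116.0, False),
]

print(mean_think_time_ms(samples, skip_interrupted=False))
print(mean_think_time_ms(samples, skip_interrupted=True))
```

With every gap counted, the single long gap dominates the average. With interrupted gaps skipped, the average reflects only the IOs the process was allowed to send back to back.
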
Hi Divyesh,

I will be glad to try the patch, but in my particular test case we disable
the idle window not because of think time but because CFQ thinks it is
a seeky workload. CFQ currently disables the idle window for seeky processes
on hardware supporting command queuing.

In fact, in general I am trying to solve the issue of fairness with the CFQ
IO scheduler. There seem to be places where we give up fairness to achieve
better throughput/latency. Disabling the idle window for seeky processes
(even though their think time is within the slice_idle limit) seems to be
one of those cases.

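The idle-window behaviour discussed in this thread can be summarised as a small decision function. This is a simplified sketch, not the code in cfq-iosched.c: the QueueStats type, the should_idle name and its parameters are hypothetical, and the real scheduler considers more conditions than the three modelled here.

```python
from dataclasses import dataclass

SLICE_IDLE_MS = 8.0  # default slice_idle quoted in the original posting


@dataclass
class QueueStats:
    """Hypothetical per-queue summary; not the kernel's struct cfq_queue."""
    mean_think_time_ms: float
    seeky: bool


def should_idle(queue, hw_has_command_queuing, fairness):
    """Return True if the scheduler should idle on this sync queue."""
    # Think time beyond slice_idle: no idling in either mode.
    if queue.mean_think_time_ms > SLICE_IDLE_MS:
        return False
    # Seeky queue on command-queuing hardware: idling is dropped for
    # throughput, unless the proposed "fairness" tunable is set.
    if queue.seeky and hw_has_command_queuing and not fairness:
        return False
    return True


reader = QueueStats(mean_think_time_ms=2.0, seeky=True)
print(should_idle(reader, hw_has_command_queuing=True, fairness=False))
print(should_idle(reader, hw_has_command_queuing=True, fairness=True))
```

The two calls differ only in the fairness flag, which is the case the second patch in the series targets: a queue with short think time that is judged seeky on hardware with command queuing.
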
Thanks
Vivek
>
> Regards,
> Divyesh
>
> On Sun, Jul 12, 2009 at 11:57 AM, Vivek Goyal <vgoyal@redhat.com> wrote:
> >
> > Hi,
> >
> > Sometimes fairness and throughput are at odds with each other. CFQ provides
> > fair access to the disk to different processes in terms of the disk time
> > used by each process.
> >
> > Currently the above notion of fairness seems to hold only for sync queues
> > whose think time is within the slice_idle limit (8ms by default).
> >
> > To boost throughput, CFQ also disables idling based on seek patterns. So even
> > if a sync queue's think time is within the slice_idle limit, CFQ will disable
> > idling on hardware supporting NCQ if that queue is seeky.
> >
> > The above is fine from a throughput perspective but not necessarily from a
> > fairness perspective. In general, CFQ seems inclined to favor throughput
> > over fairness.
> >
> > How about introducing a CFQ ioscheduler tunable "fairness" which, if set,
> > tells CFQ that the user is interested in getting fairness right and
> > disables some of the hooks geared towards throughput?
> >
> > The two patches in this series introduce the tunable "fairness" and stop
> > disabling idling based on seek patterns when "fairness" is set.
> >
> > I ran four "dd" prio 0 BE class sequential readers on a SATA disk.
> >
> > # Test script
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile1 of=/dev/null &
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/null &
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile3 of=/dev/null &
> > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile4 of=/dev/null &
> >
> > Normally one would expect these processes to finish at about the same
> > time, but the following are the results of one run (results vary between runs).
> >
> > 234179072 bytes (234 MB) copied, 6.0338 s, 38.8 MB/s
> > 234179072 bytes (234 MB) copied, 6.34077 s, 36.9 MB/s
> > 234179072 bytes (234 MB) copied, 8.4014 s, 27.9 MB/s
> > 234179072 bytes (234 MB) copied, 10.8469 s, 21.6 MB/s
> >
> > The difference between the first and last process finishing is almost 5
> > seconds (out of a total duration of about 10 seconds). This seems to be too
> > big a variance.
> >
> > I ran blktrace to find out what is happening, and it seems we are very
> > quick to disable idling based on mean seek distance. Somehow the initial 7-10
> > reads seem to be seeky for these dd processes. After that things stabilize and
> > we re-enable idling. But some of the processes get idling enabled early and
> > some get it enabled really late, and that leads to the discrepancy in results.
> >
> > With this patchset applied, the following are the results for the above test case.
> >
> > echo 1 > /sys/block/sdb/queue/iosched/fairness
> >
> > 234179072 bytes (234 MB) copied, 9.88874 s, 23.7 MB/s
> > 234179072 bytes (234 MB) copied, 10.0234 s, 23.4 MB/s
> > 234179072 bytes (234 MB) copied, 10.1747 s, 23.0 MB/s
> > 234179072 bytes (234 MB) copied, 10.4844 s, 22.3 MB/s
> >
> > Notice how close the finish times and effective bandwidths are for all
> > four processes. Also notice that I did not witness any throughput degradation,
> > at least for this particular test case.
> >
> > Thanks
> > Vivek
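
The two sets of dd results quoted above can be compared with a few lines of arithmetic. The finish times and file size are copied from the quoted output. Summarising fairness as the spread in finish times, and throughput as total bytes divided by the time the last reader finished, are assumptions made for this sketch and not metrics stated in the thread.

```python
SIZE_BYTES = 234179072  # bytes copied by each dd reader

# Finish times in seconds, copied from the quoted dd output.
without_fairness = [6.0338, 6.34077, 8.4014, 10.8469]
with_fairness = [9.88874, 10.0234, 10.1747, 10.4844]


def spread_s(times):
    """Gap between the first and the last reader finishing."""
    return max(times) - min(times)


def aggregate_mb_s(times):
    """Total data read divided by the time the last reader finished.

    Uses 1 MB = 10**6 bytes, matching dd's own output.
    """
    return SIZE_BYTES * len(times) / max(times) / 1e6


for label, times in (("fairness=0", without_fairness),
                     ("fairness=1", with_fairness)):
    print(f"{label}: spread {spread_s(times):.2f} s, "
          f"aggregate {aggregate_mb_s(times):.1f} MB/s")
```

By these measures the spread shrinks from roughly 4.8 s to roughly 0.6 s. The last reader also finishes slightly earlier with the tunable set, which is consistent with the remark that no throughput degradation was seen in this test.
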
Thread overview (7+ messages):
2009-07-12 18:57 [RFC] Improve CFQ fairness Vivek Goyal
2009-07-12 18:57 ` [PATCH 1/2] cfq-iosched: Introduce an ioscheduler tunable "fairness" Vivek Goyal
2009-07-12 18:57 ` [PATCH 2/2] cfq-iosched: Do not disable idling for seeky processes if "fairness" is set Vivek Goyal
2009-07-13 21:19 ` [RFC] Improve CFQ fairness Divyesh Shah
2009-07-13 21:33 ` Vivek Goyal [this message]
2009-09-03 17:10 ` Jeff Moyer
2009-09-04 17:36 ` Vivek Goyal