From: Mathieu Desnoyers <compudj@krystal.dyndns.org>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: linux-kernel@vger.kernel.org, ltt-dev@lists.casi.polymtl.ca,
Andrea Arcangeli <andrea@suse.de>,
akpm@linux-foundation.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>
Subject: Re: [ltt-dev] [RFC PATCH] block: Fix bio merge induced high I/O latency
Date: Tue, 20 Jan 2009 08:45:39 -0500
Message-ID: <20090120134539.GC22421@Krystal>
In-Reply-To: <20090120073709.GC30821@kernel.dk>
* Jens Axboe (jens.axboe@oracle.com) wrote:
> On Mon, Jan 19 2009, Mathieu Desnoyers wrote:
> > * Jens Axboe (jens.axboe@oracle.com) wrote:
> > > On Sun, Jan 18 2009, Mathieu Desnoyers wrote:
> > > > I looked at the "ls" behavior (while doing a dd) within my LTTng trace
> > > > to create a fio job file. That behavior is appended below as "Part 1 -
> > > > ls I/O behavior". Note that the original "ls" test case was done with
> > > > the anticipatory I/O scheduler, which was active by default on my
> > > > Debian system with a custom vanilla 2.6.28 kernel. Also note that I am
> > > > running this on a raid-1, but have experienced the same problem on a
> > > > standard partition I created on the same machine.
> > > >
> > > > I created the fio job file appended as "Part 2 - dd+ls fio job file". It
> > > > consists of one dd-like job and many small jobs reading as much data as
> > > > ls did. I used the small test script appended as "Part 3 - batch test"
> > > > to batch-run this.
> > > >
> > > > The results for the ls-like jobs are interesting:
> > > >
> > > > I/O scheduler    runt-min (msec)    runt-max (msec)
> > > > noop                   41                10563
> > > > anticipatory           63                 8185
> > > > deadline               52                33387
> > > > cfq                    43                 1420
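
For reference, a minimal sketch of the kind of batch runner used (the exact
script is appended as "Part 3 - batch test"; the device, job file name and
output paths below are illustrative stand-ins, not the originals):

	#!/bin/sh
	# For each I/O scheduler: select it, drop the page cache so every
	# run starts cold, then run the same fio job file and save results.
	for sched in noop anticipatory deadline cfq; do
		echo $sched > /sys/block/sda/queue/scheduler
		sync
		echo 3 > /proc/sys/vm/drop_caches
		fio --output=result-$sched.txt dd+ls.fio
	done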
> > >
> >
> > Extra note: I have HZ=250 on my system. Changing it to 100 or 1000 did
> > not make much difference (I also tried with NO_HZ enabled).
> >
> > > Do you have queuing enabled on your drives? You can check that in
> > > /sys/block/sdX/device/queue_depth. Try setting those to 1 and retest all
> > > schedulers, would be good for comparison.
> > >
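
For reference, the knob reads back and sets like this (sdX stands in for the
device under test):

	# Show the current queue depth, then disable command queuing.
	cat /sys/block/sdX/device/queue_depth
	echo 1 > /sys/block/sdX/device/queue_depth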
> >
> > Here are the tests with a queue_depth of 1:
> >
> > I/O scheduler    runt-min (msec)    runt-max (msec)
> > noop                   43                38235
> > anticipatory           44                 8728
> > deadline               51                19751
> > cfq                    48                  427
> >
> >
> > Overall, I wouldn't say it makes much difference.
>
> 0.5 seconds vs 1.5 seconds isn't much of a difference?
>
Threefold, yes. That's significant, but not in terms of usability in that
specific case.
> > > raid personalities or dm complicate matters, since they introduce a
> > > disconnect between 'ls' and the io scheduler at the bottom...
> > >
> >
> > Yes, ideally I should re-run those directly on the disk partitions.
>
> At least for comparison.
>
Here it is. ssh test done on /dev/sda directly:

queue_depth=31 (default)
/sys/block/sda/queue/iosched/slice_async_rq = 2 (default)
/sys/block/sda/queue/iosched/quantum = 4 (default)

I/O scheduler    runt-min (msec)    runt-max (msec)
noop                  612               205684
anticipatory          562                 5555
deadline              505               113153
cfq                   523                 6637
> > I am also tempted to create a fio job file which acts like an ssh server
> > receiving a connection after it has been pruned from the cache while the
> > system is doing heavy I/O. "ssh", in this case, seems to be doing much
> > more I/O than a simple "ls", and I think we might want to see if cfq
> > behaves correctly in such a case. Most of this I/O comes from page
> > faults (identified as traps in the trace), probably because the ssh
> > executable has been thrown out of the cache by
> >
> > echo 3 > /proc/sys/vm/drop_caches
> >
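A quick way to reproduce that by hand, assuming an sshd listening on
localhost (illustrative only, not part of the appended job):

	# Flush dirty pages, drop the page cache, then time a cold ssh login.
	sync
	echo 3 > /proc/sys/vm/drop_caches
	time ssh localhost true
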
> > The behavior of an incoming ssh connection after clearing the cache is
> > appended below (Part 1 - LTTng trace for incoming ssh connection). The
> > job file created (Part 2) reads, for each job, a 2MB file with random
> > reads of 4k-44k each. The results are very interesting for cfq:
> >
> > I/O scheduler    runt-min (msec)    runt-max (msec)
> > noop                  586               110242
> > anticipatory          531                26942
> > deadline              561               108772
> > cfq                   523                28216
> >
> > So, basically, ssh being out of the cache can take 28 seconds to answer
> > an incoming ssh connection even with the cfq scheduler. This is not
> > exactly what I would call an acceptable latency.
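
For reference, a sketch of the kind of job file described above (the exact
file is appended as Part 2; the job count, file sizes and the dd-like
writer's parameters are illustrative guesses, not the original values):

	[global]
	ioengine=sync
	direct=0

	; background streaming writer, standing in for dd
	[dd-like]
	rw=write
	size=1g

	; ssh-like jobs: each reads a 2MB file in random 4k-44k chunks
	[ssh-like]
	numjobs=10
	rw=randread
	size=2m
	bsrange=4k-44k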
>
> At some point, you have to stop and consider what is acceptable
> performance for a given IO pattern. Your ssh test case is purely random
> IO, and neither CFQ nor AS would do any idling for that. We can make
> this test case faster for sure, the hard part is making sure that we
> don't regress on async throughput at the same time.
>
> Also remember that with your raid1, it's not entirely reasonable to
> blame all performance issues on the IO scheduler as per my previous
> mail. It would be a lot more fair to view the disk numbers individually.
>
> Can you retry this job with 'quantum' set to 1 and 'slice_async_rq' set
> to 1 as well?
>
Sure, ssh test done on /dev/sda:

queue_depth=31 (default)
/sys/block/sda/queue/iosched/slice_async_rq = 1
/sys/block/sda/queue/iosched/quantum = 1

I/O scheduler       runt-min (msec)    runt-max (msec)
cfq (default)            523                6637
cfq (s_rq=1,q=1)         503                6743

It did not make much of a difference.
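
For completeness, the two tunables were set through the usual sysfs
interface before rerunning:

	echo 1 > /sys/block/sda/queue/iosched/slice_async_rq
	echo 1 > /sys/block/sda/queue/iosched/quantum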
Mathieu
> However, I think we should be doing somewhat better at this test case.
>
> --
> Jens Axboe
>
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68