From: Jens Axboe <jens.axboe@oracle.com>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Andrea Arcangeli <andrea@suse.de>,
akpm@linux-foundation.org, Ingo Molnar <mingo@elte.hu>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org, ltt-dev@lists.casi.polymtl.ca
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency
Date: Sat, 17 Jan 2009 20:04:38 +0100 [thread overview]
Message-ID: <20090117190437.GZ30821@kernel.dk> (raw)
In-Reply-To: <20090117162657.GA31965@Krystal>
On Sat, Jan 17 2009, Mathieu Desnoyers wrote:
> A long standing I/O regression (since 2.6.18, still there today) has hit
> Slashdot recently :
> http://bugzilla.kernel.org/show_bug.cgi?id=12309
> http://it.slashdot.org/article.pl?sid=09/01/15/049201
>
> I've taken a trace reproducing the wrong behavior on my machine and I
> think it's getting us somewhere.
>
> LTTng 0.83, kernel 2.6.28
> Machine : Intel Xeon E5405 dual quad-core, 16GB ram
> (just created a new block-trace.c LTTng probe which is not released yet.
> It basically replaces blktrace)
>
>
> echo 3 > /proc/sys/vm/drop_caches
>
> lttctl -C -w /tmp/trace -o channel.mm.bufnum=8 -o channel.block.bufnum=64 trace
>
> dd if=/dev/zero of=/tmp/newfile bs=1M count=1M
> cp -ax music /tmp (copying 1.1GB of mp3)
>
> ls (takes 15 seconds to get the directory listing !)
>
> lttctl -D trace
>
> I looked at the trace (especially at the ls surroundings), and bash is
> waiting for a few seconds for I/O in the exec system call (to exec ls).
>
> While this happens, we have dd doing lots and lots of bio_queue. There
> is a bio_backmerge after each bio_queue event. This is reasonable,
> because dd is writing to a contiguous file.
>
> However, I wonder if this is not the actual problem. We have dd which
> has the head request in the elevator request queue. It is progressing
> steadily by plugging/unplugging the device periodically and gets its
> work done. However, because requests are being dequeued at the same
> rate others are being merged, I suspect it stays at the top of the queue
> and does not let the other unrelated requests run.
>
> There is a test in the blk-merge.c which makes sure that merged requests
> do not get bigger than a certain size. However, if the request is
> steadily dequeued, I think this test is not doing anything.
>
>
> This patch implements a basic test to make sure we never merge more
> than 128 requests into the same request if it is the "last_merge"
> request. I have not been able to trigger the problem again with the
> fix applied. It might not be in a perfect state : there may be better
> solutions to the problem, but I think it helps pointing out where the
> culprit lays.
To be painfully honest, I have no idea what you are attempting to solve
with this patch. First of all, Linux has always merged any request
possible. The one-hit cache is just that, a one hit cache frontend for
merging. We'll be hitting the merge hash and doing the same merge if it
fails. Since we even cap the size of the request, the merging is also
bounded.
Furthermore, the request being merged is not considered for IO yet. It
has not been dispatched by the io scheduler. IOW, I'm surprised your
patch makes any difference at all. Especially with your 128 limit, since
4kbx128kb is 512kb which is the default max merge size anyway. These
sort of test cases tend to be very sensitive and exhibit different
behaviour for many runs, so call me a bit skeptical and consider that an
enouragement to do more directed testing. You could use fio for
instance. Have two jobs in your job file. One is a dd type process that
just writes a huge file, the other job starts eg 10 seconds later and
does a 4kb read of a file.
As a quick test, could you try and increase the slice_idle to eg 20ms?
Sometimes I've seen timing being slightly off, which makes us miss the
sync window for the ls (in your case) process. Then you get a mix of
async and sync IO all the time, which very much slows down the sync
process.
--
Jens Axboe
next prev parent reply other threads:[~2009-01-17 19:06 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-17 0:44 [Regression] High latency when doing large I/O Mathieu Desnoyers
2009-01-17 16:26 ` [RFC PATCH] block: Fix bio merge induced high I/O latency Mathieu Desnoyers
2009-01-17 16:50 ` Leon Woestenberg
2009-01-17 17:15 ` Mathieu Desnoyers
2009-01-17 19:04 ` Jens Axboe [this message]
2009-01-18 21:12 ` Mathieu Desnoyers
2009-01-18 21:27 ` Mathieu Desnoyers
2009-01-19 18:26 ` Jens Axboe
2009-01-20 2:10 ` Mathieu Desnoyers
2009-01-20 7:37 ` Jens Axboe
2009-01-20 12:28 ` Jens Axboe
2009-01-20 14:22 ` [ltt-dev] " Mathieu Desnoyers
2009-01-20 14:24 ` Jens Axboe
2009-01-20 15:42 ` Mathieu Desnoyers
2009-01-20 23:06 ` Mathieu Desnoyers
2009-01-20 23:27 ` Mathieu Desnoyers
2009-01-21 0:25 ` Mathieu Desnoyers
2009-01-21 4:38 ` Ben Gamari
2009-01-21 4:54 ` [ltt-dev] " Mathieu Desnoyers
2009-01-21 6:17 ` Ben Gamari
2009-01-22 22:59 ` Mathieu Desnoyers
2009-01-23 3:21 ` [ltt-dev] " KOSAKI Motohiro
2009-01-23 4:03 ` Mathieu Desnoyers
2009-02-10 3:36 ` [PATCH] mm fix page writeback accounting to fix oom condition under heavy I/O Mathieu Desnoyers
2009-02-10 3:55 ` Nick Piggin
2009-02-10 5:23 ` Linus Torvalds
2009-02-10 5:56 ` Nick Piggin
2009-02-10 6:12 ` Mathieu Desnoyers
2009-02-02 2:08 ` [RFC PATCH] block: Fix bio merge induced high I/O latency Mathieu Desnoyers
2009-02-02 11:26 ` Jens Axboe
2009-02-03 0:46 ` Mathieu Desnoyers
2009-01-20 13:45 ` [ltt-dev] " Mathieu Desnoyers
2009-01-20 20:22 ` Ben Gamari
2009-01-20 22:23 ` Ben Gamari
2009-01-20 23:05 ` Mathieu Desnoyers
2009-01-22 2:35 ` Ben Gamari
2009-01-19 15:45 ` Nikanth K
2009-01-19 18:23 ` Jens Axboe
2009-01-17 20:03 ` Ben Gamari
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090117190437.GZ30821@kernel.dk \
--to=jens.axboe@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=andrea@suse.de \
--cc=linux-kernel@vger.kernel.org \
--cc=ltt-dev@lists.casi.polymtl.ca \
--cc=mathieu.desnoyers@polymtl.ca \
--cc=mingo@elte.hu \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).