Re: [RFC PATCH] block: Fix bio merge induced high I/O latency

From: Jens Axboe <jens.axboe@oracle.com>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: akpm@linux-foundation.org, Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, ltt-dev@lists.casi.polymtl.ca
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency
Date: Mon, 2 Feb 2009 12:26:01 +0100
Message-ID: <20090202112600.GE30821@kernel.dk>
In-Reply-To: <20090202020811.GA19565@Krystal>

On Sun, Feb 01 2009, Mathieu Desnoyers wrote:
> Hi Jens,
> 
> I tried your patch at
> 
> http://bugzilla.kernel.org/attachment.cgi?id=20001
> 
> on a 2.6.29-rc3 kernel. I get the following oops just after I start
> running the fio test. It happens after a few
> 
> cfq: moving ffff88043d4b42e0 to dispatch
> cfq: moving ffff88043d4b4170 to dispatch
> 
> messages (~20).
> 
> Here is the oops:
> 
> ------------[ cut here ]------------
> kernel BUG at block/cfq-iosched.c:650!
> invalid opcode: 0000 [#1] PREEMPT SMP
> LTT NESTING LEVEL : 0
> last sysfs file: /sys/block/sda/stat
> CPU 2
> Modules linked in: loop ltt_tracer ltt_trace_control ltt_userspa]
> Pid: 2934, comm: kjournald Not tainted 2.6.29-rc3 #3
> RIP: 0010:[<ffffffff80419c2b>]  [<ffffffff80419c2b>] cfq_remove_0
> RSP: 0018:ffff88043b167c20  EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffff88043fd9e088 RCX: 0000000000000001
> RDX: 0000000000000010 RSI: ffff88043887b590 RDI: ffff88043887b590
> RBP: ffff88043b167c50 R08: 0000000000000002 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88043fd9e088
> R13: ffff88043887b590 R14: ffff88043fc40200 R15: ffff88043fd9e088
> FS:  0000000000000000(0000) GS:ffff88043e81a080(0000) knlGS:00000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00007f2a5f98b8c0 CR3: 000000043e8c4000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process kjournald (pid: 2934, threadinfo ffff88043b166000, task )
> Stack:
>  000000000000003b ffff88043887b590 ffff88043fd9e088 ffff88043e5a0
>  ffff88043fc40200 ffff88002809ed50 ffff88043b167c80 ffffffff8041d
>  0000000000000001 ffff88043887b590 ffffe2001b805138 ffff88043e5a0
> Call Trace:
>  [<ffffffff80419e4d>] cfq_dispatch_insert+0x3d/0x70
>  [<ffffffff80419f2f>] cfq_wait_on_page+0xaf/0xc0
>  [<ffffffff804098ed>] elv_wait_on_page+0x1d/0x20
>  [<ffffffff8040d207>] blk_backing_dev_wop+0x17/0x50
>  [<ffffffff80301872>] sync_buffer+0x52/0x80
>  [<ffffffff806a33b2>] __wait_on_bit+0x62/0x90
>  [<ffffffff80301820>] ? sync_buffer+0x0/0x80
>  [<ffffffff80301820>] ? sync_buffer+0x0/0x80
>  [<ffffffff806a3459>] out_of_line_wait_on_bit+0x79/0x90
>  [<ffffffff8025a8a0>] ? wake_bit_function+0x0/0x50
>  [<ffffffff80301769>] __wait_on_buffer+0xf9/0x130
>  [<ffffffff80379acd>] journal_commit_transaction+0x72d/0x1650
>  [<ffffffff806a5c87>] ? _spin_unlock_irqrestore+0x47/0x80
>  [<ffffffff8024dd2f>] ? try_to_del_timer_sync+0x5f/0x70
>  [<ffffffff8037e488>] kjournald+0xe8/0x250
>  [<ffffffff8025a860>] ? autoremove_wake_function+0x0/0x40
>  [<ffffffff8037e3a0>] ? kjournald+0x0/0x250
>  [<ffffffff8025a38e>] kthread+0x4e/0x90
>  [<ffffffff8025a340>] ? kthread+0x0/0x90
>  [<ffffffff8020db2a>] child_rip+0xa/0x20
>  [<ffffffff8020d480>] ? restore_args+0x0/0x30
>  [<ffffffff8025a340>] ? kthread+0x0/0x90
>  [<ffffffff8020db20>] ? child_rip+0x0/0x20
> Code: 4d 89 6d 00 49 8b 9d c0 00 00 00 41 8b 45 48 4c 8b 73 08 2
> RIP  [<ffffffff80419c2b>] cfq_remove_request+0x6b/0x250
>  RSP <ffff88043b167c20>
> ---[ end trace eab134a8bd405d05 ]---
> 
> It seems that either the cfqq->queued[sync] counter should be
> incremented/decremented in the new cfq_wait_on_page, or the change in
> request type (sync vs !sync) is not being handled correctly. I have
> not looked at the code enough to find out exactly what is happening,
> but I thought you might have an idea of the cause.

Just ignore the patch for now; I'm not going to spend more time on it.
It was just an attempt at a quick test, and I don't think this approach
is very feasible since it doesn't appear to be the root of the problem.
In any case, were we to continue on this path, the accounting logic in
CFQ would have to be adjusted for this new behaviour. Otherwise there's
a big risk of giving great preference to async writeout once things get
tight.

It's also working around the real problem for this specific issue, which
is that you just don't want to have sync apps blocked waiting for async
writeout in the first place.
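
To make the accounting concern concrete, here is a minimal user-space
sketch of the counter mismatch Mathieu suspects: a request is counted
under one class (sync or async) when it is queued, then looked up under
the other class when it is removed. The toy_* names and structures are
hypothetical stand-ins, not the CFQ code, and the sketch assumes the BUG
at block/cfq-iosched.c:650 is a check on an empty queued[] slot, which
this thread does not confirm.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are NOT the kernel's types. */
struct toy_queue {
	int queued[2];	/* [0] = async requests, [1] = sync requests */
};

struct toy_request {
	bool sync;	/* classification as seen by the accounting code */
};

/* Count the request under its current classification. */
static void toy_add(struct toy_queue *q, const struct toy_request *rq)
{
	q->queued[rq->sync ? 1 : 0]++;
}

/*
 * Uncount the request under its current classification.
 * Returns false when the slot is already empty, which is the point
 * where (by assumption) the real code would trip its BUG check.
 */
static bool toy_remove(struct toy_queue *q, const struct toy_request *rq)
{
	int idx = rq->sync ? 1 : 0;

	if (q->queued[idx] == 0)
		return false;
	q->queued[idx]--;
	return true;
}
```

If a request is added as async and its flag flips to sync before
toy_remove() runs, the sync slot is empty and the removal fails, while
the async slot stays elevated. Any path that changes the classification
would therefore need to move the count between slots at the same time.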

-- 
Jens Axboe

Thread overview: 39+ messages
2009-01-17  0:44 [Regression] High latency when doing large I/O Mathieu Desnoyers
2009-01-17 16:26 ` [RFC PATCH] block: Fix bio merge induced high I/O latency Mathieu Desnoyers
2009-01-17 16:50   ` Leon Woestenberg
2009-01-17 17:15     ` Mathieu Desnoyers
2009-01-17 19:04   ` Jens Axboe
2009-01-18 21:12     ` Mathieu Desnoyers
2009-01-18 21:27       ` Mathieu Desnoyers
2009-01-19 18:26       ` Jens Axboe
2009-01-20  2:10         ` Mathieu Desnoyers
2009-01-20  7:37           ` Jens Axboe
2009-01-20 12:28             ` Jens Axboe
2009-01-20 14:22               ` [ltt-dev] " Mathieu Desnoyers
2009-01-20 14:24                 ` Jens Axboe
2009-01-20 15:42                   ` Mathieu Desnoyers
2009-01-20 23:06                     ` Mathieu Desnoyers
2009-01-20 23:27               ` Mathieu Desnoyers
2009-01-21  0:25                 ` Mathieu Desnoyers
2009-01-21  4:38                   ` Ben Gamari
2009-01-21  4:54                     ` [ltt-dev] " Mathieu Desnoyers
2009-01-21  6:17                       ` Ben Gamari
2009-01-22 22:59                   ` Mathieu Desnoyers
2009-01-23  3:21                 ` [ltt-dev] " KOSAKI Motohiro
2009-01-23  4:03                   ` Mathieu Desnoyers
2009-02-10  3:36                   ` [PATCH] mm fix page writeback accounting to fix oom condition under heavy I/O Mathieu Desnoyers
2009-02-10  3:55                     ` Nick Piggin
2009-02-10  5:23                     ` Linus Torvalds
2009-02-10  5:56                       ` Nick Piggin
2009-02-10  6:12                       ` Mathieu Desnoyers
2009-02-02  2:08               ` [RFC PATCH] block: Fix bio merge induced high I/O latency Mathieu Desnoyers
2009-02-02 11:26                 ` Jens Axboe [this message]
2009-02-03  0:46                   ` Mathieu Desnoyers
2009-01-20 13:45             ` [ltt-dev] " Mathieu Desnoyers
2009-01-20 20:22             ` Ben Gamari
2009-01-20 22:23               ` Ben Gamari
2009-01-20 23:05                 ` Mathieu Desnoyers
2009-01-22  2:35               ` Ben Gamari
2009-01-19 15:45     ` Nikanth K
2009-01-19 18:23       ` Jens Axboe
2009-01-17 20:03   ` Ben Gamari
