linux-kernel.vger.kernel.org archive mirror
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: akpm@linux-foundation.org, Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, ltt-dev@lists.casi.polymtl.ca
Subject: Re: [RFC PATCH] block: Fix bio merge induced high I/O latency
Date: Sun, 1 Feb 2009 21:08:11 -0500
Message-ID: <20090202020811.GA19565@Krystal>
In-Reply-To: <20090120122855.GF30821@kernel.dk>

Hi Jens,

I tried your patch at

http://bugzilla.kernel.org/attachment.cgi?id=20001

on a 2.6.29-rc3 kernel. I get the following oops just after I start
running the fio test. It happens after a few

cfq: moving ffff88043d4b42e0 to dispatch
cfq: moving ffff88043d4b4170 to dispatch

messages (~20).

Here is the oops:

------------[ cut here ]------------
kernel BUG at block/cfq-iosched.c:650!
invalid opcode: 0000 [#1] PREEMPT SMP
LTT NESTING LEVEL : 0
last sysfs file: /sys/block/sda/stat
CPU 2
Modules linked in: loop ltt_tracer ltt_trace_control ltt_userspa]
Pid: 2934, comm: kjournald Not tainted 2.6.29-rc3 #3
RIP: 0010:[<ffffffff80419c2b>]  [<ffffffff80419c2b>] cfq_remove_0
RSP: 0018:ffff88043b167c20  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88043fd9e088 RCX: 0000000000000001
RDX: 0000000000000010 RSI: ffff88043887b590 RDI: ffff88043887b590
RBP: ffff88043b167c50 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88043fd9e088
R13: ffff88043887b590 R14: ffff88043fc40200 R15: ffff88043fd9e088
FS:  0000000000000000(0000) GS:ffff88043e81a080(0000) knlGS:00000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f2a5f98b8c0 CR3: 000000043e8c4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kjournald (pid: 2934, threadinfo ffff88043b166000, task )
Stack:
 000000000000003b ffff88043887b590 ffff88043fd9e088 ffff88043e5a0
 ffff88043fc40200 ffff88002809ed50 ffff88043b167c80 ffffffff8041d
 0000000000000001 ffff88043887b590 ffffe2001b805138 ffff88043e5a0
Call Trace:
 [<ffffffff80419e4d>] cfq_dispatch_insert+0x3d/0x70
 [<ffffffff80419f2f>] cfq_wait_on_page+0xaf/0xc0
 [<ffffffff804098ed>] elv_wait_on_page+0x1d/0x20
 [<ffffffff8040d207>] blk_backing_dev_wop+0x17/0x50
 [<ffffffff80301872>] sync_buffer+0x52/0x80
 [<ffffffff806a33b2>] __wait_on_bit+0x62/0x90
 [<ffffffff80301820>] ? sync_buffer+0x0/0x80
 [<ffffffff80301820>] ? sync_buffer+0x0/0x80
 [<ffffffff806a3459>] out_of_line_wait_on_bit+0x79/0x90
 [<ffffffff8025a8a0>] ? wake_bit_function+0x0/0x50
 [<ffffffff80301769>] __wait_on_buffer+0xf9/0x130
 [<ffffffff80379acd>] journal_commit_transaction+0x72d/0x1650
 [<ffffffff806a5c87>] ? _spin_unlock_irqrestore+0x47/0x80
 [<ffffffff8024dd2f>] ? try_to_del_timer_sync+0x5f/0x70
 [<ffffffff8037e488>] kjournald+0xe8/0x250
 [<ffffffff8025a860>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8037e3a0>] ? kjournald+0x0/0x250
 [<ffffffff8025a38e>] kthread+0x4e/0x90
 [<ffffffff8025a340>] ? kthread+0x0/0x90
 [<ffffffff8020db2a>] child_rip+0xa/0x20
 [<ffffffff8020d480>] ? restore_args+0x0/0x30
 [<ffffffff8025a340>] ? kthread+0x0/0x90
 [<ffffffff8020db20>] ? child_rip+0x0/0x20
Code: 4d 89 6d 00 49 8b 9d c0 00 00 00 41 8b 45 48 4c 8b 73 08 2
RIP  [<ffffffff80419c2b>] cfq_remove_request+0x6b/0x250
 RSP <ffff88043b167c20>
---[ end trace eab134a8bd405d05 ]---

It seems that either the cfqq->queued[sync] counter should be
incremented/decremented in the new cfq_wait_on_page, or the change in
request type (sync vs !sync) is not handled correctly. I have not
looked at the code enough to find out exactly what is happening, but I
thought you might have an idea of the cause.
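
To make the second hypothesis concrete, here is a minimal user-space
sketch. It is not the kernel code: toy_queue, toy_request, toy_add and
toy_del are made-up names, and it assumes (I have not confirmed this)
that the check at cfq-iosched.c:650 verifies that the counter for the
request's class is non-zero before decrementing it. It only shows how a
request that is accounted under one class and removed under the other
would trip such a check.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's structures. */
struct toy_queue {
	int queued[2];		/* [0] = async, [1] = sync */
};

struct toy_request {
	bool sync;		/* class the request currently claims */
};

/* Account a request under its current class. */
static void toy_add(struct toy_queue *q, const struct toy_request *rq)
{
	q->queued[rq->sync]++;
}

/*
 * Remove a request. Returns false where a "counter must be non-zero"
 * check would fire, i.e. when nothing is accounted under the class
 * the request claims at removal time.
 */
static bool toy_del(struct toy_queue *q, const struct toy_request *rq)
{
	if (!q->queued[rq->sync])
		return false;
	q->queued[rq->sync]--;
	return true;
}
```

In this model, adding a request while rq.sync is false, then setting
rq.sync to true and calling toy_del() returns false, because queued[1]
was never incremented for that request.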

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
