From: Jens Axboe <jens.axboe@oracle.com>
To: "Vitaly V. Bursov" <vitalyb@telenet.dn.ua>
Cc: Jeff Moyer <jmoyer@redhat.com>, linux-kernel@vger.kernel.org
Subject: Re: Slow file transfer speeds with CFQ IO scheduler in some cases
Date: Mon, 10 Nov 2008 19:29:28 +0100
Message-ID: <20081110182928.GN26778@kernel.dk>
In-Reply-To: <49187D05.9050407@telenet.dn.ua>
On Mon, Nov 10 2008, Vitaly V. Bursov wrote:
> Jens Axboe wrote:
> > On Mon, Nov 10 2008, Vitaly V. Bursov wrote:
> >> Jens Axboe wrote:
> >>> On Mon, Nov 10 2008, Jeff Moyer wrote:
> >>>> Jens Axboe <jens.axboe@oracle.com> writes:
> >>>>
> >>>>> http://bugzilla.kernel.org/attachment.cgi?id=18473&action=view
> >>>> Funny, I was going to ask the same question. ;) The reason Jens wants
> >>>> you to try this patch is that nfsd may be farming off the I/O requests
> >>>> to different threads which are then performing interleaved I/O. The
> >>>> above patch tries to detect this and allow cooperating processes to get
> >>>> disk time instead of waiting for the idle timeout.
> >>> Precisely :-)
> >>>
> >>> The only reason I haven't merged it yet is worry about the extra cost,
> >>> but I'll throw some SSD love at it and see how it turns out.
> >>>
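To make the cooperating-queues idea above a bit more concrete, here is a
minimal userspace sketch of the detection logic. It is not the cfq-iosched.c
code from the patch; struct queue, CLOSE_THRESHOLD and pick_cooperator are
invented for illustration and the threshold value is arbitrary:

#include <stdio.h>

#define CLOSE_THRESHOLD	(8 * 1024 * 2)	/* "close" = within ~8 MB, in 512-byte sectors */

struct queue {
	const char *owner;		/* e.g. one nfsd thread */
	unsigned long long next_sector;	/* sector of its next pending request */
	int has_request;
};

/* Seek distance between the current head position and a queue's next request. */
static unsigned long long seek_distance(unsigned long long head,
					const struct queue *q)
{
	return q->next_sector > head ? q->next_sector - head
				     : head - q->next_sector;
}

/*
 * If some other queue has a pending request near the current head position,
 * return it so it can be serviced right away; otherwise return NULL, meaning
 * "no close cooperator, idle and wait for the current queue as before".
 */
static struct queue *pick_cooperator(unsigned long long head,
				     struct queue *queues, int nr)
{
	struct queue *best = NULL;
	unsigned long long best_dist = CLOSE_THRESHOLD;

	for (int i = 0; i < nr; i++) {
		unsigned long long d;

		if (!queues[i].has_request)
			continue;
		d = seek_distance(head, &queues[i]);
		if (d <= best_dist) {
			best_dist = d;
			best = &queues[i];
		}
	}
	return best;
}

int main(void)
{
	/* Two nfsd threads reading the same file in interleaved chunks. */
	struct queue queues[] = {
		{ "nfsd/0", 10240, 1 },
		{ "nfsd/1", 10496, 1 },
	};
	struct queue *coop = pick_cooperator(10368, queues, 2);

	if (coop)
		printf("dispatch from %s instead of idling\n", coop->owner);
	else
		printf("no close cooperator, idle for the current queue\n");
	return 0;
}

The real patch has to make this decision inside CFQ itself (the prio tree
manipulation visible in the oops below is part of that machinery), but the
trade-off is the same: service a nearby cooperating queue rather than sit
out the idle timeout.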
> >> Sorry, but I get an "oops" the moment an NFS read transfer starts.
> >> I can get a directory listing via NFS and read files locally (not
> >> carefully tested, though).
> >>
> >> The dumps were captured via netconsole, so they may not be completely
> >> accurate, but hopefully they will give a hint.
> >
> > Interesting, strange how that hasn't triggered here. Or perhaps the
> > version that Jeff posted isn't the one I tried. Anyway, search for:
> >
> > RB_CLEAR_NODE(&cfqq->rb_node);
> >
> > and add a
> >
> > RB_CLEAR_NODE(&cfqq->prio_node);
> >
> > just below that. It's in cfq_find_alloc_queue(). I think that should fix
> > it.
> >
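Spelled out, the suggested change is a single extra line next to the existing
node initialisation. The context below is approximate (reconstructed only from
the names mentioned in this thread); the added RB_CLEAR_NODE(&cfqq->prio_node)
line is the actual suggestion:

	/* in cfq_find_alloc_queue(), where the new cfqq is set up */
	RB_CLEAR_NODE(&cfqq->rb_node);
	/*
	 * Also mark the prio tree node as "not on any tree" (RB_CLEAR_NODE
	 * points the node's parent at itself), so later prio tree updates
	 * don't hand an uninitialised node to rb_erase().
	 */
	RB_CLEAR_NODE(&cfqq->prio_node);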
>
> Same problem.
>
> I ran "make clean; make -j3; sync;" twice on the patched kernel and it went OK,
> but it won't boot anymore with cfq, failing with the same error...
>
> Switching to the cfq I/O scheduler at runtime (after booting with "as") appears
> to work with two parallel local dd reads.
>
> But when the NFS server starts up:
>
> [ 469.000105] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> [ 469.000305] IP: [<ffffffff81111f2a>] rb_erase+0x124/0x290
> ...
>
> [ 469.001905] Pid: 2296, comm: md1_raid5 Not tainted 2.6.27.5 #4
> [ 469.001982] RIP: 0010:[<ffffffff81111f2a>]  [<ffffffff81111f2a>] rb_erase+0x124/0x290
> ...
> [ 469.002509] Call Trace:
> [ 469.002509] [<ffffffff8110a0b9>] ? rb_erase_init+0x9/0x17
> [ 469.002509] [<ffffffff8110a0ff>] ? cfq_prio_tree_add+0x38/0xa8
> [ 469.002509] [<ffffffff8110b13d>] ? cfq_add_rq_rb+0xb5/0xc8
> [ 469.002509] [<ffffffff8110b1aa>] ? cfq_insert_request+0x5a/0x356
> [ 469.002509] [<ffffffff811000a1>] ? elv_insert+0x14b/0x218
> [ 469.002509] [<ffffffff810ab757>] ? bio_phys_segments+0xf/0x15
> [ 469.002509] [<ffffffff811028dc>] ? __make_request+0x3b9/0x3eb
> [ 469.002509] [<ffffffff8110120c>] ? generic_make_request+0x30b/0x346
> [ 469.002509] [<ffffffff811baaf4>] ? raid5_end_write_request+0x0/0xb8
> [ 469.002509] [<ffffffff811b8ade>] ? ops_run_io+0x16a/0x1c1
> [ 469.002509] [<ffffffff811ba534>] ? handle_stripe5+0x9b5/0x9d6
> [ 469.002509] [<ffffffff811bbf08>] ? handle_stripe+0xc3a/0xc6a
> [ 469.002509] [<ffffffff810296e5>] ? pick_next_task_fair+0x8d/0x9c
> [ 469.002509] [<ffffffff81253792>] ? thread_return+0x3a/0xaa
> [ 469.002509] [<ffffffff811bc2ce>] ? raid5d+0x396/0x3cd
> [ 469.002509] [<ffffffff81253bd8>] ? schedule_timeout+0x1e/0xad
> [ 469.002509] [<ffffffff811c716f>] ? md_thread+0xdd/0xf9
> [ 469.002509] [<ffffffff81044f9c>] ? autoremove_wake_function+0x0/0x2e
> [ 469.002509] [<ffffffff811c7092>] ? md_thread+0x0/0xf9
> [ 469.002509] [<ffffffff81044e80>] ? kthread+0x47/0x73
> [ 469.002509] [<ffffffff8102f867>] ? schedule_tail+0x28/0x60
> [ 469.002509] [<ffffffff8100cda9>] ? child_rip+0xa/0x11
> [ 469.002509] [<ffffffff81044e39>] ? kthread+0x0/0x73
> [ 469.002509] [<ffffffff8100cd9f>] ? child_rip+0x0/0x11
> ...
> [ 469.002509] RIP  [<ffffffff81111f2a>] rb_erase+0x124/0x290
> [ 469.002509] RSP <ffff88011d4c7a58>
> [ 469.002509] CR2: 0000000000000000
> [ 469.002509] ---[ end trace acdef779aeb56048 ]---
>
>
> "Best" result I got with NFS was
> avg-cpu: %user %nice %system %iowait %steal %idle
> 0,00 0,00 0,20 0,65 0,00 99,15
>
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
> sda 11,30 0,00 7,60 0,00 245,60 0,00 32,32 0,01 1,18 0,79 0,60
> sdb 12,10 0,00 8,00 0,00 246,40 0,00 30,80 0,01 1,62 0,62 0,50
>
> and it lasted around 30 seconds.
OK, I'll throw some NFS at this patch in the morning and do some
measurements as well, so it can get queued up.
--
Jens Axboe
Thread overview: 70+ messages in thread
2008-11-09 18:04 Slow file transfer speeds with CFQ IO scheduler in some cases Vitaly V. Bursov
2008-11-09 18:30 ` Alexey Dobriyan
2008-11-09 18:32 ` Vitaly V. Bursov
2008-11-10 10:44 ` Jens Axboe
2008-11-10 13:51 ` Jeff Moyer
2008-11-10 13:56 ` Jens Axboe
2008-11-10 17:16 ` Vitaly V. Bursov
2008-11-10 17:35 ` Jens Axboe
2008-11-10 18:27 ` Vitaly V. Bursov
2008-11-10 18:29 ` Jens Axboe [this message]
2008-11-10 18:39 ` Jeff Moyer
2008-11-10 18:42 ` Jens Axboe
2008-11-10 21:51 ` Jeff Moyer
2008-11-11 9:34 ` Jens Axboe
2008-11-11 9:35 ` Jens Axboe
2008-11-11 11:52 ` Jens Axboe
2008-11-11 16:48 ` Jeff Moyer
2008-11-11 18:08 ` Jens Axboe
2008-11-11 16:53 ` Vitaly V. Bursov
2008-11-11 18:06 ` Jens Axboe
2008-11-11 19:36 ` Jeff Moyer
2008-11-11 21:41 ` Jeff Layton
2008-11-11 21:59 ` Jeff Layton
2008-11-12 12:20 ` Jens Axboe
2008-11-12 12:45 ` Jeff Layton
2008-11-12 12:54 ` Christoph Hellwig
2008-11-11 19:42 ` Vitaly V. Bursov
2008-11-12 18:32 ` Jeff Moyer
2008-11-12 19:02 ` Jens Axboe
2008-11-13 8:51 ` Wu Fengguang
2008-11-13 8:54 ` Jens Axboe
2008-11-14 1:36 ` Wu Fengguang
2008-11-25 11:02 ` Vladislav Bolkhovitin
2008-11-25 11:25 ` Wu Fengguang
2008-11-25 15:21 ` Jeff Moyer
2008-11-25 16:17 ` Vladislav Bolkhovitin
2008-11-13 18:46 ` Vitaly V. Bursov
2008-11-25 10:59 ` Vladislav Bolkhovitin
2008-11-25 11:30 ` Wu Fengguang
2008-11-25 11:41 ` Vladislav Bolkhovitin
2008-11-25 11:49 ` Wu Fengguang
2008-11-25 12:03 ` Vladislav Bolkhovitin
2008-11-25 12:09 ` Vladislav Bolkhovitin
2008-11-25 12:15 ` Wu Fengguang
2008-11-27 17:46 ` Vladislav Bolkhovitin
2008-11-28 0:48 ` Wu Fengguang
2009-02-12 18:35 ` Vladislav Bolkhovitin
2009-02-13 1:57 ` Wu Fengguang
2009-02-13 20:08 ` Vladislav Bolkhovitin
2009-02-16 2:34 ` Wu Fengguang
2009-02-17 19:03 ` Vladislav Bolkhovitin
2009-02-18 18:14 ` Vladislav Bolkhovitin
2009-02-19 1:35 ` Wu Fengguang
2009-02-17 19:01 ` Vladislav Bolkhovitin
2009-02-19 2:05 ` Wu Fengguang
2009-03-19 17:44 ` Vladislav Bolkhovitin
2009-03-20 8:53 ` Vladislav Bolkhovitin
2009-03-23 1:42 ` Wu Fengguang
2009-04-21 18:18 ` Vladislav Bolkhovitin
2009-04-24 8:43 ` Wu Fengguang
2009-05-12 18:13 ` Vladislav Bolkhovitin
2009-02-17 19:01 ` Vladislav Bolkhovitin
2009-02-19 1:38 ` Wu Fengguang
2008-11-24 15:33 ` Jeff Moyer
2008-11-24 18:13 ` Jens Axboe
2008-11-24 18:50 ` Jeff Moyer
2008-11-24 18:51 ` Jens Axboe
2008-11-13 6:54 ` Vitaly V. Bursov
2008-11-13 14:32 ` Jeff Moyer
2008-11-13 18:33 ` Vitaly V. Bursov