From: Jens Axboe <jens.axboe@oracle.com>
To: "Vitaly V. Bursov" <vitalyb@telenet.dn.ua>
Cc: Jeff Moyer <jmoyer@redhat.com>, linux-kernel@vger.kernel.org
Subject: Re: Slow file transfer speeds with CFQ IO scheduler in some cases
Date: Mon, 10 Nov 2008 19:42:42 +0100
Message-ID: <20081110184241.GO26778@kernel.dk>
In-Reply-To: <20081110182928.GN26778@kernel.dk>
On Mon, Nov 10 2008, Jens Axboe wrote:
> On Mon, Nov 10 2008, Vitaly V. Bursov wrote:
> > Jens Axboe wrote:
> > > On Mon, Nov 10 2008, Vitaly V. Bursov wrote:
> > >> Jens Axboe wrote:
> > >>> On Mon, Nov 10 2008, Jeff Moyer wrote:
> > >>>> Jens Axboe <jens.axboe@oracle.com> writes:
> > >>>>
> > >>>>> http://bugzilla.kernel.org/attachment.cgi?id=18473&action=view
> > >>>> Funny, I was going to ask the same question. ;) The reason Jens wants
> > >>>> you to try this patch is that nfsd may be farming off the I/O requests
> > >>>> to different threads which are then performing interleaved I/O. The
> > >>>> above patch tries to detect this and allow cooperating processes to get
> > >>>> disk time instead of waiting for the idle timeout.
> > >>> Precisely :-)
> > >>>
> > >>> The only reason I haven't merged it yet is because of worry of extra
> > >>> cost, but I'll throw some SSD love at it and see how it turns out.
> > >>>
> > >> Sorry, but I get "oops" same moment nfs read transfer starts.
> > >> I can get directory list via nfs, read files locally (not
> > >> carefully tested, though)
> > >>
> > >> Dumps captured via netconsole, so these may not be completely accurate
> > >> but hopefully will give a hint.
> > >
> > > Interesting, strange how that hasn't triggered here. Or perhaps the
> > > version that Jeff posted isn't the one I tried. Anyway, search for:
> > >
> > > RB_CLEAR_NODE(&cfqq->rb_node);
> > >
> > > and add a
> > >
> > > RB_CLEAR_NODE(&cfqq->prio_node);
> > >
> > > just below that. It's in cfq_find_alloc_queue(). I think that should fix
> > > it.
> > >
> >
> > Same problem.
> >
> > I did make clean; make -j3; sync; (twice) on the patched kernel and it went OK,
> > but it won't boot anymore with cfq; it fails with the same error...
> >
> > Switching cfq io scheduler at runtime (booting with "as") appears to work with
> > two parallel local dd reads.
> >
> > But when NFS server starts up:
> >
> > [ 469.000105] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> > [ 469.000305] IP: [<ffffffff81111f2a>] rb_erase+0x124/0x290
> > ...
> >
> > [ 469.001905] Pid: 2296, comm: md1_raid5 Not tainted 2.6.27.5 #4
> > [ 469.001905] RIP: 0010:[<ffffffff81111f2a>] [<ffffffff81111f2a>] rb_erase+0x124/0x290
> > ...
> > [ 469.002509] Call Trace:
> > [ 469.002509] [<ffffffff8110a0b9>] ? rb_erase_init+0x9/0x17
> > [ 469.002509] [<ffffffff8110a0ff>] ? cfq_prio_tree_add+0x38/0xa8
> > [ 469.002509] [<ffffffff8110b13d>] ? cfq_add_rq_rb+0xb5/0xc8
> > [ 469.002509] [<ffffffff8110b1aa>] ? cfq_insert_request+0x5a/0x356
> > [ 469.002509] [<ffffffff811000a1>] ? elv_insert+0x14b/0x218
> > [ 469.002509] [<ffffffff810ab757>] ? bio_phys_segments+0xf/0x15
> > [ 469.002509] [<ffffffff811028dc>] ? __make_request+0x3b9/0x3eb
> > [ 469.002509] [<ffffffff8110120c>] ? generic_make_request+0x30b/0x346
> > [ 469.002509] [<ffffffff811baaf4>] ? raid5_end_write_request+0x0/0xb8
> > [ 469.002509] [<ffffffff811b8ade>] ? ops_run_io+0x16a/0x1c1
> > [ 469.002509] [<ffffffff811ba534>] ? handle_stripe5+0x9b5/0x9d6
> > [ 469.002509] [<ffffffff811bbf08>] ? handle_stripe+0xc3a/0xc6a
> > [ 469.002509] [<ffffffff810296e5>] ? pick_next_task_fair+0x8d/0x9c
> > [ 469.002509] [<ffffffff81253792>] ? thread_return+0x3a/0xaa
> > [ 469.002509] [<ffffffff811bc2ce>] ? raid5d+0x396/0x3cd
> > [ 469.002509] [<ffffffff81253bd8>] ? schedule_timeout+0x1e/0xad
> > [ 469.002509] [<ffffffff811c716f>] ? md_thread+0xdd/0xf9
> > [ 469.002509] [<ffffffff81044f9c>] ? autoremove_wake_function+0x0/0x2e
> > [ 469.002509] [<ffffffff811c7092>] ? md_thread+0x0/0xf9
> > [ 469.002509] [<ffffffff81044e80>] ? kthread+0x47/0x73
> > [ 469.002509] [<ffffffff8102f867>] ? schedule_tail+0x28/0x60
> > [ 469.002509] [<ffffffff8100cda9>] ? child_rip+0xa/0x11
> > [ 469.002509] [<ffffffff81044e39>] ? kthread+0x0/0x73
> > [ 469.002509] [<ffffffff8100cd9f>] ? child_rip+0x0/0x11
> > ...
> > [ 469.002509] RIP [<ffffffff81111f2a>] rb_erase+0x124/0x290
> > [ 469.002509] RSP <ffff88011d4c7a58>
> > [ 469.002509] CR2: 0000000000000000
> > [ 469.002509] ---[ end trace acdef779aeb56048 ]---
> >
> >
> > "Best" result I got with NFS was
> > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >            0,00    0,00    0,20    0,65    0,00   99,15
> >
> > Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
> > sda       11,30    0,00  7,60  0,00  245,60    0,00    32,32     0,01   1,18   0,79   0,60
> > sdb       12,10    0,00  8,00  0,00  246,40    0,00    30,80     0,01   1,62   0,62   0,50
> >
> > and it lasted around 30 seconds.
>
> OK, I'll throw some NFS at this patch in the morning and do some
> measurements as well, so it can get queued up.
I spotted a bug: if the ioprio of a process gets changed, its queue needs
to be repositioned in the cooperator tree, or we'll end up doing the erase
on the wrong root. Perhaps that is what is biting you here.
--
Jens Axboe
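
Editorial illustration, not part of the original message and not the CFQ
patch itself: the bug described above (erasing a queue from the root that
matches its new priority rather than the root it was actually inserted
into) can be sketched in plain userspace C. Everything here is a
hypothetical stand-in: a doubly-linked list per priority level replaces the
kernel rbtree, and the names struct prio_root, prio_tree_add() and
set_ioprio() are invented for the sketch. The idea is only that the queue
remembers which root it is linked into, and a priority change erases from
that remembered root before re-inserting.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PRIO 8 /* hypothetical number of priority levels */

struct queue;

/* Stand-in for an rb_root: one list head per priority level. */
struct prio_root {
    struct queue *head;
};

struct queue {
    int ioprio;
    struct prio_root *root; /* root we are currently linked into, or NULL */
    struct queue *prev, *next;
};

static struct prio_root roots[NR_PRIO];

/* Unlink q from the root it was actually inserted into. */
static void root_erase(struct queue *q)
{
    assert(q->root != NULL);
    if (q->prev)
        q->prev->next = q->next;
    else
        q->root->head = q->next;
    if (q->next)
        q->next->prev = q->prev;
    q->prev = q->next = NULL;
    q->root = NULL;
}

/* Link q at the head of the given root and remember that root. */
static void root_insert(struct prio_root *root, struct queue *q)
{
    q->prev = NULL;
    q->next = root->head;
    if (root->head)
        root->head->prev = q;
    root->head = q;
    q->root = root;
}

/* (Re)position q under the root for its current priority. The erase uses
 * the remembered root, never roots[q->ioprio], which may already differ. */
static void prio_tree_add(struct queue *q)
{
    if (q->root)
        root_erase(q);
    root_insert(&roots[q->ioprio], q);
}

/* Change the priority and reposition the queue in the same step. */
static void set_ioprio(struct queue *q, int ioprio)
{
    assert(ioprio >= 0 && ioprio < NR_PRIO);
    q->ioprio = ioprio;
    prio_tree_add(q);
}
```

With this shape, a queue added at priority 4 and then moved with
set_ioprio(&q, 2) leaves roots[4] empty and ends up linked under roots[2].
Erasing via roots[q->ioprio] after the priority had already changed would
instead operate on a root the queue was never part of.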