From: Jens Axboe <jens.axboe@oracle.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: "Vitaly V. Bursov" <vitalyb@telenet.dn.ua>, linux-kernel@vger.kernel.org
Subject: Re: Slow file transfer speeds with CFQ IO scheduler in some cases
Date: Tue, 11 Nov 2008 19:08:52 +0100
Message-ID: <20081111180851.GD26778@kernel.dk>
In-Reply-To: <x49fxlyp5te.fsf@segfault.boston.devel.redhat.com>
On Tue, Nov 11 2008, Jeff Moyer wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
>
> > On Tue, Nov 11 2008, Jens Axboe wrote:
> >> On Tue, Nov 11 2008, Jens Axboe wrote:
> >> > On Mon, Nov 10 2008, Jeff Moyer wrote:
> >> > > "Vitaly V. Bursov" <vitalyb@telenet.dn.ua> writes:
> >> > >
> >> > > > Jens Axboe wrote:
> >> > > >> On Mon, Nov 10 2008, Vitaly V. Bursov wrote:
> >> > > >>> Jens Axboe wrote:
> >> > > >>>> On Mon, Nov 10 2008, Jeff Moyer wrote:
> >> > > >>>>> Jens Axboe <jens.axboe@oracle.com> writes:
> >> > > >>>>>
> >> > > >>>>>> http://bugzilla.kernel.org/attachment.cgi?id=18473&action=view
> >> > > >>>>> Funny, I was going to ask the same question. ;) The reason Jens wants
> >> > > >>>>> you to try this patch is that nfsd may be farming off the I/O requests
> >> > > >>>>> to different threads which are then performing interleaved I/O. The
> >> > > >>>>> above patch tries to detect this and allow cooperating processes to get
> >> > > >>>>> disk time instead of waiting for the idle timeout.
> >> > > >>>> Precisely :-)
> >> > > >>>>
> >> > > >>>> The only reason I haven't merged it yet is because of worry of extra
> >> > > >>>> cost, but I'll throw some SSD love at it and see how it turns out.
> >> > > >>>>
> >> > > >>> Sorry, but I get "oops" same moment nfs read transfer starts.
> >> > > >>> I can get directory list via nfs, read files locally (not
> >> > > >>> carefully tested, though)
> >> > > >>>
> >> > > >>> Dumps captured via netconsole, so these may not be completely accurate
> >> > > >>> but hopefully will give a hint.
> >> > > >>
> >> > > >> Interesting, strange how that hasn't triggered here. Or perhaps the
> >> > > >> version that Jeff posted isn't the one I tried. Anyway, search for:
> >> > > >>
> >> > > >> RB_CLEAR_NODE(&cfqq->rb_node);
> >> > > >>
> >> > > >> and add a
> >> > > >>
> >> > > >> RB_CLEAR_NODE(&cfqq->prio_node);
> >> > > >>
> >> > > >> just below that. It's in cfq_find_alloc_queue(). I think that should fix
> >> > > >> it.
> >> > > >>
> >> > > >
> >> > > > Same problem.
> >> > > >
> >> > > > I did make clean; make -j3; sync; on (2 times) patched kernel and it went OK
> >> > > > but It won't boot anymore with cfq with same error...
> >> > > >
> >> > > > Switching cfq io scheduler at runtime (booting with "as") appears to work with
> >> > > > two parallel local dd reads.
> >> > >
> >> > > Strange, I can't reproduce a failure. I'll keep trying. For now, these
> >> > > are the results I see:
> >> > >
> >> > > [root@maiden ~]# mount megadeth:/export/cciss /mnt/megadeth/
> >> > > [root@maiden ~]# dd if=/mnt/megadeth/file1 of=/dev/null bs=1M
> >> > > 1024+0 records in
> >> > > 1024+0 records out
> >> > > 1073741824 bytes (1.1 GB) copied, 26.8128 s, 40.0 MB/s
> >> > > [root@maiden ~]# umount /mnt/megadeth/
> >> > > [root@maiden ~]# mount megadeth:/export/cciss /mnt/megadeth/
> >> > > [root@maiden ~]# dd if=/mnt/megadeth/file1 of=/dev/null bs=1M
> >> > > 1024+0 records in
> >> > > 1024+0 records out
> >> > > 1073741824 bytes (1.1 GB) copied, 23.7025 s, 45.3 MB/s
> >> > > [root@maiden ~]# umount /mnt/megadeth/
> >> > >
> >> > > Here is the patch, with the suggestion from Jens to switch the cfqq to
> >> > > the right priority tree when the priority is changed.
> >> >
> >> > I don't see the issue here either. Vitaly, are you using any openvz
> >> > kernel patches? IIRC, they patch cfq so it could just be that your cfq
> >> > version is incompatible with Jeff's patch.
> >>
> >> Heh, got it to trigger about 3 seconds after sending that email! I'll
> >> look more into it.
> >
> > OK, found the issue. A few bugs there... cfq_prio_tree_lookup() doesn't
> > even return a hit, since it just breaks and returns NULL always. That
> > can cause cfq_prio_tree_add() to screw up the rbtree. The code to
> > correct on ioprio change wasn't correct either, I changed that as well.
> > New patch below, Vitaly can you give it a spin?
>
> Thanks for doing that! Yeah, that was a stupid bug with the lookup
> routine. I don't know that I agree with you that the ioprio change code
> was wrong. I looked at all of the callers and that seemed the code path
> that was used for I/O priority *changes*. The initial creation was
> already okay, wasn't it?
You only did it in cfq_prio_boost(); you should go one level down and do
it for all prio changes. cfq_init_prio_data() gets called to fix the
state up lazily when it notices a prio change, either due to a prio
boost or because someone ran ionice.
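
As a rough illustration of that lazy fix-up (a toy model only, not the
actual cfq code): every toy_* name below is hypothetical, and plain linked
lists stand in for the per-priority rbtrees. The point it tries to show is
that anyone changing the priority just records the request and sets a flag,
and the single fix-up routine re-files the queue for every kind of change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PRIO 8	/* assumed number of priority levels for this sketch */

/* Hypothetical stand-in for a cfq queue; not the kernel's struct. */
struct toy_queue {
	int ioprio;		/* priority the queue is currently filed under */
	int new_ioprio;		/* priority most recently requested */
	bool prio_changed;	/* set by whoever changes the priority */
	struct toy_queue *next;	/* link in the per-priority list */
};

/* One singly linked list per priority level. */
struct toy_sched {
	struct toy_queue *prio_list[NR_PRIO];
};

/* File the queue under its current ioprio. */
void toy_link(struct toy_sched *s, struct toy_queue *q)
{
	q->next = s->prio_list[q->ioprio];
	s->prio_list[q->ioprio] = q;
}

/* Remove the queue from the list for its current ioprio, if present. */
void toy_unlink(struct toy_sched *s, struct toy_queue *q)
{
	struct toy_queue **pp = &s->prio_list[q->ioprio];

	while (*pp && *pp != q)
		pp = &(*pp)->next;
	if (*pp) {
		*pp = q->next;
		q->next = NULL;
	}
}

/*
 * Both a boost and an ionice-style change would go through here: record
 * the request and mark the queue, but do not touch the lists yet.
 */
void toy_request_prio(struct toy_queue *q, int prio)
{
	assert(prio >= 0 && prio < NR_PRIO);
	q->new_ioprio = prio;
	q->prio_changed = true;
}

/*
 * Lazy fix-up: called before the queue is next used. Because the
 * re-filing lives here, it covers every source of priority change.
 */
void toy_init_prio_data(struct toy_sched *s, struct toy_queue *q)
{
	if (!q->prio_changed)
		return;

	toy_unlink(s, q);		/* leave the old priority list */
	q->ioprio = q->new_ioprio;
	toy_link(s, q);			/* join the new priority list */
	q->prio_changed = false;
}

/* Helper for inspection: is q on the list for the given priority? */
bool toy_is_filed_under(const struct toy_sched *s,
			const struct toy_queue *q, int prio)
{
	const struct toy_queue *it;

	for (it = s->prio_list[prio]; it; it = it->next)
		if (it == q)
			return true;
	return false;
}
```

In this sketch a queue stays on its old list after toy_request_prio() and
only moves once toy_init_prio_data() runs, which is the "lazily" part.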
--
Jens Axboe