From: Vladislav Bolkhovitin <vst@vlnb.net>
To: Wu Fengguang <wfg@linux.intel.com>
Cc: Jens Axboe <jens.axboe@oracle.com>,
Jeff Moyer <jmoyer@redhat.com>,
"Vitaly V. Bursov" <vitalyb@telenet.dn.ua>,
linux-kernel@vger.kernel.org
Subject: Re: Slow file transfer speeds with CFQ IO scheduler in some cases
Date: Tue, 25 Nov 2008 14:02:45 +0300
Message-ID: <492BDB55.3050407@vlnb.net>
In-Reply-To: <1226626590.681364.9398@de>
Wu Fengguang wrote:
> On Thu, Nov 13, 2008 at 09:54:39AM +0100, Jens Axboe wrote:
>> On Thu, Nov 13 2008, Wu Fengguang wrote:
>>> Hi all,
>>>
>>> //Sorry for being late.
>>>
>>> On Wed, Nov 12, 2008 at 08:02:28PM +0100, Jens Axboe wrote:
>>> [...]
>>>> I already talked about this with Jeff on irc, but I guess I should post it
>>>> here as well.
>>>>
>>>> nfsd aside (which does seem to have some different behaviour skewing the
>>>> results), the original patch came about because dump(8) has a really
>>>> stupid design that offloads IO to a number of processes. This basically
>>>> makes fairly sequential IO more random with CFQ, since each process gets
>>>> its own io context. My feeling is that we should fix dump instead of
>>>> introducing a fair bit of complexity (and slowdown) in CFQ. I'm not
>>>> aware of any other good programs out there that would do something
>>>> similar, so I don't think there's a lot of merit to spending cycles on
>>>> detecting cooperating processes.
>>>>
>>>> Jeff will take a look at fixing dump instead, and I may have promised
>>>> him that santa will bring him something nice this year if he does (since
>>>> I'm sure it'll be painful on the eyes).
>>> This could also be fixed at the VFS readahead level.
>>>
>>> In fact I've seen many kinds of interleaved accesses:
>>> - concurrently reading 40 files that are in fact hard links of one single file
>>> - a backup tool that splits a big file into 8k chunks, and serve the
>>> {1, 3, 5, 7, ...} chunks in one process and the {0, 2, 4, 6, ...}
>>> chunks in another one
>>> - a pool of NFSDs randomly serving some originally sequential read requests
>>> - now dump(8) seems to have some similar problem.
>>>
>>> In summary there have been all kinds of efforts on trying to
>>> parallelize I/O tasks, but unfortunately they can easily screw up the
>>> sequential pattern. It may not be easily fixable for many of them.
>>>
>>> It is however possible to detect most of these patterns at the
>>> readahead layer and restore sequential I/Os, before they propagate
>>> into the block layer and hurt performance.
>>>
>>> Vitaly, if that's what you need, I can try to prepare a patch for
>>> testing out.
>> It's not easy. To really fix it, you have to get that sequential RA
>> pattern from just the single process. As soon as you spread the IO
>> between processes (eg N-1 aren't just getting cache hits), then you may
>> run into trouble on the IO scheduler side.
>
> Yes, it's not easy (or possible) to tell from file->f_ra all those
> cooperative processes working on the same sequential stream, since
> they will have different file->f_ra instances. In the case of NFSD,
> the file->f_ra may well be all zeros.
>
> Another scheme is to detect the sequential pattern via looking up
> the page cache, which provides one single and consistent view of the
> pages recently accessed. That makes sequential detection possible.
>
> The cost will be one extra page cache lookup per random read.
> If it's not acceptable, the corresponding code could be disabled
> by default.
I think this is the best and simplest way to go. In most cases the data
found in the cache is later copied to user space anyway, so one more
page cache lookup should be negligible by comparison.
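
To illustrate the difference between the two detection schemes discussed
above, here is a minimal userspace sketch. It is a toy model, not kernel
code. The names PerReaderDetector, PageCacheDetector and simulate are
hypothetical, and the "page cache" is just a set of page indexes. It
assumes readers that split one sequential stream round-robin, as in the
backup tool example, and that a page counts as "recently accessed" once
it has been read.

```python
class PerReaderDetector:
    """Sequential detection from private per-reader state only
    (loosely analogous to each process having its own file->f_ra)."""

    def __init__(self):
        self.prev_index = None

    def is_sequential(self, index):
        seq = self.prev_index is not None and index == self.prev_index + 1
        self.prev_index = index
        return seq


class PageCacheDetector:
    """Sequential detection from one shared view of cached pages."""

    def __init__(self):
        self.cached = set()  # toy stand-in for the page cache

    def is_sequential(self, index):
        # The one extra lookup: is the preceding page already cached?
        seq = (index - 1) in self.cached
        self.cached.add(index)
        return seq


def simulate(num_pages, num_readers):
    """Readers take pages round-robin; return how many reads each
    scheme classifies as sequential: (per_reader, shared)."""
    per_reader = [PerReaderDetector() for _ in range(num_readers)]
    shared = PageCacheDetector()
    per_reader_hits = 0
    shared_hits = 0
    for index in range(num_pages):
        reader = index % num_readers
        if per_reader[reader].is_sequential(index):
            per_reader_hits += 1
        if shared.is_sequential(index):
            shared_hits += 1
    return per_reader_hits, shared_hits


if __name__ == "__main__":
    print(simulate(16, 1))
    print(simulate(16, 2))
```

With a single reader both schemes recognise the stream. With two
interleaved readers each private state only ever sees a stride of two, so
the per-reader scheme finds nothing, while the shared lookup still sees
the stream as sequential.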
> Thanks,
> Fengguang
>