From: Andrew Morton <akpm@linux-foundation.org>
To: Jos Houtman <jos@hyves.nl>
Cc: <linux-kernel@vger.kernel.org>, Jens Axboe <jens.axboe@oracle.com>
Subject: Re: Page Cache writeback too slow, SSD/noop scheduler/ext2
Date: Sat, 21 Mar 2009 03:53:15 -0700
Message-ID: <20090321035315.fc10cef6.akpm@linux-foundation.org>
In-Reply-To: <C5E99E4E.C0E0%jos@hyves.nl>
On Fri, 20 Mar 2009 19:26:06 +0100 Jos Houtman <jos@hyves.nl> wrote:
> Hi,
>
> We have hit a problem where the page-cache writeback algorithm is not
> keeping up.
> When memory gets low this will result in very irregular performance drops.
>
> Our setup is as follows:
> 30 x Quad core machine with 64GB ram.
> These are single purpose machines running MySQL.
> Kernel version: 2.6.28.7
> A dedicated SSD drive for the ext2 database partition
> Noop scheduler for the ssd drive.
>
>
> The current hypothesis is as follows:
> The wk_update function does not write enough dirty pages, which allows the
> number of dirty pages to grow to the dirty_background limit.
> When memory is low, __background_writeout() comes around and "forcefully"
> writes dirty pages to disk.
> This forced write fills the disk queue and starves read calls that MySQL is
> trying to do: basically killing performance for a few seconds.
> This pattern repeats as soon as the cleared memory is filled again.
>
> Decreasing dirty_writeback_centisecs to 100 doesn't help.
>
> I don't know why this is, but I did some preliminary tracing using systemtap
> and it seems that the majority of wk_update calls decide to do
> nothing.
>
> Doubling /sys/block/sdb/queue/nr_requests to 256 seems to help a bit: the
> number of dirty pages (nr_dirty) increases more slowly.
> But I am unsure of side-effects and am afraid of increasing the starvation
> problem for mysql.
>
>
> I am very much willing to work on this issue and see it fixed, but would
> like to tap into the knowledge of people here.
> So:
> * Have more people seen this or similar issues?
> * Is the hypothesis above a viable one?
> * Suggestions/pointers for further research and statistics I should measure
> to improve the understanding of this problem.
>
I don't think that noop-iosched tries to do anything to prevent
writes-starve-reads. Do you get better behaviour from any of the other IO
schedulers?
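
To compare schedulers as suggested above, it helps to confirm which one is
actually active on the device. The following is a minimal sketch, not part of
the original exchange: it assumes sysfs is mounted at /sys and that the
queue/scheduler file uses the usual format in which the active scheduler is
shown in square brackets. The helper names parse_scheduler_line and
read_scheduler are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def parse_scheduler_line(line: str) -> tuple[str, list[str]]:
    """Split the contents of a queue/scheduler file into (active, available).

    The kernel lists every compiled-in scheduler on one line and marks the
    active one with square brackets, e.g. "noop anticipatory deadline [cfq]".
    If no token is bracketed, the active name is returned as an empty string.
    """
    active = ""
    available = []
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets around the active entry
            active = token
        available.append(token)
    return active, available


def read_scheduler(device: str) -> tuple[str, list[str]]:
    """Read the scheduler list for a block device name such as "sdb".

    Assumption: sysfs is mounted at /sys and the device exists.
    """
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())
```

On the reporter's machines, read_scheduler("sdb") would be expected to report
noop as the active entry. Switching schedulers for a test run is normally done
by writing one of the listed names to the same sysfs file as root; whether that
changes the read-starvation behaviour is exactly what the question above asks.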
Thread overview: 16+ messages
[not found] <C5E989F1.C0C7%jos@hyves.nl>
2009-03-20 18:26 ` Page Cache writeback too slow, SSD/noop scheduler/ext2 Jos Houtman
2009-03-21 10:53 ` Andrew Morton [this message]
[not found] <C5EB16F4.C318%jos@hyves.nl>
2009-03-22 16:53 ` Jos Houtman
2009-03-24 14:48 ` Nick Piggin
2009-03-25 5:26 ` Wu Fengguang
2009-03-27 16:59 ` Jos Houtman
2009-03-29 2:32 ` Wu Fengguang
2009-03-30 16:47 ` Jos Houtman
2009-03-31 0:28 ` Wu Fengguang
2009-03-31 12:16 ` Jos Houtman
2009-03-31 12:31 ` Wu Fengguang
2009-03-31 14:10 ` Jos Houtman