From: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
To: linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, miklos@szeredi.hu,
yumiko.sugita.yf@hitachi.com, masami.hiramatsu.pt@hitachi.com,
hidehiro.kawai.ez@hitachi.com, yuji.kakutani.uw@hitachi.com,
soshima@redhat.com, haoki@redhat.com
Subject: [RFC][PATCH 0/3] VM throttling: avoid blocking occasional writers
Date: Fri, 23 Feb 2007 21:03:37 +0900
Message-ID: <45DED819.9040404@hitachi.com>
Hi,
I have observed that write(2) can block for a long time when a system
has several disks and is under heavy I/O pressure. This patchset is
intended to avoid that problem.
Example of the problem:
There are two processes on a system which has two disks. Process-A
writes heavily to disk-a, and process-B occasionally writes small
amounts of data (e.g. log files) to disk-b. A portion of system memory,
whose size depends on vm.dirty_ratio (typically 40%), fills up with
Dirty and Writeback pages of disk-a.
In this situation, write(2) by process-B can be blocked for a very
long time (more than 60 seconds), although the load on disk-b is quite
low. In particular, the system becomes quite slow if disk-a is
slow (e.g. a backup to a USB disk).
This seems to be the same problem as discussed in LKML:
http://marc.theaimsgroup.com/?t=115559902900003
and
http://marc.theaimsgroup.com/?t=117182340400003
Root cause:
I found that this problem is caused by balance_dirty_pages().
While Dirty+Writeback pages exceed 40% of memory, process-B is
blocked in balance_dirty_pages() until writeback of some (`write_chunk',
typically = 1536) dirty pages on disk-b is started.
However, because disk-b has only a few dirty pages, process-B stays
blocked until writeback to disk-a completes and Dirty+Writeback
drops below 40%.
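The blocking condition described above can be sketched as a small
user-space model. This is only an illustration under simplifying
assumptions, not the kernel code: struct vm_model, dirty_thresh() and
writer_stays_blocked() are hypothetical names, and the real loop also
accounts for pages written across iterations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the behaviour described above.
 * None of these names come from the kernel source. */
struct vm_model {
	unsigned long total_pages;     /* pages of system memory */
	unsigned long dirty_writeback; /* system-wide Dirty+Writeback pages */
	unsigned int dirty_ratio;      /* vm.dirty_ratio, in percent */
};

/* Page count corresponding to vm.dirty_ratio percent of memory. */
static unsigned long dirty_thresh(const struct vm_model *vm)
{
	assert(vm->dirty_ratio <= 100);
	return vm->total_pages / 100 * vm->dirty_ratio;
}

/*
 * A writer leaves the throttling loop when the system is at or below
 * the threshold, or when its own disk has at least write_chunk dirty
 * pages to start writeback on. Otherwise it keeps waiting.
 */
static bool writer_stays_blocked(const struct vm_model *vm,
				 unsigned long disk_dirty_pages,
				 unsigned long write_chunk)
{
	if (vm->dirty_writeback <= dirty_thresh(vm))
		return false;
	return disk_dirty_pages < write_chunk;
}
```

With 1,000,000 pages of memory, 450,000 Dirty+Writeback pages and
dirty_ratio = 40, a writer whose disk holds only 10 dirty pages stays
blocked in this model, while a writer whose disk holds most of the
dirty pages does not.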
Solution:
If a process cannot write `write_chunk' pages in balance_dirty_pages(),
I consider that all of the dirty pages for that disk have been written
back and that the disk is clean.
To keep dirty pages from using up free memory once blocking is bypassed,
this patchset adds a new threshold named vm.dirty_limit_ratio to sysctl.
It modifies balance_dirty_pages() not to block when the amount of
Dirty+Writeback is less than vm.dirty_limit_ratio percent of memory.
In all other cases, writers are throttled as in current Linux.
In this patchset, vm.dirty_limit_ratio, instead of vm.dirty_ratio, is
used as the clamping level of Dirty+Writeback, and vm.dirty_ratio is
used as the level at which a writer will itself start writeback of
dirty pages.
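The two-threshold decision can be sketched as follows. This is a
user-space model under simplifying assumptions, not the code in the
patches: struct vm_model2, pages_for_ratio() and
patched_writer_stays_blocked() are hypothetical names chosen for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the names below are not taken from the patches. */
struct vm_model2 {
	unsigned long total_pages;      /* pages of system memory */
	unsigned long dirty_writeback;  /* system-wide Dirty+Writeback pages */
	unsigned int dirty_ratio;       /* vm.dirty_ratio, in percent */
	unsigned int dirty_limit_ratio; /* vm.dirty_limit_ratio, in percent */
};

/* Convert a percentage of memory into a page count. */
static unsigned long pages_for_ratio(unsigned long total_pages,
				     unsigned int ratio)
{
	assert(ratio <= 100);
	return total_pages / 100 * ratio;
}

static bool patched_writer_stays_blocked(const struct vm_model2 *vm,
					 unsigned long disk_dirty_pages,
					 unsigned long write_chunk)
{
	unsigned long start = pages_for_ratio(vm->total_pages,
					      vm->dirty_ratio);
	unsigned long limit = pages_for_ratio(vm->total_pages,
					      vm->dirty_limit_ratio);

	/* Below vm.dirty_ratio: no throttling at all. */
	if (vm->dirty_writeback <= start)
		return false;
	/* The disk has enough dirty pages: throttled as before, the
	 * writer leaves once it has started writeback of write_chunk. */
	if (disk_dirty_pages >= write_chunk)
		return false;
	/* The disk is treated as clean: block only once the hard limit
	 * (vm.dirty_limit_ratio) has been reached. */
	return vm->dirty_writeback >= limit;
}
```

In this model an occasional writer to a nearly clean disk is released
while Dirty+Writeback lies between the two thresholds, and is blocked
again only when the total reaches vm.dirty_limit_ratio.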
Testing Results:
In the situation explained in the "Example of the problem" section, I
measured the time taken by write(2) to disk-b.
With this patchset applied, the write completed in 30 ms or less.
When nr_requests is set too high (e.g. 8192), Dirty+Writeback grows close
to vm.dirty_limit_ratio (45% of system memory by default). In that case,
write(2) sometimes took about 1 second.
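A measurement of this kind could be taken with a small helper like the
one below. It is a sketch and not the test program used for the numbers
above; timed_write_ms() is a hypothetical name, and it assumes a POSIX
system with clock_gettime() and CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>   /* open(), for callers of the helper */
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Time a single write(2) call and return the elapsed wall-clock time
 * in milliseconds, or -1.0 if the clock or the write fails.
 * Hypothetical helper, for illustration only.
 */
static double timed_write_ms(int fd, const void *buf, size_t len)
{
	struct timespec start, end;
	ssize_t n;

	assert(buf != NULL);
	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1.0;
	n = write(fd, buf, len);
	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0 || n < 0)
		return -1.0;
	return (end.tv_sec - start.tv_sec) * 1e3 +
	       (end.tv_nsec - start.tv_nsec) / 1e6;
}
```

A caller would open a file on disk-b (for example with O_WRONLY |
O_APPEND), write a short record through this helper while process-A
keeps disk-a busy, and record the returned time.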
This patchset can be applied to 2.6.20-mm2.
It consists of 3 pieces:
1/3 - add a sysctl variable `vm.dirty_limit_ratio'
2/3 - modify get_dirty_limits() to return the limit of dirty pages.
3/3 - break out of the balance_dirty_pages() loop if the disk has no
remaining dirty pages and Dirty+Writeback < vm.dirty_limit_ratio.
--
Tomoki Sekiyama
Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: tomoki.sekiyama.qu@hitachi.com