public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
To: akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Cc: yumiko.sugita.yf@hitachi.com, masami.hiramatsu.pt@hitachi.com,
	hidehiro.kawai.ez@hitachi.com, yuji.kakutani.uw@hitachi.com,
	soshima@redhat.com, haoki@redhat.com,
	kamezawa.hiroyu@jp.fujitsu.com, nikita@clusterfs.com,
	leroy.vanlogchem@wldelft.nl
Subject: [PATCH 0/3] VM throttling: avoid blocking occasional writers
Date: Wed, 14 Mar 2007 21:42:46 +0900	[thread overview]
Message-ID: <45F7EDC6.5090303@hitachi.com> (raw)

Hi,

I ported the patch sent before to 2.6.21-rc3-mm2, so I'm resending it.
( Previous patch is available at
  http://marc.info/?l=linux-kernel&m=117223267512340&w=2 )


-Summary:
I have observed a problem that write(2) can be blocked for a long time
if a system has several disks and is under heavy I/O pressure. This
patchset is to avoid the problem by introducing high/low water-mark
algorithm to balance_dirty_pages() function.

-Example of the probrem:

There are two processes on a system which has two disks. Process-A
writes heavily to disk-a, and process-B writes small data (e.g. log
files) to disk-b occasionally. A portion of system memory, which is
depends on vm.dirty_ratio (typically 40%), is filled up with Dirty
and Writeback pages of disk-a.

In this situation, write(2) of process-B could be blocked for a very
long time (more then 60 seconds), although the load of disk-b is quite
low. In particular, the system would become quite slow, if disk-a is
slow (e.g. backup to an USB disk).

This seems to be the same problem as discussed in LKML:
http://marc.theaimsgroup.com/?t=115559902900003
and
http://marc.theaimsgroup.com/?t=117182340400003

-Cause:

I found this problem is caused by the balance_dirty_pages().

While Dirty+Writeback pages get more than 40% of memory, process-B is
blocked in balance_dirty_pages() until writeback of some (`write_chunk',
typically = 1536) dirty pages on disk-b is started.

However, because disk-b has only a few dirty pages, the process-B will
be blocked until writeback to disk-a is completed and Dirty+Writeback
goes below 40%.

-Solution:

I consider that all of the dirty pages for the disk have been written
back and that the disk is clean if a process cannot write 'write_chunk'
pages in balance_dirty_pages().

To avoid using up the free memory with dirty pages by passing blocking,
this patchset adds a new threshold named vm.dirty_limit_ratio to sysctl.

It modifies balance_dirty_pages() not to block when the amount of
Dirty+Writeback is less than vm.dirty_limit_ratio percent of the memory.
In the other cases, writers are throttled as current Linux does.


In this patchset, vm.dirty_limit_ratio, instead of vm.dirty_ratio, is
used as the clamping level of Dirty+Writeback. And, vm.dirty_ratio is
used as the level at which a writers will itself start writeback of the
dirty pages.


-Testing Results:

In the situation explained in "Example of the problem" section, I
measured time of write(2)ing to disk-b.
The write was completed by 30ms or less under the kernel with this
patchset.

When nr_requests is set too high (e.g. 8192), Dirty+Writeback grows near
vm.dirty_limit_ratio(45% of system memory by defaults). In that case,
write(2) sometimes took about 1 second.


This patchset can be applied to 2.6.21-rc3-mm2.
It consists of 3 pieces:

1/3 - add a sysctl variable `vm.dirty_limit_ratio'
2/3 - modify get_dirty_limits() to return the limit of dirty pages.
3/3 - break out of balance_dirty_pages() loop if the disk doesn't have
      remaining dirty pages, if Dirty+Writeback < vm.dirty_limit_ratio.

-- 
Tomoki Sekiyama
Hitachi, Ltd., Systems Development Laboratory



             reply	other threads:[~2007-03-14 12:47 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-14 12:42 Tomoki Sekiyama [this message]
2007-03-14 13:18 ` [PATCH 0/3] VM throttling: avoid blocking occasional writers Peter Zijlstra
2007-03-15 19:07 ` Andrew Morton
2007-03-18 14:59   ` Bill Davidsen
2007-03-22  5:49     ` Tomoki Sekiyama
2007-03-22 11:41       ` Bill Davidsen
2007-03-26 10:27         ` Tomoki Sekiyama
2007-03-26 17:11           ` Bill Davidsen
2007-04-03 10:42             ` Tomoki Sekiyama
2007-04-03 10:46             ` [PATCH 1/2] VM throttling: Start writeback at dirty_writeback_start_ratio Tomoki Sekiyama
2007-04-06  0:31               ` Andrew Morton
2007-04-10  3:04                 ` Tomoki Sekiyama
2007-04-10  3:46                   ` Andrew Morton
2007-04-03 10:47             ` [PATCH 2/2] VM throttling: Add vm.dirty_start_writeback_ratio to sysctl Tomoki Sekiyama

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45F7EDC6.5090303@hitachi.com \
    --to=tomoki.sekiyama.qu@hitachi.com \
    --cc=akpm@linux-foundation.org \
    --cc=haoki@redhat.com \
    --cc=hidehiro.kawai.ez@hitachi.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=leroy.vanlogchem@wldelft.nl \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=nikita@clusterfs.com \
    --cc=soshima@redhat.com \
    --cc=yuji.kakutani.uw@hitachi.com \
    --cc=yumiko.sugita.yf@hitachi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox