From: Ying Han <yinghan@google.com>
To: Balbir Singh <balbir@linux.vnet.ibm.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mel@csn.ul.ie>, Johannes Weiner <hannes@cmpxchg.org>,
Christoph Lameter <cl@linux.com>,
Wu Fengguang <fengguang.wu@intel.com>,
Andi Kleen <ak@linux.intel.com>, Hugh Dickins <hughd@google.com>,
Rik van Riel <riel@redhat.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Tejun Heo <tj@kernel.org>
Cc: linux-mm@kvack.org
Subject: [PATCH 0/5] memcg: per cgroup background reclaim
Date: Thu, 13 Jan 2011 14:00:30 -0800
Message-ID: <1294956035-12081-1-git-send-email-yinghan@google.com>
The current memcg implementation supports only direct reclaim; this
patchset adds support for background reclaim. Per-cgroup background
reclaim is needed because it spreads memory pressure over a longer period
of time and smooths out system performance.
I ran some simple tests that read and write a large file and confirmed that
they trigger the per-cgroup kswapd at the low_wmark. I compared the
pg_steal/pg_scan ratio with and without background reclaim, and also
measured the running time.
Step1: Create a cgroup with 500M memory_limit.
$ mount -t cgroup -o cpuset,memory cpuset /dev/cgroup
$ mkdir /dev/cgroup/A
$ echo 0 >/dev/cgroup/A/cpuset.cpus
$ echo 0 >/dev/cgroup/A/cpuset.mems
$ echo 500m >/dev/cgroup/A/memory.limit_in_bytes
$ echo $$ >/dev/cgroup/A/tasks
Step2: Check the wmarks.
$ cat /dev/cgroup/A/memory.reclaim_wmarks
low_wmark 3663360
high_wmark 4396032
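For anyone scripting this check, here is a minimal sketch that parses the
two-line output shown above and reports the gap between the watermarks. The
helper name parse_wmarks is hypothetical, and the sample text is simply the
output copied from Step2; the sketch makes no assumption about the unit of
the values.

```python
def parse_wmarks(text):
    """Parse 'name value' lines as printed by memory.reclaim_wmarks."""
    marks = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            marks[parts[0]] = int(parts[1])
    return marks


# Output copied verbatim from Step2 above.
sample = """low_wmark 3663360
high_wmark 4396032"""

marks = parse_wmarks(sample)
gap = marks["high_wmark"] - marks["low_wmark"]
ratio = marks["high_wmark"] / marks["low_wmark"]

print(f"gap between wmarks: {gap}")
print(f"high/low ratio:     {ratio:.3f}")
```

On a live system the sample string would be replaced by the contents of
/dev/cgroup/A/memory.reclaim_wmarks.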
Step3: Dirty the pages by creating a 20g file on the hard drive.
$ ddtest -D /export/hdc3/dd -b 1024 -n 20971520 -t 1
I checked memory.stat with and without background reclaim. Previously all
pages were reclaimed by direct reclaim; now most pages are reclaimed in the
background. (Note: writing '0' to min_free_kbytes disables the per-cgroup
kswapd.)
Only direct reclaim:            With background reclaim:
pgpgin         5184374          pgpgin         5248437
pgpgout        5056385          pgpgout        5121659
kswapd_steal   0                kswapd_steal   5121516
pg_pgsteal     5056363          pg_pgsteal     32
kswapd_pgscan  0                kswapd_pgscan  5121569
pg_scan        5056416          pg_scan        32
pgrefill       297632           pgrefill       312512
pgoutrun       0                pgoutrun       107525
allocstall     158009           allocstall     1

real  21m6.864s                 real  24m56.735s
user  0m2.047s                  user  0m2.331s
sys   6m2.572s                  sys   7m29.048s
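The steal/scan ratio and the share of pages reclaimed in the background can
be derived from the counters above. The following is a rough sketch rather
than part of the patchset: the helper reclaim_summary is hypothetical, and
it assumes that the kswapd_* and pg_* counters are disjoint so that their
sums give the totals.

```python
def reclaim_summary(stats):
    """Summarize reclaim counters taken from memory.stat.

    Assumes kswapd_* counts background reclaim and pg_* counts direct
    reclaim, with no overlap between the two.
    """
    stolen = stats["kswapd_steal"] + stats["pg_pgsteal"]
    scanned = stats["kswapd_pgscan"] + stats["pg_scan"]
    return {
        "steal_scan_ratio": stolen / scanned,
        "background_share": stats["kswapd_steal"] / stolen,
    }


# Step3 numbers copied from the table above.
direct_only = {
    "kswapd_steal": 0, "pg_pgsteal": 5056363,
    "kswapd_pgscan": 0, "pg_scan": 5056416,
}
with_background = {
    "kswapd_steal": 5121516, "pg_pgsteal": 32,
    "kswapd_pgscan": 5121569, "pg_scan": 32,
}

for name, stats in (("direct only", direct_only),
                    ("with background", with_background)):
    summary = reclaim_summary(stats)
    print(f"{name}: steal/scan={summary['steal_scan_ratio']:.6f} "
          f"background share={summary['background_share']:.6f}")
```

In both runs nearly every scanned page is also stolen, so the difference
lies in who does the reclaim rather than in how efficient it is.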
Step4: Cleanup
$ echo $$ >/dev/cgroup/tasks
$ echo 1 > /dev/cgroup/A/memory.force_empty
Step5: Read the 20g file into the pagecache.
$ cat /export/hdc3/dd/tf0 > /dev/zero;
I checked memory.stat with and without background reclaim. Most of the
clean pages are reclaimed in the background instead of by direct reclaim.
Only direct reclaim:            With background reclaim:
pgpgin         5184066          pgpgin         5184081
pgpgout        5056093          pgpgout        5057185
kswapd_steal   0                kswapd_steal   4960805
pg_pgsteal     5056063          pg_pgsteal     96348
kswapd_pgscan  0                kswapd_pgscan  4960809
pg_scan        5056064          pg_scan        96348
pgrefill       0                pgrefill       0
pgoutrun       0                pgoutrun       54904
allocstall     158001           allocstall     3010

real  3m13.034s                 real  3m13.074s
user  0m0.221s                  user  0m0.158s
sys   0m22.793s                 sys   0m38.603s
TODO:
1. Keep debugging the crash issue on NUMA machines.
2. Generate more test cases and look into reducing lock contention.
Ying Han (5):
  Add kswapd descriptor.
  Add per cgroup reclaim watermarks.
  New APIs to adjust per cgroup wmarks.
  Per cgroup background reclaim.
  Add more per memcg stats.
 Documentation/cgroups/memory.txt |   14 ++
 include/linux/memcontrol.h       |  102 +++++++++
 include/linux/mmzone.h           |    3 +-
 include/linux/res_counter.h      |   83 ++++++++
 include/linux/swap.h             |   12 +-
 kernel/res_counter.c             |    6 +
 mm/memcontrol.c                  |  390 +++++++++++++++++++++++++++++++++++-
 mm/page_alloc.c                  |    1 -
 mm/vmscan.c                      |  418 ++++++++++++++++++++++++++++++++++----
 9 files changed, 979 insertions(+), 50 deletions(-)
--
1.7.3.1