From: Andi Kleen <andi@firstfloor.org>
To: kosaki.motohiro@jp.fujitsu.com, hugh.dickins@tiscali.co.uk,
nishimura@mxp.nes.nec.co.jp, balbir@linux.vnet.ibm.com,
kamezawa.hiroyu@jp.fujitsu.com, lizf@cn.fujitsu.com,
menage@google.com, npiggin@suse.de, andi@firstfloor.org,
fengguang.wu@intel.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: [PATCH] [23/31] HWPOISON: add memory cgroup filter
Date: Tue, 8 Dec 2009 22:16:39 +0100 (CET)
Message-ID: <20091208211639.8499FB151F@basil.firstfloor.org>
In-Reply-To: <200912081016.198135742@firstfloor.org>

The hwpoison test suite needs to inject hwpoison into a collection of
selected task pages, and must not touch pages not owned by those tasks,
which could kill important system processes such as init. (It is OK,
however, to mis-hwpoison free/unowned pages as well as shared clean
pages. Mis-hwpoisoning shared dirty pages will kill all sharing tasks,
so the test suite will target either all or none of such tasks in the
first place.)

The memory cgroup serves this purpose well. We can put the target
processes under the control of a memory cgroup, and tell the hwpoison
injection code to only kill pages associated with some active memory
cgroup.

The prerequisite for doing hwpoison stress tests with mem_cgroup is
that the mem_cgroup code tracks task pages _accurately_ (unless the
page is locked), which we believe is/should be true.

The benefit is a simpler hwpoison injector. In addition, the
mem_cgroup code will automatically be exercised by the hwpoison test
cases.

The alternative pin-pfn/unpin-pfn interfaces can also delegate the
(process and page flags) filtering functions reliably to user space.
However, a prototype implementation showed that this scheme adds more
complexity than we wanted.

Example test case:

	mkdir /cgroup/hwpoison

	usemem -m 100 -s 1000 &
	echo `jobs -p` > /cgroup/hwpoison/tasks

	memcg_ino=$(ls -id /cgroup/hwpoison | cut -f1 -d' ')
	echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg

	page-types -p `pidof init` --hwpoison  # shall do nothing
	page-types -p `pidof usemem` --hwpoison # poison its pages

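The inode lookup and filter setup in the example above could be wrapped in small helpers for a test script. This is only a sketch and not part of the patch: dir_ino, set_memcg_filter and clear_memcg_filter are hypothetical names, it assumes GNU coreutils stat, and the default filter path is simply the debugfs mount point used in the example. Clearing relies on the fact that hwpoison_filter_task() skips filtering when hwpoison_filter_memcg is zero.

```shell
#!/bin/sh
# Hypothetical helpers for test scripts; not part of the patch.

# Print the inode number of a directory, i.e. the same value that
# `ls -id DIR | cut -f1 -d' '` yields in the example above.
dir_ino() {
	stat -c %i "$1"		# assumes GNU coreutils stat
}

# Write the inode number of memcg directory $1 to the filter file $2.
# $2 defaults to the debugfs path used in the example (an assumption
# about where debugfs is mounted).
set_memcg_filter() {
	memcg_filter=${2:-/debug/hwpoison/corrupt-filter-memcg}
	memcg_ino=$(dir_ino "$1") || return 1
	echo "$memcg_ino" > "$memcg_filter"
}

# Writing 0 turns the memcg filter off again: the kernel side returns
# early when hwpoison_filter_memcg is zero.
clear_memcg_filter() {
	memcg_filter=${1:-/debug/hwpoison/corrupt-filter-memcg}
	echo 0 > "$memcg_filter"
}
```

With these, the two lines that compute and write memcg_ino become `set_memcg_filter /cgroup/hwpoison`, and a test can call `clear_memcg_filter` when it is done.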
AK: Fix documentation
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Paul Menage <menage@google.com>
CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
Documentation/vm/hwpoison.txt | 16 ++++++++++++++++
mm/hwpoison-inject.c | 7 +++++++
mm/internal.h | 1 +
mm/memory-failure.c | 42 ++++++++++++++++++++++++++++++++++++++++++
4 files changed, 66 insertions(+)
Index: linux/mm/memory-failure.c
===================================================================
--- linux.orig/mm/memory-failure.c
+++ linux/mm/memory-failure.c
@@ -100,6 +100,45 @@ static int hwpoison_filter_flags(struct
return -EINVAL;
}
+/*
+ * This allows stress tests to limit test scope to a collection of tasks
+ * by putting them under some memcg. This prevents killing unrelated/important
+ * processes such as /sbin/init. Note that the target task may share clean
+ * pages with init (e.g. libc text), which is harmless. If the target task
+ * shares _dirty_ pages with another task B, the test scheme must make sure B
+ * is also included in the memcg. Finally, due to race conditions this filter
+ * can only guarantee that the page either belongs to the memcg tasks, or is
+ * a freed page.
+ */
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
+u64 hwpoison_filter_memcg;
+EXPORT_SYMBOL_GPL(hwpoison_filter_memcg);
+static int hwpoison_filter_task(struct page *p)
+{
+ struct mem_cgroup *mem;
+ struct cgroup_subsys_state *css;
+ unsigned long ino;
+
+ if (!hwpoison_filter_memcg)
+ return 0;
+
+ mem = try_get_mem_cgroup_from_page(p);
+ if (!mem)
+ return -EINVAL;
+
+ css = mem_cgroup_css(mem);
+ ino = css->cgroup->dentry->d_inode->i_ino;
+ css_put(css);
+
+ if (ino != hwpoison_filter_memcg)
+ return -EINVAL;
+
+ return 0;
+}
+#else
+static int hwpoison_filter_task(struct page *p) { return 0; }
+#endif
+
int hwpoison_filter(struct page *p)
{
if (hwpoison_filter_dev(p))
@@ -108,6 +147,9 @@ int hwpoison_filter(struct page *p)
if (hwpoison_filter_flags(p))
return -EINVAL;
+ if (hwpoison_filter_task(p))
+ return -EINVAL;
+
return 0;
}
EXPORT_SYMBOL_GPL(hwpoison_filter);
Index: linux/mm/internal.h
===================================================================
--- linux.orig/mm/internal.h
+++ linux/mm/internal.h
@@ -270,3 +270,4 @@ extern u32 hwpoison_filter_dev_major;
extern u32 hwpoison_filter_dev_minor;
extern u64 hwpoison_filter_flags_mask;
extern u64 hwpoison_filter_flags_value;
+extern u64 hwpoison_filter_memcg;
Index: linux/mm/hwpoison-inject.c
===================================================================
--- linux.orig/mm/hwpoison-inject.c
+++ linux/mm/hwpoison-inject.c
@@ -112,6 +112,13 @@ static int pfn_inject_init(void)
if (!dentry)
goto fail;
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
+ dentry = debugfs_create_u64("corrupt-filter-memcg", 0600,
+ hwpoison_dir, &hwpoison_filter_memcg);
+ if (!dentry)
+ goto fail;
+#endif
+
return 0;
fail:
pfn_inject_exit();
Index: linux/Documentation/vm/hwpoison.txt
===================================================================
--- linux.orig/Documentation/vm/hwpoison.txt
+++ linux/Documentation/vm/hwpoison.txt
@@ -123,6 +123,22 @@ Only handle memory failures to pages ass
by block device major/minor. -1U is the wildcard value.
This should be only used for testing with artificial injection.
+corrupt-filter-memcg
+
+Limit injection to pages owned by a memory cgroup (memcg), specified by the
+inode number of the memcg directory.
+
+Example:
+ mkdir /cgroup/hwpoison
+
+ usemem -m 100 -s 1000 &
+ echo `jobs -p` > /cgroup/hwpoison/tasks
+
+ memcg_ino=$(ls -id /cgroup/hwpoison | cut -f1 -d' ')
+ echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg
+
+ page-types -p `pidof init` --hwpoison # shall do nothing
+ page-types -p `pidof usemem` --hwpoison # poison its pages
corrupt-filter-flags-mask
corrupt-filter-flags-value