From: Wanpeng Li <liwanp@linux.vnet.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>,
Fengguang Wu <fengguang.wu@intel.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Tony Luck <tony.luck@intel.com>,
gong.chen@linux.intel.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
Wanpeng Li <liwanp@linux.vnet.ibm.com>
Subject: [PATCH v4 3/10] mm/hwpoison: fix race against poison thp
Date: Mon, 26 Aug 2013 16:46:07 +0800
Message-ID: <1377506774-5377-3-git-send-email-liwanp@linux.vnet.ibm.com>
In-Reply-To: <1377506774-5377-1-git-send-email-liwanp@linux.vnet.ibm.com>
v1 -> v2:
* unpoison thp fail
There is a race between poisoning and unpoisoning a page. memory_failure()
sets PageHWPoison and increases num_poisoned_pages without holding the
page lock, and for a thp it accounts only one page in num_poisoned_pages.
However, unpoison_memory() can run before memory_failure() takes the page
lock and splits the transparent hugepage. In that case unpoison decreases
num_poisoned_pages by 1 << compound_order, since the transparent hugepage
has not yet been split. That means we account one page for hwpoison and
1 << compound_order pages for unpoison. This patch fixes it by inserting a
PageTransHuge check before TestClearPageHWPoison, so that unpoison fails
without clearing PageHWPoison or decreasing num_poisoned_pages.

        A                                   B
    memory_failure
    TestSetPageHWPoison(p);
    if (PageHuge(p))
        nr_pages = 1 << compound_order(hpage);
    else
        nr_pages = 1;
    atomic_long_add(nr_pages, &num_poisoned_pages);
                                        unpoison_memory
                                        nr_pages = 1 << compound_trans_order(page);
                                        if (TestClearPageHWPoison(p))
                                            atomic_long_sub(nr_pages, &num_poisoned_pages);
    lock page
    if (!PageHWPoison(p))
        unlock page and return
    hwpoison_user_mappings
    if (PageTransHuge(hpage))
        split_huge_page(hpage);

Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
---
mm/memory-failure.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5a4f4d6..a6c4752 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1339,6 +1339,16 @@ int unpoison_memory(unsigned long pfn)
 		return 0;
 	}
 
+	/*
+	 * unpoison_memory() can encounter thp only when the thp is being
+	 * worked by memory_failure() and the page lock is not held yet.
+	 * In such case, we yield to memory_failure() and make unpoison fail.
+	 */
+	if (PageTransHuge(page)) {
+		pr_info("MCE: Memory failure is now running on %#lx\n", pfn);
+		return 0;
+	}
+
 	nr_pages = 1 << compound_order(page);
 
 	if (!get_page_unless_zero(page)) {
--
1.8.1.2