From: Sasha Levin <sasha.levin@oracle.com>
To: stable@vger.kernel.org, stable-commits@vger.kernel.org
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Andi Kleen <andi@firstfloor.org>,
Dean Nelson <dnelson@redhat.com>, Tony Luck <tony.luck@intel.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Hugh Dickins <hughd@google.com>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Sasha Levin <sasha.levin@oracle.com>
Subject: [added to the 4.1 stable tree] mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
Date: Thu, 19 May 2016 00:19:28 -0400 [thread overview]
Message-ID: <1463631606-32540-29-git-send-email-sasha.levin@oracle.com> (raw)
In-Reply-To: <1463631606-32540-1-git-send-email-sasha.levin@oracle.com>
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
This patch has been added to the 4.1 stable tree. If you have any
objections, please let us know.
===============
[ Upstream commit f4c18e6f7b5bbb5b528b3334115806b0d76f50f9 ]
The race condition addressed in commit add05cecef80 ("mm: soft-offline:
don't free target page in successful page migration") was not closed
completely, because that can happen not only for soft-offline, but also
for hard-offline. Consider that a slab page is about to be freed into
buddy pool, and then an uncorrected memory error hits the page just
after entering __free_one_page(), then VM_BUG_ON_PAGE(page->flags &
PAGE_FLAGS_CHECK_AT_PREP) is triggered, despite the fact that it's not
necessary because the data on the affected page is not consumed.
To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time. I think it's justified because __PG_HWPOISON
flags is defined to prevent the page from being reused, and setting it
outside the page's alloc-free cycle is a designed behavior (not a bug.)
For recent months, I was annoyed about BUG_ON when soft-offlined page
remains on lru cache list for a while, which is avoided by calling
put_page() instead of putback_lru_page() in page migration's success
path. This means that this patch reverts a major change from commit
add05cecef80 about the new refcounting rule of soft-offlined pages, so
"reuse window" revives. This will be closed by a subsequent patch.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dean Nelson <dnelson@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
---
include/linux/page-flags.h | 10 +++++++---
mm/huge_memory.c | 7 +------
mm/migrate.c | 5 ++++-
mm/page_alloc.c | 4 ++++
4 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index f34e040..41c9384 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -631,15 +631,19 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
1 << PG_private | 1 << PG_private_2 | \
1 << PG_writeback | 1 << PG_reserved | \
1 << PG_slab | 1 << PG_swapcache | 1 << PG_active | \
- 1 << PG_unevictable | __PG_MLOCKED | __PG_HWPOISON | \
+ 1 << PG_unevictable | __PG_MLOCKED | \
__PG_COMPOUND_LOCK)
/*
* Flags checked when a page is prepped for return by the page allocator.
- * Pages being prepped should not have any flags set. It they are set,
+ * Pages being prepped should not have these flags set. It they are set,
* there has been a kernel bug or struct page corruption.
+ *
+ * __PG_HWPOISON is exceptional because it needs to be kept beyond page's
+ * alloc-free cycle to prevent from reusing the page.
*/
-#define PAGE_FLAGS_CHECK_AT_PREP ((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_CHECK_AT_PREP \
+ (((1 << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON)
#define PAGE_FLAGS_PRIVATE \
(1 << PG_private | 1 << PG_private_2)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b807a8f..52975eb 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1676,12 +1676,7 @@ static void __split_huge_page_refcount(struct page *page,
/* after clearing PageTail the gup refcount can be released */
smp_mb__after_atomic();
- /*
- * retain hwpoison flag of the poisoned tail page:
- * fix for the unsuitable process killed on Guest Machine(KVM)
- * by the memory-failure.
- */
- page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP | __PG_HWPOISON;
+ page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
page_tail->flags |= (page->flags &
((1L << PG_referenced) |
(1L << PG_swapbacked) |
diff --git a/mm/migrate.c b/mm/migrate.c
index 1d42561..fe71f91 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -950,7 +950,10 @@ out:
list_del(&page->lru);
dec_zone_page_state(page, NR_ISOLATED_ANON +
page_is_file_cache(page));
- if (reason != MR_MEMORY_FAILURE)
+ /* Soft-offlined page shouldn't go through lru cache list */
+ if (reason == MR_MEMORY_FAILURE)
+ put_page(page);
+ else
putback_lru_page(page);
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 872b2ac..5519230 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -962,6 +962,10 @@ static inline int check_new_page(struct page *page)
bad_reason = "non-NULL mapping";
if (unlikely(atomic_read(&page->_count) != 0))
bad_reason = "nonzero _count";
+ if (unlikely(page->flags & __PG_HWPOISON)) {
+ bad_reason = "HWPoisoned (hardware-corrupted)";
+ bad_flags = __PG_HWPOISON;
+ }
if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_PREP)) {
bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
--
2.5.0
next prev parent reply other threads:[~2016-05-19 4:21 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-19 4:19 [added to the 4.1 stable tree] Revert "usb: hub: do not clear BOS field during reset device" Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] stable: remove artifact created on backport Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] iwlwifi: pcie: lower the debug level for RSA semaphore access Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ASoC: rt5640: Correct the digital interface data select Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] regulator: s2mps11: Fix invalid selector mask and voltages for buck9 Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] libahci: save port map for forced port map Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ata: ahci-platform: Add ports-implemented DT bindings Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] iio: ak8975: Fix NULL pointer exception on early interrupt Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] efi: Fix out-of-bounds read in variable_matches() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] USB: serial: cp210x: add ID for Link ECU Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] USB: serial: cp210x: add Straizona Focusers device ids Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] [media] v4l2-dv-timings.h: fix polarity for 4k formats Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] MD: make bio mergeable Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Add dock support for ThinkPad X260 Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] workqueue: fix ghost PENDING flag while doing MQ IO Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/dp/mst: Restore primary hub guid on resume Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] cxl: Keep IRQ mappings on context teardown Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/radeon: fix vertical bars appear on monitor (v2) Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] IB/security: Restrict use of the write() interface Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm: vmscan: reclaim highmem zone if buffer_heads is over limit Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm: soft-offline: don't free target page in successful page migration Sasha Levin
2016-05-19 4:19 ` Sasha Levin [this message]
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2) Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] atomic_open(): fix the handling of create_error Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Drivers: hv: ring_buffer.c: fix comment style Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Drivers: hv_vmbus: Fix signal to host condition Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] powerpc: Fix bad inline asm constraint in create_zero_mask() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Minimal fix-up of bad hashing behavior of hash_64() Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] tracing: Don't display trigger file for events that can't be enabled Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/radeon: make sure vertical front porch is at least 1 Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] MAINTAINERS: Remove asterisk from EFI directory names Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ACPICA: Dispatcher: Update thread ID for recursive method calls Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] crypto: hash - Fix page length clamping in hash walk Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] x86/sysfb_efi: Fix valid BAR address range check Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] fs/pnode.c: treat zero mnt_group_id-s as unequal Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] propogate_mnt: Handle the first propogated copy being a slave Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/radeon: fix DP link training issue with second 4K monitor Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] mm, cma: prevent nr_isolated_* counters from going negative Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] get_rock_ridge_filename(): handle malformed NM entries Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Apply fix for white noise on Asus N550JV, too Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Fix white noise on Asus UX501VW headset Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] Input: max8997-haptic - fix NULL pointer dereference Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] drm/i915: Bail out of pipe config compute loop on LPT Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Fix broken reconfig Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Asus N750JV external subwoofer fixup Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Fix white noise on Asus N750JV headphone Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] ALSA: usb-audio: Yet another Phoneix Audio device quirk Sasha Levin
2016-05-19 4:19 ` [added to the 4.1 stable tree] perf/core: Disable the event on a truncated AUX record Sasha Levin
2016-05-23 6:59 ` Alexander Shishkin
2016-05-30 21:50 ` Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] tools lib traceevent: Do not reassign parg after collapse_tree() Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] workqueue: fix rebind bound workers warning Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] drm/radeon: fix DP mode validation Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] ocfs2: fix SGID not inherited issue Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] ocfs2: fix posix_acl_create deadlock Sasha Levin
2016-05-19 4:20 ` [added to the 4.1 stable tree] nf_conntrack: avoid kernel pointer value leak in slab name Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1463631606-32540-29-git-send-email-sasha.levin@oracle.com \
--to=sasha.levin@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=dnelson@redhat.com \
--cc=hughd@google.com \
--cc=kirill@shutemov.name \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=rientjes@google.com \
--cc=stable-commits@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=tony.luck@intel.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox