All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Xishi Qiu <qiuxishi@huawei.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: [PATCH 3.10 26/26] mm: fix process accidentally killed by mce because of huge page migration
Date: Tue, 18 Feb 2014 14:47:22 -0800	[thread overview]
Message-ID: <20140218224531.147324861@linuxfoundation.org> (raw)
In-Reply-To: <20140218224530.398913499@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Xishi Qiu <qiuxishi@huawei.com>

Based on c8721bbbdd36382de51cd6b7a56322e0acca2414 upstream, but only the
bugfix portion pulled out.

Hi Naoya or Greg,

We found a bug in 3.10.x.
The problem is that we accidentally have a hwpoisoned hugepage in free
hugepage list. It could happend in the the following scenario:

        process A                           process B

  migrate_huge_page
  put_page (old hugepage)
    linked to free hugepage list
                                     hugetlb_fault
                                       hugetlb_no_page
                                         alloc_huge_page
                                           dequeue_huge_page_vma
                                             dequeue_huge_page_node
                                               (steal hwpoisoned hugepage)
  set_page_hwpoison_huge_page
  dequeue_hwpoisoned_huge_page
    (fail to dequeue)

I tested this bug, one process keeps allocating huge page, and I 
use sysfs interface to soft offline a huge page, then received:
"MCE: Killing UCP:2717 due to hardware memory corruption fault at 8200034"

Upstream kernel is free from this bug because of these two commits:

f15bdfa802bfa5eb6b4b5a241b97ec9fa1204a35
mm/memory-failure.c: fix memory leak in successful soft offlining

c8721bbbdd36382de51cd6b7a56322e0acca2414
mm: memory-hotplug: enable memory hotplug to handle hugepage

The first one, although the problem is about memory leak, this patch
moves unset_migratetype_isolate(), which is important to avoid the race.
The latter is not a bug fix and it's too big, so I rewrite a small one.

The following patch can fix this bug.(please apply f15bdfa802bf first)

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/hugetlb.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -21,6 +21,7 @@
 #include <linux/rmap.h>
 #include <linux/swap.h>
 #include <linux/swapops.h>
+#include <linux/page-isolation.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -517,9 +518,15 @@ static struct page *dequeue_huge_page_no
 {
 	struct page *page;
 
-	if (list_empty(&h->hugepage_freelists[nid]))
+	list_for_each_entry(page, &h->hugepage_freelists[nid], lru)
+		if (!is_migrate_isolate_page(page))
+			break;
+	/*
+	 * if 'non-isolated free hugepage' not found on the list,
+	 * the allocation fails.
+	 */
+	if (&h->hugepage_freelists[nid] == &page->lru)
 		return NULL;
-	page = list_entry(h->hugepage_freelists[nid].next, struct page, lru);
 	list_move(&page->lru, &h->hugepage_activelist);
 	set_page_refcounted(page);
 	h->free_huge_pages--;



  parent reply	other threads:[~2014-02-18 22:46 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-18 22:46 [PATCH 3.10 00/26] 3.10.31-stable review Greg Kroah-Hartman
2014-02-18 22:46 ` [PATCH 3.10 01/26] SELinux: Fix kernel BUG on empty security contexts Greg Kroah-Hartman
2014-02-18 22:46 ` [PATCH 3.10 02/26] Btrfs: disable snapshot aware defrag for now Greg Kroah-Hartman
2014-02-18 22:46 ` [PATCH 3.10 03/26] crypto: s390 - fix concurrency issue in aes-ctr mode Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 04/26] crypto: s390 - fix des and des3_ede cbc concurrency issue Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 05/26] crypto: s390 - fix des and des3_ede ctr " Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 06/26] irqchip: armada-370-xp: fix IPI race condition Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 07/26] arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 08/26] arm64: vdso: prevent ld from aligning PT_LOAD segments to 64k Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 09/26] arm64: Invalidate the TLB when replacing pmd entries during boot Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 10/26] arm64: vdso: fix coarse clock handling Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 11/26] arm64: add DSB after icache flush in __flush_icache_all() Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 12/26] ALSA: usb-audio: Add missing kconfig dependecy Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 13/26] ALSA: hda - Fix missing VREF setup for Mac Pro 1,1 Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 14/26] ALSA: hda - Add missing mixer widget for AD1983 Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 15/26] mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq() Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 16/26] mm: __set_page_dirty uses spin_lock_irqsave instead of spin_lock_irq Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 17/26] x86: mm: change tlb_flushall_shift for IvyBridge Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 18/26] [media] af9035: add ID [2040:f900] Hauppauge WinTV-MiniStick 2 Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 20/26] x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 21/26] pinctrl: vt8500: Change devicetree data parsing Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 22/26] pinctrl: protect pinctrl_list add Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 23/26] mm/memory-failure.c: fix memory leak in successful soft offlining Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 24/26] IB/qib: Convert qib_user_sdma_pin_pages() to use get_user_pages_fast() Greg Kroah-Hartman
2014-02-18 22:47 ` [PATCH 3.10 25/26] intel_pstate: Take core C0 time into account for core busy calculation Greg Kroah-Hartman
2014-02-18 22:47 ` Greg Kroah-Hartman [this message]
2014-02-19  4:27 ` [PATCH 3.10 00/26] 3.10.31-stable review Guenter Roeck
2014-02-20  0:29 ` Shuah Khan
2014-02-20  7:30   ` Xishi Qiu
2014-02-20 13:39     ` Shuah Khan
2014-02-20 17:01       ` Shuah Khan
2014-02-20 17:12         ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140218224531.147324861@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=qiuxishi@huawei.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.