All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Mark Rutland <mark.rutland@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Steve Capper <steve.capper@arm.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mel Gorman <mgorman@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.18 23/36] mm: numa: avoid waiting on freed migrated pages
Date: Mon,  3 Jul 2017 15:34:20 +0200	[thread overview]
Message-ID: <20170703133257.228051816@linuxfoundation.org> (raw)
In-Reply-To: <20170703133256.260692013@linuxfoundation.org>

3.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>

commit 3c226c637b69104f6b9f1c6ec5b08d7b741b3229 upstream.

In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry.  However,
we can race with migrate_misplaced_transhuge_page():

    // do_huge_pmd_numa_page                // migrate_misplaced_transhuge_page()
    // Holds 0 refs on page                 // Holds 2 refs on page

    vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
    /* ... */
    if (pmd_trans_migrating(*vmf->pmd)) {
            page = pmd_page(*vmf->pmd);
            spin_unlock(vmf->ptl);
                                            ptl = pmd_lock(mm, pmd);
                                            if (page_count(page) != 2)) {
                                                    /* roll back */
                                            }
                                            /* ... */
                                            mlock_migrate_page(new_page, page);
                                            /* ... */
                                            spin_unlock(ptl);
                                            put_page(page);
                                            put_page(page); // page freed here
            wait_on_page_locked(page);
            goto out;
    }

This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions.  This has been observed on arm64 KVM guests.

We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().

When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.

Fixes: b8916634b77bffb2 ("mm: Prevent parallel splits during THP migration")
Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 mm/huge_memory.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1294,8 +1294,12 @@ int do_huge_pmd_numa_page(struct mm_stru
 	 * check_same as the page may no longer be mapped.
 	 */
 	if (unlikely(pmd_trans_migrating(*pmdp))) {
+		page = pmd_page(*pmdp);
+		if (!get_page_unless_zero(page))
+			goto out_unlock;
 		spin_unlock(ptl);
 		wait_migrate_huge_page(vma->anon_vma, pmdp);
+		put_page(page);
 		goto out;
 	}
 
@@ -1331,8 +1335,11 @@ int do_huge_pmd_numa_page(struct mm_stru
 
 	/* Migration could have started since the pmd_trans_migrating check */
 	if (!page_locked) {
+		if (!get_page_unless_zero(page))
+			goto out_unlock;
 		spin_unlock(ptl);
 		wait_on_page_locked(page);
+		put_page(page);
 		page_nid = -1;
 		goto out;
 	}

  parent reply	other threads:[~2017-07-03 13:35 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-03 13:33 [PATCH 3.18 00/36] 3.18.60-stable review Greg Kroah-Hartman
2017-07-03 13:33 ` [PATCH 3.18 01/36] xhci: fix deadlock at host remove by running watchdog correctly Greg Kroah-Hartman
2017-07-03 13:33 ` [PATCH 3.18 02/36] ipv6: release dst on error in ip6_dst_lookup_tail Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 03/36] netfilter: xt_TCPMSS: add more sanity tests on tcph->doff Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 04/36] netfilter: synproxy: fix conntrackd interaction Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 05/36] net: dont call strlen on non-terminated string in dev_set_alias() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 06/36] decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 07/36] Fix an intermittent pr_emerg warning about lo becoming free Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 08/36] net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 09/36] igmp: acquire pmc lock for ip_mc_clear_src() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 10/36] igmp: add a missing spin_lock_init() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 11/36] ipv6: fix calling in6_ifa_hold incorrectly for dad work Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 12/36] decnet: always not take dst->__refcnt when inserting dst into hash table Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 13/36] net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 14/36] NFSv4: fix a reference leak caused WARNING messages Greg Kroah-Hartman
2017-07-03 14:33   ` Trond Myklebust
2017-07-03 15:02     ` gregkh
2017-07-03 15:02       ` gregkh
2017-07-03 13:34 ` [PATCH 3.18 15/36] arm64: cpuinfo: Missing NULL terminator in compat_hwcap_str Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 16/36] MIPS: Avoid accidental raw backtrace Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 17/36] MIPS: pm-cps: Drop manual cache-line alignment of ready_count Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 18/36] MIPS: Fix IRQ tracing & lockdep when rescheduling Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 19/36] ALSA: hda - set input_path bitmap to zero after moving it to new place Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 20/36] drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 21/36] usb: gadget: f_fs: Fix possibe deadlock Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 22/36] sysctl: enable strict writes Greg Kroah-Hartman
2017-07-03 13:34 ` Greg Kroah-Hartman [this message]
2017-07-03 13:34 ` [PATCH 3.18 25/36] net: korina: Fix NAPI versus resources freeing Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 27/36] xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 28/36] xfrm: NULL dereference on allocation failure Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 29/36] xfrm: Oops on error in pfkey_msg2xfrm_state() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 30/36] watchdog: bcm281xx: Fix use of uninitialized spinlock Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 31/36] ARM: 8685/1: ensure memblock-limit is pmd-aligned Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 32/36] iommu/vt-d: Dont over-free page table directories Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 33/36] iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 3.18 34/36] cpufreq: s3c2416: double free on driver init error path Greg Kroah-Hartman
2017-07-03 19:34 ` [PATCH 3.18 00/36] 3.18.60-stable review Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170703133257.228051816@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mgorman@suse.de \
    --cc=stable@vger.kernel.org \
    --cc=steve.capper@arm.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.