From: <gregkh@linuxfoundation.org>
To: kirill.shutemov@linux.intel.com, aarcange@redhat.com,
akpm@linux-foundation.org, gregkh@linuxfoundation.org,
hillf.zj@alibaba-inc.com, jinpu.wang@profitbricks.com,
stable@vger.kernel.org, torvalds@linux-foundation.org
Cc: <stable@vger.kernel.org>, <stable-commits@vger.kernel.org>
Subject: Patch "thp: fix MADV_DONTNEED vs. numa balancing race" has been added to the 4.9-stable tree
Date: Tue, 12 Dec 2017 09:33:54 +0100 [thread overview]
Message-ID: <151306763469232@kroah.com> (raw)
This is a note to let you know that I've just added the patch titled
thp: fix MADV_DONTNEED vs. numa balancing race
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
thp-fix-madv_dontneed-vs.-numa-balancing-race.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.
>From ced108037c2aa542b3ed8b7afd1576064ad1362a Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Thu, 13 Apr 2017 14:56:20 -0700
Subject: thp: fix MADV_DONTNEED vs. numa balancing race
From: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
commit ced108037c2aa542b3ed8b7afd1576064ad1362a upstream.
In case prot_numa, we are under down_read(mmap_sem). It's critical to
not clear pmd intermittently to avoid race with MADV_DONTNEED which is
also under down_read(mmap_sem):
CPU0: CPU1:
change_huge_pmd(prot_numa=1)
pmdp_huge_get_and_clear_notify()
madvise_dontneed()
zap_pmd_range()
pmd_trans_huge(*pmd) == 0 (without ptl)
// skip the pmd
set_pmd_at();
// pmd is re-established
The race makes MADV_DONTNEED miss the huge pmd and don't clear it
which may break userspace.
Found by code analysis, never saw triggered.
Link: http://lkml.kernel.org/r/20170302151034.27829-3-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[jwang: adjust context for 4.9 ]
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/huge_memory.c | 34 +++++++++++++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1531,7 +1531,39 @@ int change_huge_pmd(struct vm_area_struc
if (prot_numa && pmd_protnone(*pmd))
goto unlock;
- entry = pmdp_huge_get_and_clear_notify(mm, addr, pmd);
+ /*
+ * In case prot_numa, we are under down_read(mmap_sem). It's critical
+ * to not clear pmd intermittently to avoid race with MADV_DONTNEED
+ * which is also under down_read(mmap_sem):
+ *
+ * CPU0: CPU1:
+ * change_huge_pmd(prot_numa=1)
+ * pmdp_huge_get_and_clear_notify()
+ * madvise_dontneed()
+ * zap_pmd_range()
+ * pmd_trans_huge(*pmd) == 0 (without ptl)
+ * // skip the pmd
+ * set_pmd_at();
+ * // pmd is re-established
+ *
+ * The race makes MADV_DONTNEED miss the huge pmd and don't clear it
+ * which may break userspace.
+ *
+ * pmdp_invalidate() is required to make sure we don't miss
+ * dirty/young flags set by hardware.
+ */
+ entry = *pmd;
+ pmdp_invalidate(vma, addr, pmd);
+
+ /*
+ * Recover dirty/young flags. It relies on pmdp_invalidate to not
+ * corrupt them.
+ */
+ if (pmd_dirty(*pmd))
+ entry = pmd_mkdirty(entry);
+ if (pmd_young(*pmd))
+ entry = pmd_mkyoung(entry);
+
entry = pmd_modify(entry, newprot);
if (preserve_write)
entry = pmd_mkwrite(entry);
Patches currently in stable-queue which might be from kirill.shutemov@linux.intel.com are
queue-4.9/mm-drop-unused-pmdp_huge_get_and_clear_notify.patch
queue-4.9/thp-fix-madv_dontneed-vs.-numa-balancing-race.patch
queue-4.9/thp-reduce-indentation-level-in-change_huge_pmd.patch
reply other threads:[~2017-12-12 8:33 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=151306763469232@kroah.com \
--to=gregkh@linuxfoundation.org \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hillf.zj@alibaba-inc.com \
--cc=jinpu.wang@profitbricks.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=stable-commits@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.