From: David Vrabel <david.vrabel@citrix.com>
To: Mel Gorman <mgorman@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Xen-devel@lists.xen.org" <Xen-devel@lists.xen.org>
Subject: NUMA_BALANCING and Xen PV guest regression in 3.20-rc0
Date: Thu, 19 Feb 2015 13:06:53 +0000
Message-ID: <54E5DFED.9050700@citrix.com>

Mel,

The NUMA_BALANCING series beginning with 5d833062139d (mm: numa: do not
dereference pmd outside of the lock during NUMA hinting fault) and
specifically 8a0516ed8b90 (mm: convert p[te|md]_numa users to
p[te|md]_protnone_numa) breaks Xen 64-bit PV guests.

Any fault on a present userspace mapping (e.g., a write to a read-only
mapping) is misinterpreted as a NUMA hinting fault and is not handled
correctly.  All userspace programs end up faulting continuously.

This is because the hypervisor sets _PAGE_GLOBAL (== _PAGE_PROTNONE) on
all present userspace page table entries.

Note the comment in asm/pgtable_types.h, which says that
_PAGE_BIT_PROTNONE is only valid on non-present entries:

  /* If _PAGE_BIT_PRESENT is clear, we use these: */
  /* - if the user mapped it with PROT_NONE; pte_present gives true */
  #define _PAGE_BIT_PROTNONE	_PAGE_BIT_GLOBAL

Adjusting pte_protnone() and pmd_protnone() to check for the absence of
_PAGE_PRESENT allows 64-bit Xen PV guests to work correctly again (see
following patch), but I'm not sure whether NUMA_BALANCING would still
work correctly with this change.
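
To illustrate the difference between the two checks, here is a minimal
userspace sketch.  It assumes the x86 bit positions (_PAGE_BIT_PRESENT
is bit 0, _PAGE_BIT_GLOBAL is bit 8).  The PAGE_* macros and the
protnone_old(), protnone_new() and protnone_selfcheck() helpers are
names made up for this example; they are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-ins for the kernel's _PAGE_* flags (leading
 * underscore dropped so the sketch stays clear of reserved names).
 * Bit positions are assumed to match x86.
 */
#define PAGE_BIT_PRESENT   0
#define PAGE_BIT_GLOBAL    8
#define PAGE_BIT_PROTNONE  PAGE_BIT_GLOBAL	/* aliases GLOBAL */

#define PAGE_PRESENT   (UINT64_C(1) << PAGE_BIT_PRESENT)
#define PAGE_GLOBAL    (UINT64_C(1) << PAGE_BIT_GLOBAL)
#define PAGE_PROTNONE  (UINT64_C(1) << PAGE_BIT_PROTNONE)

/* Check as it stood before the patch: looks only at the PROTNONE bit. */
static inline int protnone_old(uint64_t flags)
{
	return (flags & PAGE_PROTNONE) != 0;
}

/* Check from the patch: PROTNONE must be set and PRESENT must be clear. */
static inline int protnone_new(uint64_t flags)
{
	return (flags & (PAGE_PROTNONE | PAGE_PRESENT)) == PAGE_PROTNONE;
}

/* Hypothetical self-check covering the cases discussed in this mail. */
static inline void protnone_selfcheck(void)
{
	/* Present userspace entry with GLOBAL set, as in a 64-bit PV guest. */
	uint64_t xen_pv_user = PAGE_PRESENT | PAGE_GLOBAL;
	/* Non-present entry marked PROT_NONE. */
	uint64_t prot_none = PAGE_PROTNONE;

	assert(protnone_old(xen_pv_user));	/* old check misfires */
	assert(!protnone_new(xen_pv_user));	/* new check does not */
	assert(protnone_old(prot_none));
	assert(protnone_new(prot_none));
	assert(!protnone_new(PAGE_PRESENT));	/* ordinary present entry */
}
```

With these definitions, the only input on which the two helpers disagree
is an entry that has both PRESENT and GLOBAL set, which is the case that
64-bit PV guests hit for every userspace mapping.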

David

8<---------------------------
x86: pte_protnone() and pmd_protnone() must check entry is
 not present

Since _PAGE_PROTNONE aliases _PAGE_GLOBAL, it is only valid if
_PAGE_PRESENT is clear.  Make pte_protnone() and pmd_protnone() check
for this.

This fixes a 64-bit Xen PV guest regression introduced by
8a0516ed8b90c95ffa1363b420caa37418149f21 (mm: convert p[te|md]_numa
users to p[te|md]_protnone_numa).  Any userspace process would
endlessly fault.

In a 64-bit PV guest, userspace page table entries have _PAGE_GLOBAL
set by the hypervisor.  This meant that any fault on a present
userspace entry (e.g., a write to a read-only mapping) would be
misinterpreted as a NUMA hinting fault and the fault would not be
correctly handled, resulting in the access endlessly faulting.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: Mel Gorman <mgorman@suse.de>
---
 arch/x86/include/asm/pgtable.h |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 67fc3d2..a0c35bf 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -476,12 +476,14 @@ static inline int pmd_present(pmd_t pmd)
  */
 static inline int pte_protnone(pte_t pte)
 {
-	return pte_flags(pte) & _PAGE_PROTNONE;
+	return (pte_flags(pte) & (_PAGE_PROTNONE | _PAGE_PRESENT))
+		== _PAGE_PROTNONE;
 }

 static inline int pmd_protnone(pmd_t pmd)
 {
-	return pmd_flags(pmd) & _PAGE_PROTNONE;
+	return (pmd_flags(pmd) & (_PAGE_PROTNONE | _PAGE_PRESENT))
+		== _PAGE_PROTNONE;
 }
 #endif /* CONFIG_NUMA_BALANCING */

-- 
1.7.10.4
