From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753120AbbBSNG4 (ORCPT );
	Thu, 19 Feb 2015 08:06:56 -0500
Received: from smtp02.citrix.com ([66.165.176.63]:32801 "EHLO
	SMTP02.CITRIX.COM" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752775AbbBSNGz (ORCPT );
	Thu, 19 Feb 2015 08:06:55 -0500
X-IronPort-AV: E=Sophos;i="5.09,608,1418083200"; d="scan'208";a="228896724"
Message-ID: <54E5DFED.9050700@citrix.com>
Date: Thu, 19 Feb 2015 13:06:53 +0000
From: David Vrabel
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.4.0
MIME-Version: 1.0
To: Mel Gorman, Linus Torvalds
CC: "linux-kernel@vger.kernel.org", "Xen-devel@lists.xen.org"
Subject: NUMA_BALANCING and Xen PV guest regression in 3.20-rc0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
X-DLP: MIA2
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Mel,

The NUMA_BALANCING series beginning with 5d833062139d (mm: numa: do not
dereference pmd outside of the lock during NUMA hinting fault) and
specifically 8a0516ed8b90 (mm: convert p[te|md]_numa users to
p[te|md]_protnone_numa) breaks Xen 64-bit PV guests.

Any fault on a present userspace mapping (e.g., a write to a read-only
mapping) is being misinterpreted as a NUMA hinting fault and not handled
correctly. All userspace programs end up continuously faulting.

This is because the hypervisor sets _PAGE_GLOBAL (== _PAGE_PROTNONE) on
all present userspace page table entries.

Note the comment in asm/pgtable_types.h, which says that
_PAGE_BIT_PROTNONE is only valid on non-present entries:

/* If _PAGE_BIT_PRESENT is clear, we use these: */
/* - if the user mapped it with PROT_NONE; pte_present gives true */
#define _PAGE_BIT_PROTNONE	_PAGE_BIT_GLOBAL

Adjusting pte_protnone() and pmd_protnone() to check for the absence of
_PAGE_PRESENT allows 64-bit Xen PV guests to work correctly again (see
following patch), but I'm not sure if NUMA_BALANCING would correctly
work with this change.

David
8<---------------------------
x86: pte_protnone() and pmd_protnone() must check entry is not present

Since _PAGE_PROTNONE aliases _PAGE_GLOBAL it is only valid if
_PAGE_PRESENT is clear. Make pte_protnone() and pmd_protnone() check
for this.

This fixes a 64-bit Xen PV guest regression introduced by
8a0516ed8b90c95ffa1363b420caa37418149f21 (mm: convert p[te|md]_numa
users to p[te|md]_protnone_numa). Any userspace process would
endlessly fault.

In a 64-bit PV guest, userspace page table entries have _PAGE_GLOBAL
set by the hypervisor. This meant that any fault on a present
userspace entry (e.g., a write to a read-only mapping) would be
misinterpreted as a NUMA hinting fault and the fault would not be
correctly handled, resulting in the access endlessly faulting.

Signed-off-by: David Vrabel
Cc: Mel Gorman
---
 arch/x86/include/asm/pgtable.h |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 67fc3d2..a0c35bf 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -476,12 +476,14 @@ static inline int pmd_present(pmd_t pmd)
  */
 static inline int pte_protnone(pte_t pte)
 {
-	return pte_flags(pte) & _PAGE_PROTNONE;
+	return (pte_flags(pte) & (_PAGE_PROTNONE | _PAGE_PRESENT))
+		== _PAGE_PROTNONE;
 }
 
 static inline int pmd_protnone(pmd_t pmd)
 {
-	return pmd_flags(pmd) & _PAGE_PROTNONE;
+	return (pmd_flags(pmd) & (_PAGE_PROTNONE | _PAGE_PRESENT))
+		== _PAGE_PROTNONE;
 }
 #endif /* CONFIG_NUMA_BALANCING */
 
-- 
1.7.10.4
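
For illustration only, here is a minimal userspace sketch of why the old
check misfires on a present entry that has the global bit set. It is not
kernel code: the DEMO_* constants and demo_* functions are hypothetical
stand-ins that mirror the x86 bit positions (present = bit 0, user = bit 2,
global = bit 8), and the old check is normalised to 0/1 for readability.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the x86 flag bits; not the kernel's own macros. */
#define DEMO_PAGE_PRESENT	(UINT64_C(1) << 0)
#define DEMO_PAGE_USER		(UINT64_C(1) << 2)
#define DEMO_PAGE_GLOBAL	(UINT64_C(1) << 8)
#define DEMO_PAGE_PROTNONE	DEMO_PAGE_GLOBAL	/* same bit as GLOBAL */

/* Check before the patch: looks at the aliased bit alone. */
static inline int demo_protnone_old(uint64_t flags)
{
	return (flags & DEMO_PAGE_PROTNONE) != 0;
}

/* Check from the patch: PROTNONE set and PRESENT clear. */
static inline int demo_protnone_new(uint64_t flags)
{
	return (flags & (DEMO_PAGE_PROTNONE | DEMO_PAGE_PRESENT))
		== DEMO_PAGE_PROTNONE;
}

/* Exercises both checks on the two cases discussed in the mail. */
static inline void demo_protnone_selfcheck(void)
{
	/* Present user mapping with the global bit set (the PV guest case). */
	uint64_t pv_user = DEMO_PAGE_PRESENT | DEMO_PAGE_USER | DEMO_PAGE_GLOBAL;
	/* Non-present entry carrying only the PROT_NONE marker. */
	uint64_t prot_none = DEMO_PAGE_PROTNONE;

	assert(demo_protnone_old(pv_user));	/* false positive */
	assert(!demo_protnone_new(pv_user));	/* fixed by the patch */
	assert(demo_protnone_old(prot_none));	/* genuine PROT_NONE entry */
	assert(demo_protnone_new(prot_none));	/* still recognised */
}
```

Calling demo_protnone_selfcheck() from a small main() should pass all four
assertions. Whether NUMA hinting itself still behaves as intended with the
stricter check is the open question raised above and is not covered here.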