From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-yk0-f176.google.com (mail-yk0-f176.google.com [209.85.160.176]) by kanga.kvack.org (Postfix) with ESMTP id 58FD06B0038 for ; Thu, 9 Jul 2015 13:04:52 -0400 (EDT)
Received: by ykeo3 with SMTP id o3so121036643yke.0 for ; Thu, 09 Jul 2015 10:04:52 -0700 (PDT)
Received: from g4t3427.houston.hp.com (g4t3427.houston.hp.com. [15.201.208.55]) by mx.google.com with ESMTPS id y184si3883262yky.22.2015.07.09.10.04.50 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 09 Jul 2015 10:04:50 -0700 (PDT)
From: Toshi Kani
Subject: [PATCH 0/2] x86, mm: Fix PAT bit handling of large pages
Date: Thu, 9 Jul 2015 11:03:49 -0600
Message-Id: <1436461431-27305-1-git-send-email-toshi.kani@hp.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com
Cc: akpm@linux-foundation.org, bp@alien8.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, jgross@suse.com, konrad.wilk@oracle.com, elliott@hp.com

The PAT bit gets relocated to bit 12 when PUD and PMD mappings are
used.  This bit 12, however, is not covered by PTE_FLAGS_MASK, which
is currently used for masking the flag bits for all cases.

Patch 1/2 fixes pud_flags() and pmd_flags() to handle the PAT bit
when PUD and PMD mappings are used.

Patch 2/2 fixes /sys/kernel/debug/kernel_page_tables to show the
PAT bit properly.

Note, the PAT bit is first enabled in 4.2-rc1 with WT mappings.

---
Toshi Kani (2):
 1/2 x86: Fix pXd_flags() to handle _PAGE_PAT_LARGE
 2/2 x86, mm: Fix page table dump to show PAT bit

---
 arch/x86/include/asm/pgtable_types.h | 16 ++++++++++++---
 arch/x86/mm/dump_pagetables.c        | 39 +++++++++++++++++++-----------------
 2 files changed, 34 insertions(+), 21 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-yk0-f169.google.com (mail-yk0-f169.google.com [209.85.160.169]) by kanga.kvack.org (Postfix) with ESMTP id 5AFF76B0038 for ; Thu, 9 Jul 2015 13:04:53 -0400 (EDT)
Received: by ykee186 with SMTP id e186so41566462yke.2 for ; Thu, 09 Jul 2015 10:04:53 -0700 (PDT)
Received: from g9t5009.houston.hp.com (g9t5009.houston.hp.com. [15.240.92.67]) by mx.google.com with ESMTPS id s190si4139818ywd.126.2015.07.09.10.04.50 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 09 Jul 2015 10:04:51 -0700 (PDT)
From: Toshi Kani
Subject: [PATCH 1/2] x86: Fix pXd_flags() to handle _PAGE_PAT_LARGE
Date: Thu, 9 Jul 2015 11:03:50 -0600
Message-Id: <1436461431-27305-2-git-send-email-toshi.kani@hp.com>
In-Reply-To: <1436461431-27305-1-git-send-email-toshi.kani@hp.com>
References: <1436461431-27305-1-git-send-email-toshi.kani@hp.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com
Cc: akpm@linux-foundation.org, bp@alien8.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, jgross@suse.com, konrad.wilk@oracle.com, elliott@hp.com, Toshi Kani

The PAT bit gets relocated to bit 12 when PUD and PMD mappings are
used.  This bit 12, however, is not covered by PTE_FLAGS_MASK, which
is currently used for masking the flag bits for all cases.

Fix pud_flags() and pmd_flags() to cover the PAT bit, _PAGE_PAT_LARGE,
when they are used to map a large page with _PAGE_PSE set.

Signed-off-by: Toshi Kani
Cc: Juergen Gross
Cc: Konrad Wilk
Cc: Robert Elliott
Cc: Thomas Gleixner
Cc: H. Peter Anvin
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Andrew Morton
---
 arch/x86/include/asm/pgtable_types.h | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 13f310b..caaf45c 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -212,9 +212,13 @@ enum page_cache_mode {
 /* PTE_PFN_MASK extracts the PFN from a (pte|pmd|pud|pgd)val_t */
 #define PTE_PFN_MASK		((pteval_t)PHYSICAL_PAGE_MASK)
 
-/* PTE_FLAGS_MASK extracts the flags from a (pte|pmd|pud|pgd)val_t */
+/* Extracts the flags from a (pte|pmd|pud|pgd)val_t of a 4KB page */
 #define PTE_FLAGS_MASK		(~PTE_PFN_MASK)
 
+/* Extracts the flags from a (pmd|pud)val_t of a (1GB|2MB) page */
+#define PMD_FLAGS_MASK_LARGE	((~PTE_PFN_MASK) | _PAGE_PAT_LARGE)
+#define PUD_FLAGS_MASK_LARGE	((~PTE_PFN_MASK) | _PAGE_PAT_LARGE)
+
 typedef struct pgprot { pgprotval_t pgprot; } pgprot_t;
 
 typedef struct { pgdval_t pgd; } pgd_t;
@@ -278,12 +282,18 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
 
 static inline pudval_t pud_flags(pud_t pud)
 {
-	return native_pud_val(pud) & PTE_FLAGS_MASK;
+	if (native_pud_val(pud) & _PAGE_PSE)
+		return native_pud_val(pud) & PUD_FLAGS_MASK_LARGE;
+	else
+		return native_pud_val(pud) & PTE_FLAGS_MASK;
 }
 
 static inline pmdval_t pmd_flags(pmd_t pmd)
 {
-	return native_pmd_val(pmd) & PTE_FLAGS_MASK;
+	if (native_pmd_val(pmd) & _PAGE_PSE)
+		return native_pmd_val(pmd) & PMD_FLAGS_MASK_LARGE;
+	else
+		return native_pmd_val(pmd) & PTE_FLAGS_MASK;
 }
 
 static inline pte_t native_make_pte(pteval_t val)
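
The masking problem that patch 1/2 addresses can be reproduced outside the kernel. What follows is a minimal user-space sketch, not the kernel code: the 46-bit physical address width, the use of plain uint64_t values in place of pmd_t, and the helper names pmd_flags_old(), pmd_flags_new() and make_large_pmd() are all assumptions made for illustration. It shows that masking a 2MB entry with PTE_FLAGS_MASK drops bit 12, while a mask that ORs in _PAGE_PAT_LARGE keeps it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 values for illustration only; the real definitions
 * live in arch/x86/include/asm/ and depend on the kernel config. */
#define PAGE_SHIFT           12
#define PHYS_MASK_SHIFT      46 /* assumption: 46-bit physical addresses */
#define PHYSICAL_PAGE_MASK   ((((uint64_t)1 << PHYS_MASK_SHIFT) - 1) & \
                              ~(((uint64_t)1 << PAGE_SHIFT) - 1))
#define PTE_PFN_MASK         PHYSICAL_PAGE_MASK
#define PTE_FLAGS_MASK       (~PTE_PFN_MASK)
#define _PAGE_PSE            ((uint64_t)1 << 7)
#define _PAGE_PAT_LARGE      ((uint64_t)1 << 12)
#define PMD_FLAGS_MASK_LARGE (PTE_FLAGS_MASK | _PAGE_PAT_LARGE)

/* Hypothetical helper: build a 2MB PMD value from an aligned address. */
static uint64_t make_large_pmd(uint64_t phys, uint64_t flags)
{
    assert((phys & (((uint64_t)1 << 21) - 1)) == 0); /* 2MB aligned */
    return phys | flags | _PAGE_PSE;
}

/* Behaviour before the patch: one mask for every kind of entry. */
static uint64_t pmd_flags_old(uint64_t pmdval)
{
    return pmdval & PTE_FLAGS_MASK;
}

/* Behaviour after the patch: large entries also keep bit 12. */
static uint64_t pmd_flags_new(uint64_t pmdval)
{
    if (pmdval & _PAGE_PSE)
        return pmdval & PMD_FLAGS_MASK_LARGE;
    return pmdval & PTE_FLAGS_MASK;
}
```

For a 2MB entry built with _PAGE_PAT_LARGE set, pmd_flags_old() loses the bit and pmd_flags_new() returns it. An entry without _PAGE_PSE is masked identically by both.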
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-oi0-f52.google.com (mail-oi0-f52.google.com [209.85.218.52]) by kanga.kvack.org (Postfix) with ESMTP id 6E6296B0253 for ; Thu, 9 Jul 2015 13:04:55 -0400 (EDT)
Received: by oihr66 with SMTP id r66so137889363oih.2 for ; Thu, 09 Jul 2015 10:04:55 -0700 (PDT)
Received: from g9t5008.houston.hp.com (g9t5008.houston.hp.com. [15.240.92.66]) by mx.google.com with ESMTPS id p132si4671044oig.125.2015.07.09.10.04.52 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 09 Jul 2015 10:04:52 -0700 (PDT)
From: Toshi Kani
Subject: [PATCH 2/2] x86, mm: Fix page table dump to show PAT bit
Date: Thu, 9 Jul 2015 11:03:51 -0600
Message-Id: <1436461431-27305-3-git-send-email-toshi.kani@hp.com>
In-Reply-To: <1436461431-27305-1-git-send-email-toshi.kani@hp.com>
References: <1436461431-27305-1-git-send-email-toshi.kani@hp.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com
Cc: akpm@linux-foundation.org, bp@alien8.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, jgross@suse.com, konrad.wilk@oracle.com, elliott@hp.com, Toshi Kani

/sys/kernel/debug/kernel_page_tables does not show the PAT bit for
PUD and PMD mappings.  This is because walk_pud_level(),
walk_pmd_level() and note_page() mask the flags with PTE_FLAGS_MASK,
which does not cover their PAT bit, _PAGE_PAT_LARGE.

Fix it by replacing the use of PTE_FLAGS_MASK with pXd_flags(),
which masks the flags properly.

Also change the dump to show the PAT bit as "PAT" to be consistent
with other bits.

Reported-by: Robert Elliott
Signed-off-by: Toshi Kani
Cc: Juergen Gross
Cc: Konrad Wilk
Cc: Robert Elliott
Cc: Thomas Gleixner
Cc: H. Peter Anvin
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Andrew Morton
---
 arch/x86/mm/dump_pagetables.c | 39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index f0cedf3..71ab2d7 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -155,7 +155,7 @@ static void printk_prot(struct seq_file *m, pgprot_t prot, int level, bool dmsg)
 			pt_dump_cont_printf(m, dmsg, "    ");
 		if ((level == 4 && pr & _PAGE_PAT) ||
 		    ((level == 3 || level == 2) && pr & _PAGE_PAT_LARGE))
-			pt_dump_cont_printf(m, dmsg, "pat ");
+			pt_dump_cont_printf(m, dmsg, "PAT ");
 		else
 			pt_dump_cont_printf(m, dmsg, "    ");
 		if (pr & _PAGE_GLOBAL)
@@ -198,8 +198,8 @@ static void note_page(struct seq_file *m, struct pg_state *st,
 	 * we have now. "break" is either changing perms, levels or
 	 * address space marker.
 	 */
-	prot = pgprot_val(new_prot) & PTE_FLAGS_MASK;
-	cur = pgprot_val(st->current_prot) & PTE_FLAGS_MASK;
+	prot = pgprot_val(new_prot);
+	cur = pgprot_val(st->current_prot);
 
 	if (!st->level) {
 		/* First entry */
@@ -269,13 +269,13 @@ static void walk_pte_level(struct seq_file *m, struct pg_state *st, pmd_t addr,
 {
 	int i;
 	pte_t *start;
+	pgprotval_t prot;
 
 	start = (pte_t *) pmd_page_vaddr(addr);
 	for (i = 0; i < PTRS_PER_PTE; i++) {
-		pgprot_t prot = pte_pgprot(*start);
-
+		prot = pte_flags(*start);
 		st->current_address = normalize_addr(P + i * PTE_LEVEL_MULT);
-		note_page(m, st, prot, 4);
+		note_page(m, st, __pgprot(prot), 4);
 		start++;
 	}
 }
@@ -287,18 +287,19 @@ static void walk_pmd_level(struct seq_file *m, struct pg_state *st, pud_t addr,
 {
 	int i;
 	pmd_t *start;
+	pgprotval_t prot;
 
 	start = (pmd_t *) pud_page_vaddr(addr);
 	for (i = 0; i < PTRS_PER_PMD; i++) {
 		st->current_address = normalize_addr(P + i * PMD_LEVEL_MULT);
 		if (!pmd_none(*start)) {
-			pgprotval_t prot = pmd_val(*start) & PTE_FLAGS_MASK;
-
-			if (pmd_large(*start) || !pmd_present(*start))
+			if (pmd_large(*start) || !pmd_present(*start)) {
+				prot = pmd_flags(*start);
 				note_page(m, st, __pgprot(prot), 3);
-			else
+			} else {
 				walk_pte_level(m, st, *start,
 					       P + i * PMD_LEVEL_MULT);
+			}
 		} else
 			note_page(m, st, __pgprot(0), 3);
 		start++;
@@ -318,19 +319,20 @@ static void walk_pud_level(struct seq_file *m, struct pg_state *st, pgd_t addr,
 {
 	int i;
 	pud_t *start;
+	pgprotval_t prot;
 
 	start = (pud_t *) pgd_page_vaddr(addr);
 
 	for (i = 0; i < PTRS_PER_PUD; i++) {
 		st->current_address = normalize_addr(P + i * PUD_LEVEL_MULT);
 		if (!pud_none(*start)) {
-			pgprotval_t prot = pud_val(*start) & PTE_FLAGS_MASK;
-
-			if (pud_large(*start) || !pud_present(*start))
+			if (pud_large(*start) || !pud_present(*start)) {
+				prot = pud_flags(*start);
 				note_page(m, st, __pgprot(prot), 2);
-			else
+			} else {
 				walk_pmd_level(m, st, *start,
 					       P + i * PUD_LEVEL_MULT);
+			}
 		} else
 			note_page(m, st, __pgprot(0), 2);
 
@@ -351,6 +353,7 @@ void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd)
 #else
 	pgd_t *start = swapper_pg_dir;
 #endif
+	pgprotval_t prot;
 	int i;
 	struct pg_state st = {};
 
@@ -362,13 +365,13 @@ void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd)
 	for (i = 0; i < PTRS_PER_PGD; i++) {
 		st.current_address = normalize_addr(i * PGD_LEVEL_MULT);
 		if (!pgd_none(*start)) {
-			pgprotval_t prot = pgd_val(*start) & PTE_FLAGS_MASK;
-
-			if (pgd_large(*start) || !pgd_present(*start))
+			if (pgd_large(*start) || !pgd_present(*start)) {
+				prot = pgd_flags(*start);
 				note_page(m, &st, __pgprot(prot), 1);
-			else
+			} else {
 				walk_pud_level(m, &st, *start,
 					       i * PGD_LEVEL_MULT);
+			}
 		} else
 			note_page(m, &st, __pgprot(0), 1);
 
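
The PAT column logic in printk_prot() depends on the level of the entry, which is why bit 12 has to survive the masking done by the walkers before it gets there. The following is a small user-space sketch of that decision only. The function name pat_column() is hypothetical, the four-space padding is an assumption, and the level numbering (4 for a PTE, 3 for a PMD, 2 for a PUD) is taken from the note_page() calls in the diff above.

```c
#include <assert.h>
#include <stdint.h>

#define _PAGE_PAT        ((uint64_t)1 << 7)   /* PAT bit of a 4KB PTE */
#define _PAGE_PAT_LARGE  ((uint64_t)1 << 12)  /* PAT bit of a 2MB/1GB entry */

/* Hypothetical stand-in for the PAT column of printk_prot():
 * level 4 is a PTE, levels 3 and 2 are PMD and PUD entries. */
static const char *pat_column(uint64_t pr, int level)
{
    assert(level >= 1 && level <= 4);
    if ((level == 4 && (pr & _PAGE_PAT)) ||
        ((level == 3 || level == 2) && (pr & _PAGE_PAT_LARGE)))
        return "PAT ";
    return "    ";
}
```

Note that bit 7 means PAT only at level 4. At levels 3 and 2 the same bit position is _PAGE_PSE, so a large entry with only bit 7 set yields the blank column.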
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com [209.85.212.172]) by kanga.kvack.org (Postfix) with ESMTP id 970676B0038 for ; Thu, 9 Jul 2015 23:57:12 -0400 (EDT)
Received: by widjy10 with SMTP id jy10so4691691wid.1 for ; Thu, 09 Jul 2015 20:57:12 -0700 (PDT)
Received: from mx2.suse.de (cantor2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id cd7si1319192wib.4.2015.07.09.20.57.10 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 09 Jul 2015 20:57:10 -0700 (PDT)
Message-ID: <559F4293.1090801@suse.com>
Date: Fri, 10 Jul 2015 05:57:07 +0200
From: Juergen Gross
MIME-Version: 1.0
Subject: Re: [PATCH 1/2] x86: Fix pXd_flags() to handle _PAGE_PAT_LARGE
References: <1436461431-27305-1-git-send-email-toshi.kani@hp.com> <1436461431-27305-2-git-send-email-toshi.kani@hp.com>
In-Reply-To: <1436461431-27305-2-git-send-email-toshi.kani@hp.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Toshi Kani, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com
Cc: akpm@linux-foundation.org, bp@alien8.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, konrad.wilk@oracle.com, elliott@hp.com

On 07/09/2015 07:03 PM, Toshi Kani wrote:
> The PAT bit gets relocated to bit 12 when PUD and PMD mappings are
> used.  This bit 12, however, is not covered by PTE_FLAGS_MASK, which
> is currently used for masking the flag bits for all cases.
>
> Fix pud_flags() and pmd_flags() to cover the PAT bit, _PAGE_PAT_LARGE,
> when they are used to map a large page with _PAGE_PSE set.
>
> Signed-off-by: Toshi Kani
> Cc: Juergen Gross
> Cc: Konrad Wilk
> Cc: Robert Elliott
> Cc: Thomas Gleixner
> Cc: H. Peter Anvin
> Cc: Ingo Molnar
> Cc: Borislav Petkov
> Cc: Andrew Morton
> ---
>  arch/x86/include/asm/pgtable_types.h | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index 13f310b..caaf45c 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -212,9 +212,13 @@ enum page_cache_mode {
>  /* PTE_PFN_MASK extracts the PFN from a (pte|pmd|pud|pgd)val_t */
>  #define PTE_PFN_MASK		((pteval_t)PHYSICAL_PAGE_MASK)
>
> -/* PTE_FLAGS_MASK extracts the flags from a (pte|pmd|pud|pgd)val_t */
> +/* Extracts the flags from a (pte|pmd|pud|pgd)val_t of a 4KB page */
>  #define PTE_FLAGS_MASK		(~PTE_PFN_MASK)
>
> +/* Extracts the flags from a (pmd|pud)val_t of a (1GB|2MB) page */
> +#define PMD_FLAGS_MASK_LARGE	((~PTE_PFN_MASK) | _PAGE_PAT_LARGE)
> +#define PUD_FLAGS_MASK_LARGE	((~PTE_PFN_MASK) | _PAGE_PAT_LARGE)
> +
>  typedef struct pgprot { pgprotval_t pgprot; } pgprot_t;
>
>  typedef struct { pgdval_t pgd; } pgd_t;
> @@ -278,12 +282,18 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
>
>  static inline pudval_t pud_flags(pud_t pud)
>  {
> -	return native_pud_val(pud) & PTE_FLAGS_MASK;
> +	if (native_pud_val(pud) & _PAGE_PSE)
> +		return native_pud_val(pud) & PUD_FLAGS_MASK_LARGE;
> +	else
> +		return native_pud_val(pud) & PTE_FLAGS_MASK;
>  }
>
>  static inline pmdval_t pmd_flags(pmd_t pmd)
>  {
> -	return native_pmd_val(pmd) & PTE_FLAGS_MASK;
> +	if (native_pmd_val(pmd) & _PAGE_PSE)
> +		return native_pmd_val(pmd) & PMD_FLAGS_MASK_LARGE;
> +	else
> +		return native_pmd_val(pmd) & PTE_FLAGS_MASK;
>  }

Hmm, I think this covers only half of the problem. pud_pfn() and
pmd_pfn() will return wrong results for large pages with PAT bit
set as well.

I'd rather use something like:

static inline unsigned long pmd_pfn_mask(pmd_t pmd)
{
	if (pmd_large(pmd))
		return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
	else
		return PTE_PFN_MASK;
}

static inline unsigned long pmd_flags_mask(pmd_t pmd)
{
	if (pmd_large(pmd))
		return ~(PMD_PAGE_MASK & PHYSICAL_PAGE_MASK);
	else
		return ~PTE_PFN_MASK;
}

static inline unsigned long pmd_pfn(pmd_t pmd)
{
	return (pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
}

static inline pmdval_t pmd_flags(pmd_t pmd)
{
	return native_pmd_val(pmd) & pmd_flags_mask(pmd);
}

Juergen

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-oi0-f53.google.com (mail-oi0-f53.google.com [209.85.218.53]) by kanga.kvack.org (Postfix) with ESMTP id 4BDD36B0253 for ; Fri, 10 Jul 2015 17:16:23 -0400 (EDT)
Received: by oiyy130 with SMTP id y130so219763369oiy.0 for ; Fri, 10 Jul 2015 14:16:23 -0700 (PDT)
Received: from g1t5424.austin.hp.com (g1t5424.austin.hp.com. [15.216.225.54]) by mx.google.com with ESMTPS id b5si7676767oej.10.2015.07.10.14.16.22 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 10 Jul 2015 14:16:22 -0700 (PDT)
Message-ID: <1436562922.3214.124.camel@hp.com>
Subject: Re: [PATCH 1/2] x86: Fix pXd_flags() to handle _PAGE_PAT_LARGE
From: Toshi Kani
Date: Fri, 10 Jul 2015 15:15:22 -0600
In-Reply-To: <559F4293.1090801@suse.com>
References: <1436461431-27305-1-git-send-email-toshi.kani@hp.com> <1436461431-27305-2-git-send-email-toshi.kani@hp.com> <559F4293.1090801@suse.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Juergen Gross, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com
Cc: akpm@linux-foundation.org, bp@alien8.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, konrad.wilk@oracle.com, elliott@hp.com

On Fri, 2015-07-10 at 05:57 +0200, Juergen Gross wrote:
> On 07/09/2015 07:03 PM, Toshi Kani wrote:
> > The PAT bit gets relocated to bit 12 when PUD and PMD mappings are
> > used.  This bit 12, however, is not covered by PTE_FLAGS_MASK, which
> > is currently used for masking the flag bits for all cases.
> >
> > Fix pud_flags() and pmd_flags() to cover the PAT bit, _PAGE_PAT_LARGE,
> > when they are used to map a large page with _PAGE_PSE set.
 :
> Hmm, I think this covers only half of the problem. pud_pfn() and
> pmd_pfn() will return wrong results for large pages with PAT bit
> set as well.
>
> I'd rather use something like:
>
> static inline unsigned long pmd_pfn_mask(pmd_t pmd)
> {
> 	if (pmd_large(pmd))
> 		return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> 	else
> 		return PTE_PFN_MASK;
> }
>
> static inline unsigned long pmd_flags_mask(pmd_t pmd)
> {
> 	if (pmd_large(pmd))
> 		return ~(PMD_PAGE_MASK & PHYSICAL_PAGE_MASK);
> 	else
> 		return ~PTE_PFN_MASK;
> }
>
> static inline unsigned long pmd_pfn(pmd_t pmd)
> {
> 	return (pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
> }
>
> static inline pmdval_t pmd_flags(pmd_t pmd)
> {
> 	return native_pmd_val(pmd) & pmd_flags_mask(pmd);
> }

Thanks for the suggestion!  I agree that it is cleaner this way.

I am updating the patches and found the following changes are needed
as well:

 - Define PGTABLE_LEVELS to 2 in
   "arch/x86/entry/vdso/vdso32/vclock_gettime.c".  This file redefines
   to X86_32.  Setting to 2 levels (since X86_PAE is not set) allows
   be included to define PMD_SHIFT.

 - Move PUD_PAGE_SIZE & PUD_PAGE_MASK from to .  This allows X86_32
   to refer to the PUD macros.

 - Nit: pmd_large() cannot be used in pmd_xxx_mask() since it calls
   pmd_flags().  Use (native_pmd_val(pmd) & _PAGE_PSE), instead.

Thanks,
-Toshi
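
The direction agreed on in the last two messages can be sketched in user space. This is an illustration under stated assumptions, not the follow-up patch: entries are plain uint64_t values instead of pmd_t, the 46-bit physical address width is assumed, make_large_pmd() is a hypothetical helper, and _PAGE_PSE is tested on the raw value because, as noted in the reply, pmd_large() would call back into pmd_flags(). pmd_flags_mask() is derived from pmd_pfn_mask() so the two masks cannot drift apart.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 values for illustration only. */
#define PAGE_SHIFT          12
#define PMD_SHIFT           21
#define PHYS_MASK_SHIFT     46 /* assumption: 46-bit physical addresses */
#define PHYSICAL_PAGE_MASK  ((((uint64_t)1 << PHYS_MASK_SHIFT) - 1) & \
                             ~(((uint64_t)1 << PAGE_SHIFT) - 1))
#define PMD_PAGE_MASK       (~(((uint64_t)1 << PMD_SHIFT) - 1))
#define PTE_PFN_MASK        PHYSICAL_PAGE_MASK
#define _PAGE_PSE           ((uint64_t)1 << 7)
#define _PAGE_PAT_LARGE     ((uint64_t)1 << 12)

/* Hypothetical helper: build a 2MB PMD value from an aligned address. */
static uint64_t make_large_pmd(uint64_t phys, uint64_t flags)
{
    assert((phys & ~PMD_PAGE_MASK) == 0); /* must be 2MB aligned */
    return phys | flags | _PAGE_PSE;
}

/* Address bits of the entry: bits 21 and up for a 2MB page,
 * bits 12 and up otherwise. */
static uint64_t pmd_pfn_mask(uint64_t pmdval)
{
    if (pmdval & _PAGE_PSE)
        return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
    return PTE_PFN_MASK;
}

/* Flag bits are everything that is not an address bit. */
static uint64_t pmd_flags_mask(uint64_t pmdval)
{
    return ~pmd_pfn_mask(pmdval);
}

static uint64_t pmd_pfn(uint64_t pmdval)
{
    return (pmdval & pmd_pfn_mask(pmdval)) >> PAGE_SHIFT;
}

static uint64_t pmd_flags(uint64_t pmdval)
{
    return pmdval & pmd_flags_mask(pmdval);
}
```

For a 2MB entry at physical address 0x40000000 with _PAGE_PAT_LARGE set, masking with PTE_PFN_MASK alone would fold bit 12 into the frame number, whereas pmd_pfn() above excludes it and pmd_flags() reports it.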