From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi0-f44.google.com (mail-oi0-f44.google.com [209.85.218.44]) by kanga.kvack.org (Postfix) with ESMTP id 432276B0092 for ; Tue, 10 Mar 2015 16:24:06 -0400 (EDT) Received: by oifu20 with SMTP id u20so3882581oif.11 for ; Tue, 10 Mar 2015 13:24:06 -0700 (PDT) Received: from g9t5009.houston.hp.com (g9t5009.houston.hp.com. [15.240.92.67]) by mx.google.com with ESMTPS id y3si917768oer.66.2015.03.10.13.24.05 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Mar 2015 13:24:05 -0700 (PDT) From: Toshi Kani Subject: [PATCH 0/3] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Date: Tue, 10 Mar 2015 14:23:14 -0600 Message-Id: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl This patchset enhances MTRR checks for the kernel huge I/O mapping, which was enabled by the patchset below: https://lkml.org/lkml/2015/3/3/589 The following functional changes are made in patch 3/3. - Allow pud_set_huge() and pmd_set_huge() to create a huge page mapping to a range covered by a single MTRR entry of any memory type. - Log a pr_warn() message when a requested PMD map range spans more than a single MTRR entry. Drivers should make a mapping request aligned to a single MTRR entry when the range is covered by MTRRs. Patch 1/3 addresses other review comments to the mapping funcs for better code read-ability. Patch 2/3 fixes a bug in mtrr_type_lookup(). The patchset is based on the -mm tree. --- Toshi Kani (3): 1/3 mm, x86: Document return values of mapping funcs 2/3 mtrr, x86: Fix MTRR lookup to handle inclusive entry 3/3 mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping --- arch/x86/Kconfig | 2 +- arch/x86/include/asm/mtrr.h | 5 ++-- arch/x86/kernel/cpu/mtrr/generic.c | 52 +++++++++++++++++++++++++------------ arch/x86/mm/pat.c | 4 +-- arch/x86/mm/pgtable.c | 53 ++++++++++++++++++++++++++++---------- 5 files changed, 81 insertions(+), 35 deletions(-) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ig0-f173.google.com (mail-ig0-f173.google.com [209.85.213.173]) by kanga.kvack.org (Postfix) with ESMTP id 2816D6B0092 for ; Tue, 10 Mar 2015 16:24:10 -0400 (EDT) Received: by igbhn18 with SMTP id hn18so7268590igb.2 for ; Tue, 10 Mar 2015 13:24:10 -0700 (PDT) Received: from g1t5424.austin.hp.com (g1t5424.austin.hp.com. [15.216.225.54]) by mx.google.com with ESMTPS id ih18si3507074igb.55.2015.03.10.13.24.09 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Mar 2015 13:24:09 -0700 (PDT) From: Toshi Kani Subject: [PATCH 1/3] mm, x86: Document return values of mapping funcs Date: Tue, 10 Mar 2015 14:23:15 -0600 Message-Id: <1426018997-12936-2-git-send-email-toshi.kani@hp.com> In-Reply-To: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, Toshi Kani Documented the return values of KVA mapping functions, pud_set_huge(), pmd_set_huge, pud_clear_huge() and pmd_clear_huge(). Simplified the conditions to select HAVE_ARCH_HUGE_VMAP in Kconfig since X86_PAE depends on X86_32. There is no functinal change in this patch. Signed-off-by: Toshi Kani --- arch/x86/Kconfig | 2 +- arch/x86/mm/pgtable.c | 36 ++++++++++++++++++++++++++++-------- 2 files changed, 29 insertions(+), 9 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 110f6ae..ba5e78e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -99,7 +99,7 @@ config X86 select IRQ_FORCED_THREADING select HAVE_BPF_JIT if X86_64 select HAVE_ARCH_TRANSPARENT_HUGEPAGE - select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE) + select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE select ARCH_HAS_SG_CHAIN select CLKEVT_I8253 select ARCH_HAVE_NMI_SAFE_CMPXCHG diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 0b97d2c..a0f7eeb 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -563,14 +563,19 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, } #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP +/** + * pud_set_huge - setup kernel PUD mapping + * + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, + * it does not set up a huge page when the range is covered by non-WB type + * of MTRRs. 0xFF indicates that MTRRs are disabled. + * + * Return 1 on success, and 0 on no-operation. + */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { u8 mtrr; - /* - * Do not use a huge page when the range is covered by non-WB type - * of MTRRs. - */ mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) return 0; @@ -584,14 +589,19 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pmd_set_huge - setup kernel PMD mapping + * + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, + * it does not set up a huge page when the range is covered by non-WB type + * of MTRRs. 0xFF indicates that MTRRs are disabled. + * + * Return 1 on success, and 0 on no-operation. + */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { u8 mtrr; - /* - * Do not use a huge page when the range is covered by non-WB type - * of MTRRs. - */ mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) return 0; @@ -605,6 +615,11 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pud_clear_huge - clear kernel PUD mapping when it is set + * + * Return 1 on success, and 0 on no-operation. + */ int pud_clear_huge(pud_t *pud) { if (pud_large(*pud)) { @@ -615,6 +630,11 @@ int pud_clear_huge(pud_t *pud) return 0; } +/** + * pmd_clear_huge - clear kernel PMD mapping when it is set + * + * Return 1 on success, and 0 on no-operation. + */ int pmd_clear_huge(pmd_t *pmd) { if (pmd_large(*pmd)) { -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ie0-f175.google.com (mail-ie0-f175.google.com [209.85.223.175]) by kanga.kvack.org (Postfix) with ESMTP id 101E26B0092 for ; Tue, 10 Mar 2015 16:24:11 -0400 (EDT) Received: by iecsf10 with SMTP id sf10so29037531iec.2 for ; Tue, 10 Mar 2015 13:24:10 -0700 (PDT) Received: from g1t5424.austin.hp.com (g1t5424.austin.hp.com. [15.216.225.54]) by mx.google.com with ESMTPS id ey4si1377493icb.9.2015.03.10.13.24.10 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Mar 2015 13:24:10 -0700 (PDT) From: Toshi Kani Subject: [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry Date: Tue, 10 Mar 2015 14:23:16 -0600 Message-Id: <1426018997-12936-3-git-send-email-toshi.kani@hp.com> In-Reply-To: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, Toshi Kani When an MTRR entry is inclusive to a requested range, i.e. the start and end of the request are not within the MTRR entry range but the range contains the MTRR entry entirely, __mtrr_type_lookup() ignores such case because both start_state and end_state are set to zero. This patch fixes the issue by adding a new flag, inclusive, to detect the case. This case is then handled in the same way as (!start_state && end_state). Also updated the comment in __mtrr_type_lookup() to clarify that the repeat handling is necessary to handle overlaps with the default type, since overlaps with multiple entries alone can be handled without such repeat. Signed-off-by: Toshi Kani --- arch/x86/kernel/cpu/mtrr/generic.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index 7d74f7b..cdb955f 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -154,7 +154,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) prev_match = 0xFF; for (i = 0; i < num_var_ranges; ++i) { - unsigned short start_state, end_state; + unsigned short start_state, end_state, inclusive; if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11))) continue; @@ -166,20 +166,22 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) start_state = ((start & mask) == (base & mask)); end_state = ((end & mask) == (base & mask)); + inclusive = ((start < base) && (end > base)); - if (start_state != end_state) { + if ((start_state != end_state) || inclusive) { /* * We have start:end spanning across an MTRR. - * We split the region into - * either - * (start:mtrr_end) (mtrr_end:end) - * or - * (start:mtrr_start) (mtrr_start:end) + * We split the region into either + * - start_state:1 + * (start:mtrr_end) (mtrr_end:end) + * - end_state:1 or inclusive:1 + * (start:mtrr_start) (mtrr_start:end) * depending on kind of overlap. * Return the type for first region and a pointer to * the start of second region so that caller will * lookup again on the second region. - * Note: This way we handle multiple overlaps as well. + * Note: This way we handle overlaps with multiple + * entries and the default type properly. */ if (start_state) *partial_end = base + get_mtrr_size(mask); @@ -195,7 +197,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) *repeat = 1; } - if ((start & mask) != (base & mask)) + if (!start_state) continue; curr_match = mtrr_state.var_ranges[i].base_lo & 0xff; -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ig0-f175.google.com (mail-ig0-f175.google.com [209.85.213.175]) by kanga.kvack.org (Postfix) with ESMTP id 4A6B990002C for ; Tue, 10 Mar 2015 16:24:13 -0400 (EDT) Received: by igal13 with SMTP id l13so32468805iga.1 for ; Tue, 10 Mar 2015 13:24:13 -0700 (PDT) Received: from g1t5424.austin.hp.com (g1t5424.austin.hp.com. [15.216.225.54]) by mx.google.com with ESMTPS id sb9si3534737igb.33.2015.03.10.13.24.12 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Mar 2015 13:24:12 -0700 (PDT) From: Toshi Kani Subject: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Tue, 10 Mar 2015 14:23:17 -0600 Message-Id: <1426018997-12936-4-git-send-email-toshi.kani@hp.com> In-Reply-To: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, Toshi Kani This patch adds an additional argument, *uniform, to mtrr_type_lookup(), which returns 1 when a given range is either fully covered by a single MTRR entry or not covered at all. pud_set_huge() and pmd_set_huge() are changed to check the new uniform flag to see if it is safe to create a huge page mapping to the range. This allows them to create a huge page mapping to a range covered by a single MTRR entry of any memory type. It also detects an unoptimal request properly. They continue to check with the WB type since the WB type has no effect even if a request spans to multiple MTRR entries. pmd_set_huge() logs a warning message to an unoptimal request so that driver writers will be aware of such case. Drivers should make a mapping request aligned to a single MTRR entry when the range is covered by MTRRs. Signed-off-by: Toshi Kani --- arch/x86/include/asm/mtrr.h | 5 +++-- arch/x86/kernel/cpu/mtrr/generic.c | 32 +++++++++++++++++++++++++------- arch/x86/mm/pat.c | 4 ++-- arch/x86/mm/pgtable.c | 25 +++++++++++++++---------- 4 files changed, 45 insertions(+), 21 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index f768f62..5b4d467 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -31,7 +31,7 @@ * arch_phys_wc_add and arch_phys_wc_del. */ # ifdef CONFIG_MTRR -extern u8 mtrr_type_lookup(u64 addr, u64 end); +extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform); extern void mtrr_save_fixed_ranges(void *); extern void mtrr_save_state(void); extern int mtrr_add(unsigned long base, unsigned long size, @@ -50,11 +50,12 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn); extern int amd_special_default_mtrr(void); extern int phys_wc_to_mtrr_index(int handle); # else -static inline u8 mtrr_type_lookup(u64 addr, u64 end) +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) { /* * Return no-MTRRs: */ + *uniform = 1; return 0xff; } #define mtrr_save_fixed_ranges(arg) do {} while (0) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index cdb955f..aef238c 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr) * *repeat == 1 implies [start:end] spanned across MTRR range and type returned * corresponds only to [start:*partial_end]. * Caller has to lookup again for [*partial_end:end]. + * *uniform == 1 The requested range is either fully covered by a single MTRR + * entry or not covered at all. */ -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) +static u8 __mtrr_type_lookup(u64 start, u64 end, + u64 *partial_end, int *repeat, u8 *uniform) { int i; u64 base, mask; u8 prev_match, curr_match; *repeat = 0; + *uniform = 1; + if (!mtrr_state_set) return 0xFF; @@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) /* Look in fixed ranges. Just return the type as per start */ if (mtrr_state.have_fixed && (start < 0x100000)) { int idx; + *uniform = 0; if (start < 0x80000) { idx = 0; @@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) end = *partial_end - 1; /* end is inclusive */ *repeat = 1; + *uniform = 0; } if (!start_state) @@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) continue; } + *uniform = 0; if (check_type_overlap(&prev_match, &curr_match)) return curr_match; } @@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) } /* - * Returns the effective MTRR type for the region + * Returns the effective MTRR type for the region. *uniform is set to 1 + * when a given range is either fully covered by a single MTRR entry or + * not covered at all. + * * Error return: * 0xFF - when MTRR is not enabled */ -u8 mtrr_type_lookup(u64 start, u64 end) +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) { - u8 type, prev_type; + u8 type, prev_type, is_uniform, dummy; int repeat; u64 partial_end; - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); + type = __mtrr_type_lookup(start, end, + &partial_end, &repeat, &is_uniform); /* * Common path is with repeat = 0. @@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) while (repeat) { prev_type = type; start = partial_end; - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); + is_uniform = 0; - if (check_type_overlap(&prev_type, &type)) + type = __mtrr_type_lookup(start, end, + &partial_end, &repeat, &dummy); + + if (check_type_overlap(&prev_type, &type)) { + *uniform = 0; return type; + } } + *uniform = is_uniform; return type; } diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 35af677..372ad42 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end, * request is for WB. */ if (req_type == _PAGE_CACHE_MODE_WB) { - u8 mtrr_type; + u8 mtrr_type, uniform; - mtrr_type = mtrr_type_lookup(start, end); + mtrr_type = mtrr_type_lookup(start, end, &uniform); if (mtrr_type != MTRR_TYPE_WRBACK) return _PAGE_CACHE_MODE_UC_MINUS; diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index a0f7eeb..25843a9 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -567,17 +567,18 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, * pud_set_huge - setup kernel PUD mapping * * MTRRs can override PAT memory types with a 4KB granularity. Therefore, - * it does not set up a huge page when the range is covered by non-WB type - * of MTRRs. 0xFF indicates that MTRRs are disabled. + * it only sets up a huge page when the range is mapped uniformly (i.e. + * either fully covered by a single MTRR entry or not covered at all) or + * the MTRR type is WB. * * Return 1 on success, and 0 on no-operation. */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform); + if ((!uniform) && (mtrr != MTRR_TYPE_WRBACK)) return 0; prot = pgprot_4k_2_large(prot); @@ -593,18 +594,22 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) * pmd_set_huge - setup kernel PMD mapping * * MTRRs can override PAT memory types with a 4KB granularity. Therefore, - * it does not set up a huge page when the range is covered by non-WB type - * of MTRRs. 0xFF indicates that MTRRs are disabled. + * it only sets up a huge page when the range is mapped uniformly (i.e. + * either fully covered by a single MTRR entry or not covered at all) or + * the MTRR type is WB. * * Return 1 on success, and 0 on no-operation. */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform); + if ((!uniform) && (mtrr != MTRR_TYPE_WRBACK)) { + pr_warn("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n", + addr, addr + PMD_SIZE); return 0; + } prot = pgprot_4k_2_large(prot); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f180.google.com (mail-wi0-f180.google.com [209.85.212.180]) by kanga.kvack.org (Postfix) with ESMTP id AAF8590002E for ; Wed, 11 Mar 2015 02:30:30 -0400 (EDT) Received: by wiwh11 with SMTP id h11so8815198wiw.5 for ; Tue, 10 Mar 2015 23:30:30 -0700 (PDT) Received: from mail-we0-x22a.google.com (mail-we0-x22a.google.com. [2a00:1450:400c:c03::22a]) by mx.google.com with ESMTPS id bp16si7560292wib.122.2015.03.10.23.30.28 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Mar 2015 23:30:29 -0700 (PDT) Received: by wevk48 with SMTP id k48so6804384wev.7 for ; Tue, 10 Mar 2015 23:30:28 -0700 (PDT) Date: Wed, 11 Mar 2015 07:30:25 +0100 From: Ingo Molnar Subject: Re: [PATCH 1/3] mm, x86: Document return values of mapping funcs Message-ID: <20150311063024.GB29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-2-git-send-email-toshi.kani@hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1426018997-12936-2-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl * Toshi Kani wrote: > Documented the return values of KVA mapping functions, > pud_set_huge(), pmd_set_huge, pud_clear_huge() and > pmd_clear_huge(). > > Simplified the conditions to select HAVE_ARCH_HUGE_VMAP > in Kconfig since X86_PAE depends on X86_32. Changelogs are not a diary, they are a story, generally written in the present tense. So it should be something like: Document the return values of KVA mapping functions, pud_set_huge(), pmd_set_huge, pud_clear_huge() and pmd_clear_huge(). Simplify the conditions to select HAVE_ARCH_HUGE_VMAP in the Kconfig, since X86_PAE depends on X86_32. (also note the slight fixes I made to the text.) > There is no functinal change in this patch. Typo. > +/** > + * pud_set_huge - setup kernel PUD mapping > + * > + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, s/with a/with > + * it does not set up a huge page when the range is covered by non-WB type > + * of MTRRs. 0xFF indicates that MTRRs are disabled. > + * > + * Return 1 on success, and 0 on no-operation. What is a 'no-operation'? I suspect you want: * Returns 1 on success, and 0 when no PUD was set. > +/** > + * pmd_set_huge - setup kernel PMD mapping > + * > + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, > + * it does not set up a huge page when the range is covered by non-WB type > + * of MTRRs. 0xFF indicates that MTRRs are disabled. > + * > + * Return 1 on success, and 0 on no-operation. Ditto (and the rest of the patch). Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f41.google.com (mail-wg0-f41.google.com [74.125.82.41]) by kanga.kvack.org (Postfix) with ESMTP id DF30090002E for ; Wed, 11 Mar 2015 02:32:11 -0400 (EDT) Received: by wghn12 with SMTP id n12so6854181wgh.6 for ; Tue, 10 Mar 2015 23:32:11 -0700 (PDT) Received: from mail-wi0-x231.google.com (mail-wi0-x231.google.com. [2a00:1450:400c:c05::231]) by mx.google.com with ESMTPS id c8si4149930wjw.102.2015.03.10.23.32.10 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 10 Mar 2015 23:32:10 -0700 (PDT) Received: by wiwh11 with SMTP id h11so8823782wiw.5 for ; Tue, 10 Mar 2015 23:32:10 -0700 (PDT) Date: Wed, 11 Mar 2015 07:32:05 +0100 From: Ingo Molnar Subject: Re: [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry Message-ID: <20150311063205.GC29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-3-git-send-email-toshi.kani@hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1426018997-12936-3-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl * Toshi Kani wrote: > When an MTRR entry is inclusive to a requested range, i.e. > the start and end of the request are not within the MTRR > entry range but the range contains the MTRR entry entirely, > __mtrr_type_lookup() ignores such case because both > start_state and end_state are set to zero. 'ignores such a case' or 'ignores such cases'. > This patch fixes the issue by adding a new flag, inclusive, s/inclusive/'inclusive' Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f177.google.com (mail-wi0-f177.google.com [209.85.212.177]) by kanga.kvack.org (Postfix) with ESMTP id 8C39490002E for ; Wed, 11 Mar 2015 03:02:22 -0400 (EDT) Received: by wiwl15 with SMTP id l15so9073148wiw.0 for ; Wed, 11 Mar 2015 00:02:22 -0700 (PDT) Received: from mail-wg0-x22d.google.com (mail-wg0-x22d.google.com. [2a00:1450:400c:c00::22d]) by mx.google.com with ESMTPS id v7si3126465wiz.62.2015.03.11.00.02.20 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 11 Mar 2015 00:02:20 -0700 (PDT) Received: by wggy19 with SMTP id y19so7006753wgg.2 for ; Wed, 11 Mar 2015 00:02:20 -0700 (PDT) Date: Wed, 11 Mar 2015 08:02:16 +0100 From: Ingo Molnar Subject: Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Message-ID: <20150311070216.GD29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-4-git-send-email-toshi.kani@hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1426018997-12936-4-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl * Toshi Kani wrote: > This patch adds an additional argument, *uniform, to s/*uniform/'uniform' > mtrr_type_lookup(), which returns 1 when a given range is > either fully covered by a single MTRR entry or not covered > at all. s/or not covered/or is not covered > pud_set_huge() and pmd_set_huge() are changed to check the > new uniform flag to see if it is safe to create a huge page s/uniform/'uniform' > mapping to the range. This allows them to create a huge page > mapping to a range covered by a single MTRR entry of any > memory type. It also detects an unoptimal request properly. s/unoptimal/non-optimal or nonoptimal Also, some description in the changelog about what a 'non-optimal' request is would be most userful. > They continue to check with the WB type since the WB type has > no effect even if a request spans to multiple MTRR entries. s/spans to/spans > -static inline u8 mtrr_type_lookup(u64 addr, u64 end) > +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) > { > /* > * Return no-MTRRs: > */ > + *uniform = 1; > return 0xff; > } > #define mtrr_save_fixed_ranges(arg) do {} while (0) > diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c > index cdb955f..aef238c 100644 > --- a/arch/x86/kernel/cpu/mtrr/generic.c > +++ b/arch/x86/kernel/cpu/mtrr/generic.c > @@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr) > * *repeat == 1 implies [start:end] spanned across MTRR range and type returned > * corresponds only to [start:*partial_end]. > * Caller has to lookup again for [*partial_end:end]. > + * *uniform == 1 The requested range is either fully covered by a single MTRR > + * entry or not covered at all. > */ So I think a better approach would be to count the number of separate MTRR caching types a range is covered by, instead of this hard to quality 'uniform' flag. I.e. a 'nr_mtrr_types' count. If for example a range partially intersects with an MTRR, then that count would be 2: the MTRR, and the outside (default cache policy) type. ( Note that with this approach is not only easy to understand and easy to review, but could also be refined in the future, to count the number of _incompatible_ caching types present within a range. ) > -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > +static u8 __mtrr_type_lookup(u64 start, u64 end, > + u64 *partial_end, int *repeat, u8 *uniform) > { > int i; > u64 base, mask; > u8 prev_match, curr_match; > > *repeat = 0; > + *uniform = 1; > + > if (!mtrr_state_set) > return 0xFF; > > @@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > /* Look in fixed ranges. Just return the type as per start */ > if (mtrr_state.have_fixed && (start < 0x100000)) { > int idx; > + *uniform = 0; So this function scares me, because the code is clearly crap: if (mtrr_state.have_fixed && (start < 0x100000)) { ... } else if (start < 0x1000000) { ... How can that 'else if' branch ever not be true? Did it perhaps want to be the other way around: if (mtrr_state.have_fixed && (start < 0x1000000)) { ... } else if (start < 0x100000) { ... or did it simply mess up the condition? > > if (start < 0x80000) { > idx = 0; > @@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > end = *partial_end - 1; /* end is inclusive */ > *repeat = 1; > + *uniform = 0; > } > > if (!start_state) > @@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > continue; > } > > + *uniform = 0; > if (check_type_overlap(&prev_match, &curr_match)) > return curr_match; > } > @@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > } > > /* > - * Returns the effective MTRR type for the region > + * Returns the effective MTRR type for the region. *uniform is set to 1 > + * when a given range is either fully covered by a single MTRR entry or > + * not covered at all. > + * > * Error return: > * 0xFF - when MTRR is not enabled > */ > -u8 mtrr_type_lookup(u64 start, u64 end) > +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) > { > - u8 type, prev_type; > + u8 type, prev_type, is_uniform, dummy; > int repeat; > u64 partial_end; > > - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); > + type = __mtrr_type_lookup(start, end, > + &partial_end, &repeat, &is_uniform); > > /* > * Common path is with repeat = 0. > @@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) > while (repeat) { > prev_type = type; > start = partial_end; > - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); > + is_uniform = 0; > > - if (check_type_overlap(&prev_type, &type)) > + type = __mtrr_type_lookup(start, end, > + &partial_end, &repeat, &dummy); > + > + if (check_type_overlap(&prev_type, &type)) { > + *uniform = 0; > return type; > + } > } > > + *uniform = is_uniform; > return type; So the MTRR code is from hell, it would be nice to first clean up the whole code and the MTRR data structures before extending it with more complexity ... Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f179.google.com (mail-ob0-f179.google.com [209.85.214.179]) by kanga.kvack.org (Postfix) with ESMTP id A91B3900049 for ; Wed, 11 Mar 2015 11:26:15 -0400 (EDT) Received: by obcwp4 with SMTP id wp4so9577937obc.4 for ; Wed, 11 Mar 2015 08:26:15 -0700 (PDT) Received: from g4t3426.houston.hp.com (g4t3426.houston.hp.com. [15.201.208.54]) by mx.google.com with ESMTPS id p10si2275930oeq.29.2015.03.11.08.26.14 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 11 Mar 2015 08:26:15 -0700 (PDT) Message-ID: <1426087526.17007.315.camel@misato.fc.hp.com> Subject: Re: [PATCH 1/3] mm, x86: Document return values of mapping funcs From: Toshi Kani Date: Wed, 11 Mar 2015 09:25:26 -0600 In-Reply-To: <20150311063024.GB29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-2-git-send-email-toshi.kani@hp.com> <20150311063024.GB29788@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: "akpm@linux-foundation.org" , "hpa@zytor.com" , "tglx@linutronix.de" , "mingo@redhat.com" , "arnd@arndb.de" , "linux-mm@kvack.org" , "x86@kernel.org" , "linux-kernel@vger.kernel.org" , "dave.hansen@intel.com" , "Elliott, Robert (Server Storage)" , "pebolle@tiscali.nl" On Wed, 2015-03-11 at 06:30 +0000, Ingo Molnar wrote: > * Toshi Kani wrote: > > > Documented the return values of KVA mapping functions, > > pud_set_huge(), pmd_set_huge, pud_clear_huge() and > > pmd_clear_huge(). > > > > Simplified the conditions to select HAVE_ARCH_HUGE_VMAP > > in Kconfig since X86_PAE depends on X86_32. > > Changelogs are not a diary, they are a story, generally written in the > present tense. Oh, I see. Thanks for the tip! > So it should be something like: > > Document the return values of KVA mapping functions, > pud_set_huge(), pmd_set_huge, pud_clear_huge() and > pmd_clear_huge(). > > Simplify the conditions to select HAVE_ARCH_HUGE_VMAP > in the Kconfig, since X86_PAE depends on X86_32. > > (also note the slight fixes I made to the text.) Updated with the descriptions above. > > There is no functinal change in this patch. > > Typo. Fixed. > > +/** > > + * pud_set_huge - setup kernel PUD mapping > > + * > > + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, > > s/with a/with Fixed. > > + * it does not set up a huge page when the range is covered by non-WB type > > + * of MTRRs. 0xFF indicates that MTRRs are disabled. > > + * > > + * Return 1 on success, and 0 on no-operation. > > What is a 'no-operation'? > > I suspect you want: > > * Returns 1 on success, and 0 when no PUD was set. Yes, that's what it meant to say. > > +/** > > + * pmd_set_huge - setup kernel PMD mapping > > + * > > + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, > > + * it does not set up a huge page when the range is covered by non-WB type > > + * of MTRRs. 0xFF indicates that MTRRs are disabled. > > + * > > + * Return 1 on success, and 0 on no-operation. > > Ditto (and the rest of the patch). Updated all functions. I changed pud_clear_huge()'s description to: * Return 1 on success, and 0 when no PUD map was found. Thanks! -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f181.google.com (mail-ob0-f181.google.com [209.85.214.181]) by kanga.kvack.org (Postfix) with ESMTP id A9FD7900049 for ; Wed, 11 Mar 2015 11:27:52 -0400 (EDT) Received: by obcwp4 with SMTP id wp4so9589456obc.4 for ; Wed, 11 Mar 2015 08:27:52 -0700 (PDT) Received: from g4t3426.houston.hp.com (g4t3426.houston.hp.com. [15.201.208.54]) by mx.google.com with ESMTPS id ix8si2263411obc.59.2015.03.11.08.27.52 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 11 Mar 2015 08:27:52 -0700 (PDT) Message-ID: <1426087626.17007.317.camel@misato.fc.hp.com> Subject: Re: [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry From: Toshi Kani Date: Wed, 11 Mar 2015 09:27:06 -0600 In-Reply-To: <20150311063205.GC29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-3-git-send-email-toshi.kani@hp.com> <20150311063205.GC29788@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl On Wed, 2015-03-11 at 07:32 +0100, Ingo Molnar wrote: > * Toshi Kani wrote: > > > When an MTRR entry is inclusive to a requested range, i.e. > > the start and end of the request are not within the MTRR > > entry range but the range contains the MTRR entry entirely, > > __mtrr_type_lookup() ignores such case because both > > start_state and end_state are set to zero. > > 'ignores such a case' or 'ignores such cases'. Changed to 'ignores such a case'. > > This patch fixes the issue by adding a new flag, inclusive, > > s/inclusive/'inclusive' Updated. Thanks! -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com [209.85.214.182]) by kanga.kvack.org (Postfix) with ESMTP id 1016F900049 for ; Wed, 11 Mar 2015 12:52:56 -0400 (EDT) Received: by obcva8 with SMTP id va8so10177798obc.8 for ; Wed, 11 Mar 2015 09:52:55 -0700 (PDT) Received: from g4t3427.houston.hp.com (g4t3427.houston.hp.com. [15.201.208.55]) by mx.google.com with ESMTPS id e81si2414654oif.34.2015.03.11.09.52.54 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 11 Mar 2015 09:52:55 -0700 (PDT) Message-ID: <1426092728.17007.380.camel@misato.fc.hp.com> Subject: Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani Date: Wed, 11 Mar 2015 10:52:08 -0600 In-Reply-To: <20150311070216.GD29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-4-git-send-email-toshi.kani@hp.com> <20150311070216.GD29788@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl On Wed, 2015-03-11 at 08:02 +0100, Ingo Molnar wrote: > * Toshi Kani wrote: > > > This patch adds an additional argument, *uniform, to > > s/*uniform/'uniform' Done. > > mtrr_type_lookup(), which returns 1 when a given range is > > either fully covered by a single MTRR entry or not covered > > at all. > > s/or not covered/or is not covered Done. > > pud_set_huge() and pmd_set_huge() are changed to check the > > new uniform flag to see if it is safe to create a huge page > > s/uniform/'uniform' Done. > > mapping to the range. This allows them to create a huge page > > mapping to a range covered by a single MTRR entry of any > > memory type. It also detects an unoptimal request properly. > > s/unoptimal/non-optimal Done. > or nonoptimal > > Also, some description in the changelog about what a 'non-optimal' > request is would be most userful. > > > They continue to check with the WB type since the WB type has > > no effect even if a request spans to multiple MTRR entries. > > s/spans to/spans Done. > > -static inline u8 mtrr_type_lookup(u64 addr, u64 end) > > +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) > > { > > /* > > * Return no-MTRRs: > > */ > > + *uniform = 1; > > return 0xff; > > } > > #define mtrr_save_fixed_ranges(arg) do {} while (0) > > diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c > > index cdb955f..aef238c 100644 > > --- a/arch/x86/kernel/cpu/mtrr/generic.c > > +++ b/arch/x86/kernel/cpu/mtrr/generic.c > > @@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr) > > * *repeat == 1 implies [start:end] spanned across MTRR range and type returned > > * corresponds only to [start:*partial_end]. > > * Caller has to lookup again for [*partial_end:end]. > > + * *uniform == 1 The requested range is either fully covered by a single MTRR > > + * entry or not covered at all. > > */ > > So I think a better approach would be to count the number of separate > MTRR caching types a range is covered by, instead of this hard to > quality 'uniform' flag. > > I.e. a 'nr_mtrr_types' count. > > If for example a range partially intersects with an MTRR, then that > count would be 2: the MTRR, and the outside (default cache policy) > type. > > ( Note that with this approach is not only easy to understand and easy > to review, but could also be refined in the future, to count the > number of _incompatible_ caching types present within a range. ) I agree that using a count is more flexible. However, there are some issues below. - MTRRs have both fixed and variable ranges. The first 1MB is covered with 11 fixed-range registers with different sizes of granularity, 512KB, 128KB, and 32KB. __mtrr_type_lookup() checks the memory type of the range at 'start', but does not check if a requested range spans multiple memory types. This first 1MB can be handled as 'uniform = 0' since processors do not create a huge page map in this 1MB range. However, setting a correct value to 'nr_mtrr_types' requires a major overhaul in this code. - mtrr_type_lookup() returns without walking through all MTRR entries when check_type_overlap() returns 1, i.e. the overlap made the resulted memory type UC. In this case, the code cannot set a correct value to 'nr_mtrr_type'. Since MTRRs are legacy, esp. the fixed range, there is not much benefit from enhancing the functionality of mtrr_type_lookup() unless there is an issue with the current platforms. For this patch, we only need to know whether the mapping count is 1 or >1. So, I think using 'uniform' makes sense for simplicity. > > -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > +static u8 __mtrr_type_lookup(u64 start, u64 end, > > + u64 *partial_end, int *repeat, u8 *uniform) > > { > > int i; > > u64 base, mask; > > u8 prev_match, curr_match; > > > > *repeat = 0; > > + *uniform = 1; > > + > > if (!mtrr_state_set) > > return 0xFF; > > > > @@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > /* Look in fixed ranges. Just return the type as per start */ > > if (mtrr_state.have_fixed && (start < 0x100000)) { > > int idx; > > + *uniform = 0; > > So this function scares me, because the code is clearly crap: > > if (mtrr_state.have_fixed && (start < 0x100000)) { > ... > } else if (start < 0x1000000) { > ... > > How can that 'else if' branch ever not be true? This 'else if' is always true. So, it can be simply 'else' without any condition. > Did it perhaps want to be the other way around: > > if (mtrr_state.have_fixed && (start < 0x1000000)) { > ... > } else if (start < 0x100000) { > ... > > or did it simply mess up the condition? I think it was just paranoid to test the same condition twice... > > > > if (start < 0x80000) { > > idx = 0; > > @@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > > > end = *partial_end - 1; /* end is inclusive */ > > *repeat = 1; > > + *uniform = 0; > > } > > > > if (!start_state) > > @@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > continue; > > } > > > > + *uniform = 0; > > if (check_type_overlap(&prev_match, &curr_match)) > > return curr_match; > > } > > > > @@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > } > > > > /* > > - * Returns the effective MTRR type for the region > > + * Returns the effective MTRR type for the region. *uniform is set to 1 > > + * when a given range is either fully covered by a single MTRR entry or > > + * not covered at all. > > + * > > * Error return: > > * 0xFF - when MTRR is not enabled > > */ > > -u8 mtrr_type_lookup(u64 start, u64 end) > > +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) > > { > > - u8 type, prev_type; > > + u8 type, prev_type, is_uniform, dummy; > > int repeat; > > u64 partial_end; > > > > - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); > > + type = __mtrr_type_lookup(start, end, > > + &partial_end, &repeat, &is_uniform); > > > > /* > > * Common path is with repeat = 0. > > @@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) > > while (repeat) { > > prev_type = type; > > start = partial_end; > > - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); > > + is_uniform = 0; > > > > - if (check_type_overlap(&prev_type, &type)) > > + type = __mtrr_type_lookup(start, end, > > + &partial_end, &repeat, &dummy); > > + > > + if (check_type_overlap(&prev_type, &type)) { > > + *uniform = 0; > > return type; > > + } > > } > > > > + *uniform = is_uniform; > > return type; > > So the MTRR code is from hell, it would be nice to first clean up the > whole code and the MTRR data structures before extending it with more > complexity ... Good idea. I will clean up the code (no functional change) before making this change. Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-we0-f177.google.com (mail-we0-f177.google.com [74.125.82.177]) by kanga.kvack.org (Postfix) with ESMTP id 69A5F82905 for ; Thu, 12 Mar 2015 07:03:39 -0400 (EDT) Received: by wevl61 with SMTP id l61so15450074wev.10 for ; Thu, 12 Mar 2015 04:03:38 -0700 (PDT) Received: from mail-wi0-x234.google.com (mail-wi0-x234.google.com. [2a00:1450:400c:c05::234]) by mx.google.com with ESMTPS id d7si10918934wiz.9.2015.03.12.04.03.37 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 12 Mar 2015 04:03:37 -0700 (PDT) Received: by wibbs8 with SMTP id bs8so46672969wib.0 for ; Thu, 12 Mar 2015 04:03:37 -0700 (PDT) Date: Thu, 12 Mar 2015 12:03:33 +0100 From: Ingo Molnar Subject: Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Message-ID: <20150312110333.GA6898@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-4-git-send-email-toshi.kani@hp.com> <20150311070216.GD29788@gmail.com> <1426092728.17007.380.camel@misato.fc.hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1426092728.17007.380.camel@misato.fc.hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl * Toshi Kani wrote: > > Did it perhaps want to be the other way around: > > > > if (mtrr_state.have_fixed && (start < 0x1000000)) { > > ... > > } else if (start < 0x100000) { > > ... > > > > or did it simply mess up the condition? > > I think it was just paranoid to test the same condition twice... Read the code again, it's _not_ the same condition ... Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi0-f46.google.com (mail-oi0-f46.google.com [209.85.218.46]) by kanga.kvack.org (Postfix) with ESMTP id 04FB182905 for ; Thu, 12 Mar 2015 09:59:08 -0400 (EDT) Received: by oiav63 with SMTP id v63so10512211oia.7 for ; Thu, 12 Mar 2015 06:59:07 -0700 (PDT) Received: from g4t3427.houston.hp.com (g4t3427.houston.hp.com. [15.201.208.55]) by mx.google.com with ESMTPS id 184si3838077oik.112.2015.03.12.06.59.07 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 12 Mar 2015 06:59:07 -0700 (PDT) Message-ID: <1426168698.17007.385.camel@misato.fc.hp.com> Subject: Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani Date: Thu, 12 Mar 2015 07:58:18 -0600 In-Reply-To: <20150312110333.GA6898@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-4-git-send-email-toshi.kani@hp.com> <20150311070216.GD29788@gmail.com> <1426092728.17007.380.camel@misato.fc.hp.com> <20150312110333.GA6898@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: "akpm@linux-foundation.org" , "hpa@zytor.com" , "tglx@linutronix.de" , "mingo@redhat.com" , "arnd@arndb.de" , "linux-mm@kvack.org" , "x86@kernel.org" , "linux-kernel@vger.kernel.org" , "dave.hansen@intel.com" , "Elliott, Robert (Server Storage)" , "pebolle@tiscali.nl" On Thu, 2015-03-12 at 11:03 +0000, Ingo Molnar wrote: > * Toshi Kani wrote: > > > > Did it perhaps want to be the other way around: > > > > > > if (mtrr_state.have_fixed && (start < 0x1000000)) { > > > ... > > > } else if (start < 0x100000) { > > > ... > > > > > > or did it simply mess up the condition? > > > > I think it was just paranoid to test the same condition twice... > > Read the code again, it's _not_ the same condition ... Oh, I see... It must be a typo. The fixed range is 0x0 to 0xFFFFF, so it only makes sense to check with (start < 0x100000). Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753174AbbCJUYJ (ORCPT ); Tue, 10 Mar 2015 16:24:09 -0400 Received: from g9t5009.houston.hp.com ([15.240.92.67]:43356 "EHLO g9t5009.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751942AbbCJUYG (ORCPT ); Tue, 10 Mar 2015 16:24:06 -0400 From: Toshi Kani To: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl Subject: [PATCH 0/3] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Date: Tue, 10 Mar 2015 14:23:14 -0600 Message-Id: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> X-Mailer: git-send-email 1.9.3 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patchset enhances MTRR checks for the kernel huge I/O mapping, which was enabled by the patchset below: https://lkml.org/lkml/2015/3/3/589 The following functional changes are made in patch 3/3. - Allow pud_set_huge() and pmd_set_huge() to create a huge page mapping to a range covered by a single MTRR entry of any memory type. - Log a pr_warn() message when a requested PMD map range spans more than a single MTRR entry. Drivers should make a mapping request aligned to a single MTRR entry when the range is covered by MTRRs. Patch 1/3 addresses other review comments to the mapping funcs for better code read-ability. Patch 2/3 fixes a bug in mtrr_type_lookup(). The patchset is based on the -mm tree. --- Toshi Kani (3): 1/3 mm, x86: Document return values of mapping funcs 2/3 mtrr, x86: Fix MTRR lookup to handle inclusive entry 3/3 mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping --- arch/x86/Kconfig | 2 +- arch/x86/include/asm/mtrr.h | 5 ++-- arch/x86/kernel/cpu/mtrr/generic.c | 52 +++++++++++++++++++++++++------------ arch/x86/mm/pat.c | 4 +-- arch/x86/mm/pgtable.c | 53 ++++++++++++++++++++++++++++---------- 5 files changed, 81 insertions(+), 35 deletions(-) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753414AbbCJUYM (ORCPT ); Tue, 10 Mar 2015 16:24:12 -0400 Received: from g1t5424.austin.hp.com ([15.216.225.54]:44575 "EHLO g1t5424.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753243AbbCJUYJ (ORCPT ); Tue, 10 Mar 2015 16:24:09 -0400 From: Toshi Kani To: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, Toshi Kani Subject: [PATCH 1/3] mm, x86: Document return values of mapping funcs Date: Tue, 10 Mar 2015 14:23:15 -0600 Message-Id: <1426018997-12936-2-git-send-email-toshi.kani@hp.com> X-Mailer: git-send-email 1.9.3 In-Reply-To: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Documented the return values of KVA mapping functions, pud_set_huge(), pmd_set_huge, pud_clear_huge() and pmd_clear_huge(). Simplified the conditions to select HAVE_ARCH_HUGE_VMAP in Kconfig since X86_PAE depends on X86_32. There is no functinal change in this patch. Signed-off-by: Toshi Kani --- arch/x86/Kconfig | 2 +- arch/x86/mm/pgtable.c | 36 ++++++++++++++++++++++++++++-------- 2 files changed, 29 insertions(+), 9 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 110f6ae..ba5e78e 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -99,7 +99,7 @@ config X86 select IRQ_FORCED_THREADING select HAVE_BPF_JIT if X86_64 select HAVE_ARCH_TRANSPARENT_HUGEPAGE - select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE) + select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE select ARCH_HAS_SG_CHAIN select CLKEVT_I8253 select ARCH_HAVE_NMI_SAFE_CMPXCHG diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 0b97d2c..a0f7eeb 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -563,14 +563,19 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, } #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP +/** + * pud_set_huge - setup kernel PUD mapping + * + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, + * it does not set up a huge page when the range is covered by non-WB type + * of MTRRs. 0xFF indicates that MTRRs are disabled. + * + * Return 1 on success, and 0 on no-operation. + */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { u8 mtrr; - /* - * Do not use a huge page when the range is covered by non-WB type - * of MTRRs. - */ mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) return 0; @@ -584,14 +589,19 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pmd_set_huge - setup kernel PMD mapping + * + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, + * it does not set up a huge page when the range is covered by non-WB type + * of MTRRs. 0xFF indicates that MTRRs are disabled. + * + * Return 1 on success, and 0 on no-operation. + */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { u8 mtrr; - /* - * Do not use a huge page when the range is covered by non-WB type - * of MTRRs. - */ mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) return 0; @@ -605,6 +615,11 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pud_clear_huge - clear kernel PUD mapping when it is set + * + * Return 1 on success, and 0 on no-operation. + */ int pud_clear_huge(pud_t *pud) { if (pud_large(*pud)) { @@ -615,6 +630,11 @@ int pud_clear_huge(pud_t *pud) return 0; } +/** + * pmd_clear_huge - clear kernel PMD mapping when it is set + * + * Return 1 on success, and 0 on no-operation. + */ int pmd_clear_huge(pmd_t *pmd) { if (pmd_large(*pmd)) { From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753514AbbCJUYn (ORCPT ); Tue, 10 Mar 2015 16:24:43 -0400 Received: from g1t5424.austin.hp.com ([15.216.225.54]:44591 "EHLO g1t5424.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751942AbbCJUYK (ORCPT ); Tue, 10 Mar 2015 16:24:10 -0400 From: Toshi Kani To: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, Toshi Kani Subject: [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry Date: Tue, 10 Mar 2015 14:23:16 -0600 Message-Id: <1426018997-12936-3-git-send-email-toshi.kani@hp.com> X-Mailer: git-send-email 1.9.3 In-Reply-To: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org When an MTRR entry is inclusive to a requested range, i.e. the start and end of the request are not within the MTRR entry range but the range contains the MTRR entry entirely, __mtrr_type_lookup() ignores such case because both start_state and end_state are set to zero. This patch fixes the issue by adding a new flag, inclusive, to detect the case. This case is then handled in the same way as (!start_state && end_state). Also updated the comment in __mtrr_type_lookup() to clarify that the repeat handling is necessary to handle overlaps with the default type, since overlaps with multiple entries alone can be handled without such repeat. Signed-off-by: Toshi Kani --- arch/x86/kernel/cpu/mtrr/generic.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index 7d74f7b..cdb955f 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -154,7 +154,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) prev_match = 0xFF; for (i = 0; i < num_var_ranges; ++i) { - unsigned short start_state, end_state; + unsigned short start_state, end_state, inclusive; if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11))) continue; @@ -166,20 +166,22 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) start_state = ((start & mask) == (base & mask)); end_state = ((end & mask) == (base & mask)); + inclusive = ((start < base) && (end > base)); - if (start_state != end_state) { + if ((start_state != end_state) || inclusive) { /* * We have start:end spanning across an MTRR. - * We split the region into - * either - * (start:mtrr_end) (mtrr_end:end) - * or - * (start:mtrr_start) (mtrr_start:end) + * We split the region into either + * - start_state:1 + * (start:mtrr_end) (mtrr_end:end) + * - end_state:1 or inclusive:1 + * (start:mtrr_start) (mtrr_start:end) * depending on kind of overlap. * Return the type for first region and a pointer to * the start of second region so that caller will * lookup again on the second region. - * Note: This way we handle multiple overlaps as well. + * Note: This way we handle overlaps with multiple + * entries and the default type properly. */ if (start_state) *partial_end = base + get_mtrr_size(mask); @@ -195,7 +197,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) *repeat = 1; } - if ((start & mask) != (base & mask)) + if (!start_state) continue; curr_match = mtrr_state.var_ranges[i].base_lo & 0xff; From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753484AbbCJUYQ (ORCPT ); Tue, 10 Mar 2015 16:24:16 -0400 Received: from g1t5424.austin.hp.com ([15.216.225.54]:44623 "EHLO g1t5424.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753417AbbCJUYM (ORCPT ); Tue, 10 Mar 2015 16:24:12 -0400 From: Toshi Kani To: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, Toshi Kani Subject: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Tue, 10 Mar 2015 14:23:17 -0600 Message-Id: <1426018997-12936-4-git-send-email-toshi.kani@hp.com> X-Mailer: git-send-email 1.9.3 In-Reply-To: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patch adds an additional argument, *uniform, to mtrr_type_lookup(), which returns 1 when a given range is either fully covered by a single MTRR entry or not covered at all. pud_set_huge() and pmd_set_huge() are changed to check the new uniform flag to see if it is safe to create a huge page mapping to the range. This allows them to create a huge page mapping to a range covered by a single MTRR entry of any memory type. It also detects an unoptimal request properly. They continue to check with the WB type since the WB type has no effect even if a request spans to multiple MTRR entries. pmd_set_huge() logs a warning message to an unoptimal request so that driver writers will be aware of such case. Drivers should make a mapping request aligned to a single MTRR entry when the range is covered by MTRRs. Signed-off-by: Toshi Kani --- arch/x86/include/asm/mtrr.h | 5 +++-- arch/x86/kernel/cpu/mtrr/generic.c | 32 +++++++++++++++++++++++++------- arch/x86/mm/pat.c | 4 ++-- arch/x86/mm/pgtable.c | 25 +++++++++++++++---------- 4 files changed, 45 insertions(+), 21 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index f768f62..5b4d467 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -31,7 +31,7 @@ * arch_phys_wc_add and arch_phys_wc_del. */ # ifdef CONFIG_MTRR -extern u8 mtrr_type_lookup(u64 addr, u64 end); +extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform); extern void mtrr_save_fixed_ranges(void *); extern void mtrr_save_state(void); extern int mtrr_add(unsigned long base, unsigned long size, @@ -50,11 +50,12 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn); extern int amd_special_default_mtrr(void); extern int phys_wc_to_mtrr_index(int handle); # else -static inline u8 mtrr_type_lookup(u64 addr, u64 end) +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) { /* * Return no-MTRRs: */ + *uniform = 1; return 0xff; } #define mtrr_save_fixed_ranges(arg) do {} while (0) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index cdb955f..aef238c 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr) * *repeat == 1 implies [start:end] spanned across MTRR range and type returned * corresponds only to [start:*partial_end]. * Caller has to lookup again for [*partial_end:end]. + * *uniform == 1 The requested range is either fully covered by a single MTRR + * entry or not covered at all. */ -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) +static u8 __mtrr_type_lookup(u64 start, u64 end, + u64 *partial_end, int *repeat, u8 *uniform) { int i; u64 base, mask; u8 prev_match, curr_match; *repeat = 0; + *uniform = 1; + if (!mtrr_state_set) return 0xFF; @@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) /* Look in fixed ranges. Just return the type as per start */ if (mtrr_state.have_fixed && (start < 0x100000)) { int idx; + *uniform = 0; if (start < 0x80000) { idx = 0; @@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) end = *partial_end - 1; /* end is inclusive */ *repeat = 1; + *uniform = 0; } if (!start_state) @@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) continue; } + *uniform = 0; if (check_type_overlap(&prev_match, &curr_match)) return curr_match; } @@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) } /* - * Returns the effective MTRR type for the region + * Returns the effective MTRR type for the region. *uniform is set to 1 + * when a given range is either fully covered by a single MTRR entry or + * not covered at all. + * * Error return: * 0xFF - when MTRR is not enabled */ -u8 mtrr_type_lookup(u64 start, u64 end) +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) { - u8 type, prev_type; + u8 type, prev_type, is_uniform, dummy; int repeat; u64 partial_end; - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); + type = __mtrr_type_lookup(start, end, + &partial_end, &repeat, &is_uniform); /* * Common path is with repeat = 0. @@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) while (repeat) { prev_type = type; start = partial_end; - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); + is_uniform = 0; - if (check_type_overlap(&prev_type, &type)) + type = __mtrr_type_lookup(start, end, + &partial_end, &repeat, &dummy); + + if (check_type_overlap(&prev_type, &type)) { + *uniform = 0; return type; + } } + *uniform = is_uniform; return type; } diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 35af677..372ad42 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end, * request is for WB. */ if (req_type == _PAGE_CACHE_MODE_WB) { - u8 mtrr_type; + u8 mtrr_type, uniform; - mtrr_type = mtrr_type_lookup(start, end); + mtrr_type = mtrr_type_lookup(start, end, &uniform); if (mtrr_type != MTRR_TYPE_WRBACK) return _PAGE_CACHE_MODE_UC_MINUS; diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index a0f7eeb..25843a9 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -567,17 +567,18 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, * pud_set_huge - setup kernel PUD mapping * * MTRRs can override PAT memory types with a 4KB granularity. Therefore, - * it does not set up a huge page when the range is covered by non-WB type - * of MTRRs. 0xFF indicates that MTRRs are disabled. + * it only sets up a huge page when the range is mapped uniformly (i.e. + * either fully covered by a single MTRR entry or not covered at all) or + * the MTRR type is WB. * * Return 1 on success, and 0 on no-operation. */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform); + if ((!uniform) && (mtrr != MTRR_TYPE_WRBACK)) return 0; prot = pgprot_4k_2_large(prot); @@ -593,18 +594,22 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) * pmd_set_huge - setup kernel PMD mapping * * MTRRs can override PAT memory types with a 4KB granularity. Therefore, - * it does not set up a huge page when the range is covered by non-WB type - * of MTRRs. 0xFF indicates that MTRRs are disabled. + * it only sets up a huge page when the range is mapped uniformly (i.e. + * either fully covered by a single MTRR entry or not covered at all) or + * the MTRR type is WB. * * Return 1 on success, and 0 on no-operation. */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform); + if ((!uniform) && (mtrr != MTRR_TYPE_WRBACK)) { + pr_warn("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n", + addr, addr + PMD_SIZE); return 0; + } prot = pgprot_4k_2_large(prot); From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751136AbbCKGab (ORCPT ); Wed, 11 Mar 2015 02:30:31 -0400 Received: from mail-we0-f181.google.com ([74.125.82.181]:44449 "EHLO mail-we0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750798AbbCKGaa (ORCPT ); Wed, 11 Mar 2015 02:30:30 -0400 Date: Wed, 11 Mar 2015 07:30:25 +0100 From: Ingo Molnar To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl Subject: Re: [PATCH 1/3] mm, x86: Document return values of mapping funcs Message-ID: <20150311063024.GB29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-2-git-send-email-toshi.kani@hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1426018997-12936-2-git-send-email-toshi.kani@hp.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Toshi Kani wrote: > Documented the return values of KVA mapping functions, > pud_set_huge(), pmd_set_huge, pud_clear_huge() and > pmd_clear_huge(). > > Simplified the conditions to select HAVE_ARCH_HUGE_VMAP > in Kconfig since X86_PAE depends on X86_32. Changelogs are not a diary, they are a story, generally written in the present tense. So it should be something like: Document the return values of KVA mapping functions, pud_set_huge(), pmd_set_huge, pud_clear_huge() and pmd_clear_huge(). Simplify the conditions to select HAVE_ARCH_HUGE_VMAP in the Kconfig, since X86_PAE depends on X86_32. (also note the slight fixes I made to the text.) > There is no functinal change in this patch. Typo. > +/** > + * pud_set_huge - setup kernel PUD mapping > + * > + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, s/with a/with > + * it does not set up a huge page when the range is covered by non-WB type > + * of MTRRs. 0xFF indicates that MTRRs are disabled. > + * > + * Return 1 on success, and 0 on no-operation. What is a 'no-operation'? I suspect you want: * Returns 1 on success, and 0 when no PUD was set. > +/** > + * pmd_set_huge - setup kernel PMD mapping > + * > + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, > + * it does not set up a huge page when the range is covered by non-WB type > + * of MTRRs. 0xFF indicates that MTRRs are disabled. > + * > + * Return 1 on success, and 0 on no-operation. Ditto (and the rest of the patch). Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751356AbbCKGcN (ORCPT ); Wed, 11 Mar 2015 02:32:13 -0400 Received: from mail-wg0-f43.google.com ([74.125.82.43]:44669 "EHLO mail-wg0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750755AbbCKGcL (ORCPT ); Wed, 11 Mar 2015 02:32:11 -0400 Date: Wed, 11 Mar 2015 07:32:05 +0100 From: Ingo Molnar To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl Subject: Re: [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry Message-ID: <20150311063205.GC29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-3-git-send-email-toshi.kani@hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1426018997-12936-3-git-send-email-toshi.kani@hp.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Toshi Kani wrote: > When an MTRR entry is inclusive to a requested range, i.e. > the start and end of the request are not within the MTRR > entry range but the range contains the MTRR entry entirely, > __mtrr_type_lookup() ignores such case because both > start_state and end_state are set to zero. 'ignores such a case' or 'ignores such cases'. > This patch fixes the issue by adding a new flag, inclusive, s/inclusive/'inclusive' Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751452AbbCKHCX (ORCPT ); Wed, 11 Mar 2015 03:02:23 -0400 Received: from mail-wi0-f169.google.com ([209.85.212.169]:41349 "EHLO mail-wi0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750779AbbCKHCW (ORCPT ); Wed, 11 Mar 2015 03:02:22 -0400 Date: Wed, 11 Mar 2015 08:02:16 +0100 From: Ingo Molnar To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl Subject: Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Message-ID: <20150311070216.GD29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-4-git-send-email-toshi.kani@hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1426018997-12936-4-git-send-email-toshi.kani@hp.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Toshi Kani wrote: > This patch adds an additional argument, *uniform, to s/*uniform/'uniform' > mtrr_type_lookup(), which returns 1 when a given range is > either fully covered by a single MTRR entry or not covered > at all. s/or not covered/or is not covered > pud_set_huge() and pmd_set_huge() are changed to check the > new uniform flag to see if it is safe to create a huge page s/uniform/'uniform' > mapping to the range. This allows them to create a huge page > mapping to a range covered by a single MTRR entry of any > memory type. It also detects an unoptimal request properly. s/unoptimal/non-optimal or nonoptimal Also, some description in the changelog about what a 'non-optimal' request is would be most userful. > They continue to check with the WB type since the WB type has > no effect even if a request spans to multiple MTRR entries. s/spans to/spans > -static inline u8 mtrr_type_lookup(u64 addr, u64 end) > +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) > { > /* > * Return no-MTRRs: > */ > + *uniform = 1; > return 0xff; > } > #define mtrr_save_fixed_ranges(arg) do {} while (0) > diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c > index cdb955f..aef238c 100644 > --- a/arch/x86/kernel/cpu/mtrr/generic.c > +++ b/arch/x86/kernel/cpu/mtrr/generic.c > @@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr) > * *repeat == 1 implies [start:end] spanned across MTRR range and type returned > * corresponds only to [start:*partial_end]. > * Caller has to lookup again for [*partial_end:end]. > + * *uniform == 1 The requested range is either fully covered by a single MTRR > + * entry or not covered at all. > */ So I think a better approach would be to count the number of separate MTRR caching types a range is covered by, instead of this hard to quality 'uniform' flag. I.e. a 'nr_mtrr_types' count. If for example a range partially intersects with an MTRR, then that count would be 2: the MTRR, and the outside (default cache policy) type. ( Note that with this approach is not only easy to understand and easy to review, but could also be refined in the future, to count the number of _incompatible_ caching types present within a range. ) > -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > +static u8 __mtrr_type_lookup(u64 start, u64 end, > + u64 *partial_end, int *repeat, u8 *uniform) > { > int i; > u64 base, mask; > u8 prev_match, curr_match; > > *repeat = 0; > + *uniform = 1; > + > if (!mtrr_state_set) > return 0xFF; > > @@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > /* Look in fixed ranges. Just return the type as per start */ > if (mtrr_state.have_fixed && (start < 0x100000)) { > int idx; > + *uniform = 0; So this function scares me, because the code is clearly crap: if (mtrr_state.have_fixed && (start < 0x100000)) { ... } else if (start < 0x1000000) { ... How can that 'else if' branch ever not be true? Did it perhaps want to be the other way around: if (mtrr_state.have_fixed && (start < 0x1000000)) { ... } else if (start < 0x100000) { ... or did it simply mess up the condition? > > if (start < 0x80000) { > idx = 0; > @@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > end = *partial_end - 1; /* end is inclusive */ > *repeat = 1; > + *uniform = 0; > } > > if (!start_state) > @@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > continue; > } > > + *uniform = 0; > if (check_type_overlap(&prev_match, &curr_match)) > return curr_match; > } > @@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > } > > /* > - * Returns the effective MTRR type for the region > + * Returns the effective MTRR type for the region. *uniform is set to 1 > + * when a given range is either fully covered by a single MTRR entry or > + * not covered at all. > + * > * Error return: > * 0xFF - when MTRR is not enabled > */ > -u8 mtrr_type_lookup(u64 start, u64 end) > +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) > { > - u8 type, prev_type; > + u8 type, prev_type, is_uniform, dummy; > int repeat; > u64 partial_end; > > - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); > + type = __mtrr_type_lookup(start, end, > + &partial_end, &repeat, &is_uniform); > > /* > * Common path is with repeat = 0. > @@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) > while (repeat) { > prev_type = type; > start = partial_end; > - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); > + is_uniform = 0; > > - if (check_type_overlap(&prev_type, &type)) > + type = __mtrr_type_lookup(start, end, > + &partial_end, &repeat, &dummy); > + > + if (check_type_overlap(&prev_type, &type)) { > + *uniform = 0; > return type; > + } > } > > + *uniform = is_uniform; > return type; So the MTRR code is from hell, it would be nice to first clean up the whole code and the MTRR data structures before extending it with more complexity ... Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753521AbbCKP0T (ORCPT ); Wed, 11 Mar 2015 11:26:19 -0400 Received: from g4t3426.houston.hp.com ([15.201.208.54]:52362 "EHLO g4t3426.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751395AbbCKP0P (ORCPT ); Wed, 11 Mar 2015 11:26:15 -0400 Message-ID: <1426087526.17007.315.camel@misato.fc.hp.com> Subject: Re: [PATCH 1/3] mm, x86: Document return values of mapping funcs From: Toshi Kani To: Ingo Molnar Cc: "akpm@linux-foundation.org" , "hpa@zytor.com" , "tglx@linutronix.de" , "mingo@redhat.com" , "arnd@arndb.de" , "linux-mm@kvack.org" , "x86@kernel.org" , "linux-kernel@vger.kernel.org" , "dave.hansen@intel.com" , "Elliott, Robert (Server Storage)" , "pebolle@tiscali.nl" Date: Wed, 11 Mar 2015 09:25:26 -0600 In-Reply-To: <20150311063024.GB29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-2-git-send-email-toshi.kani@hp.com> <20150311063024.GB29788@gmail.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4 (3.10.4-4.fc20) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2015-03-11 at 06:30 +0000, Ingo Molnar wrote: > * Toshi Kani wrote: > > > Documented the return values of KVA mapping functions, > > pud_set_huge(), pmd_set_huge, pud_clear_huge() and > > pmd_clear_huge(). > > > > Simplified the conditions to select HAVE_ARCH_HUGE_VMAP > > in Kconfig since X86_PAE depends on X86_32. > > Changelogs are not a diary, they are a story, generally written in the > present tense. Oh, I see. Thanks for the tip! > So it should be something like: > > Document the return values of KVA mapping functions, > pud_set_huge(), pmd_set_huge, pud_clear_huge() and > pmd_clear_huge(). > > Simplify the conditions to select HAVE_ARCH_HUGE_VMAP > in the Kconfig, since X86_PAE depends on X86_32. > > (also note the slight fixes I made to the text.) Updated with the descriptions above. > > There is no functinal change in this patch. > > Typo. Fixed. > > +/** > > + * pud_set_huge - setup kernel PUD mapping > > + * > > + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, > > s/with a/with Fixed. > > + * it does not set up a huge page when the range is covered by non-WB type > > + * of MTRRs. 0xFF indicates that MTRRs are disabled. > > + * > > + * Return 1 on success, and 0 on no-operation. > > What is a 'no-operation'? > > I suspect you want: > > * Returns 1 on success, and 0 when no PUD was set. Yes, that's what it meant to say. > > +/** > > + * pmd_set_huge - setup kernel PMD mapping > > + * > > + * MTRRs can override PAT memory types with a 4KB granularity. Therefore, > > + * it does not set up a huge page when the range is covered by non-WB type > > + * of MTRRs. 0xFF indicates that MTRRs are disabled. > > + * > > + * Return 1 on success, and 0 on no-operation. > > Ditto (and the rest of the patch). Updated all functions. I changed pud_clear_huge()'s description to: * Return 1 on success, and 0 when no PUD map was found. Thanks! -Toshi From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753565AbbCKP16 (ORCPT ); Wed, 11 Mar 2015 11:27:58 -0400 Received: from g4t3426.houston.hp.com ([15.201.208.54]:53539 "EHLO g4t3426.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753315AbbCKP1w (ORCPT ); Wed, 11 Mar 2015 11:27:52 -0400 Message-ID: <1426087626.17007.317.camel@misato.fc.hp.com> Subject: Re: [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry From: Toshi Kani To: Ingo Molnar Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl Date: Wed, 11 Mar 2015 09:27:06 -0600 In-Reply-To: <20150311063205.GC29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-3-git-send-email-toshi.kani@hp.com> <20150311063205.GC29788@gmail.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4 (3.10.4-4.fc20) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2015-03-11 at 07:32 +0100, Ingo Molnar wrote: > * Toshi Kani wrote: > > > When an MTRR entry is inclusive to a requested range, i.e. > > the start and end of the request are not within the MTRR > > entry range but the range contains the MTRR entry entirely, > > __mtrr_type_lookup() ignores such case because both > > start_state and end_state are set to zero. > > 'ignores such a case' or 'ignores such cases'. Changed to 'ignores such a case'. > > This patch fixes the issue by adding a new flag, inclusive, > > s/inclusive/'inclusive' Updated. Thanks! -Toshi From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752667AbbCKQw6 (ORCPT ); Wed, 11 Mar 2015 12:52:58 -0400 Received: from g4t3427.houston.hp.com ([15.201.208.55]:51976 "EHLO g4t3427.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750839AbbCKQwz (ORCPT ); Wed, 11 Mar 2015 12:52:55 -0400 Message-ID: <1426092728.17007.380.camel@misato.fc.hp.com> Subject: Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani To: Ingo Molnar Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl Date: Wed, 11 Mar 2015 10:52:08 -0600 In-Reply-To: <20150311070216.GD29788@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-4-git-send-email-toshi.kani@hp.com> <20150311070216.GD29788@gmail.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4 (3.10.4-4.fc20) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2015-03-11 at 08:02 +0100, Ingo Molnar wrote: > * Toshi Kani wrote: > > > This patch adds an additional argument, *uniform, to > > s/*uniform/'uniform' Done. > > mtrr_type_lookup(), which returns 1 when a given range is > > either fully covered by a single MTRR entry or not covered > > at all. > > s/or not covered/or is not covered Done. > > pud_set_huge() and pmd_set_huge() are changed to check the > > new uniform flag to see if it is safe to create a huge page > > s/uniform/'uniform' Done. > > mapping to the range. This allows them to create a huge page > > mapping to a range covered by a single MTRR entry of any > > memory type. It also detects an unoptimal request properly. > > s/unoptimal/non-optimal Done. > or nonoptimal > > Also, some description in the changelog about what a 'non-optimal' > request is would be most userful. > > > They continue to check with the WB type since the WB type has > > no effect even if a request spans to multiple MTRR entries. > > s/spans to/spans Done. > > -static inline u8 mtrr_type_lookup(u64 addr, u64 end) > > +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) > > { > > /* > > * Return no-MTRRs: > > */ > > + *uniform = 1; > > return 0xff; > > } > > #define mtrr_save_fixed_ranges(arg) do {} while (0) > > diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c > > index cdb955f..aef238c 100644 > > --- a/arch/x86/kernel/cpu/mtrr/generic.c > > +++ b/arch/x86/kernel/cpu/mtrr/generic.c > > @@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr) > > * *repeat == 1 implies [start:end] spanned across MTRR range and type returned > > * corresponds only to [start:*partial_end]. > > * Caller has to lookup again for [*partial_end:end]. > > + * *uniform == 1 The requested range is either fully covered by a single MTRR > > + * entry or not covered at all. > > */ > > So I think a better approach would be to count the number of separate > MTRR caching types a range is covered by, instead of this hard to > quality 'uniform' flag. > > I.e. a 'nr_mtrr_types' count. > > If for example a range partially intersects with an MTRR, then that > count would be 2: the MTRR, and the outside (default cache policy) > type. > > ( Note that with this approach is not only easy to understand and easy > to review, but could also be refined in the future, to count the > number of _incompatible_ caching types present within a range. ) I agree that using a count is more flexible. However, there are some issues below. - MTRRs have both fixed and variable ranges. The first 1MB is covered with 11 fixed-range registers with different sizes of granularity, 512KB, 128KB, and 32KB. __mtrr_type_lookup() checks the memory type of the range at 'start', but does not check if a requested range spans multiple memory types. This first 1MB can be handled as 'uniform = 0' since processors do not create a huge page map in this 1MB range. However, setting a correct value to 'nr_mtrr_types' requires a major overhaul in this code. - mtrr_type_lookup() returns without walking through all MTRR entries when check_type_overlap() returns 1, i.e. the overlap made the resulted memory type UC. In this case, the code cannot set a correct value to 'nr_mtrr_type'. Since MTRRs are legacy, esp. the fixed range, there is not much benefit from enhancing the functionality of mtrr_type_lookup() unless there is an issue with the current platforms. For this patch, we only need to know whether the mapping count is 1 or >1. So, I think using 'uniform' makes sense for simplicity. > > -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > +static u8 __mtrr_type_lookup(u64 start, u64 end, > > + u64 *partial_end, int *repeat, u8 *uniform) > > { > > int i; > > u64 base, mask; > > u8 prev_match, curr_match; > > > > *repeat = 0; > > + *uniform = 1; > > + > > if (!mtrr_state_set) > > return 0xFF; > > > > @@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > /* Look in fixed ranges. Just return the type as per start */ > > if (mtrr_state.have_fixed && (start < 0x100000)) { > > int idx; > > + *uniform = 0; > > So this function scares me, because the code is clearly crap: > > if (mtrr_state.have_fixed && (start < 0x100000)) { > ... > } else if (start < 0x1000000) { > ... > > How can that 'else if' branch ever not be true? This 'else if' is always true. So, it can be simply 'else' without any condition. > Did it perhaps want to be the other way around: > > if (mtrr_state.have_fixed && (start < 0x1000000)) { > ... > } else if (start < 0x100000) { > ... > > or did it simply mess up the condition? I think it was just paranoid to test the same condition twice... > > > > if (start < 0x80000) { > > idx = 0; > > @@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > > > end = *partial_end - 1; /* end is inclusive */ > > *repeat = 1; > > + *uniform = 0; > > } > > > > if (!start_state) > > @@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > continue; > > } > > > > + *uniform = 0; > > if (check_type_overlap(&prev_match, &curr_match)) > > return curr_match; > > } > > > > @@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) > > } > > > > /* > > - * Returns the effective MTRR type for the region > > + * Returns the effective MTRR type for the region. *uniform is set to 1 > > + * when a given range is either fully covered by a single MTRR entry or > > + * not covered at all. > > + * > > * Error return: > > * 0xFF - when MTRR is not enabled > > */ > > -u8 mtrr_type_lookup(u64 start, u64 end) > > +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) > > { > > - u8 type, prev_type; > > + u8 type, prev_type, is_uniform, dummy; > > int repeat; > > u64 partial_end; > > > > - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); > > + type = __mtrr_type_lookup(start, end, > > + &partial_end, &repeat, &is_uniform); > > > > /* > > * Common path is with repeat = 0. > > @@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) > > while (repeat) { > > prev_type = type; > > start = partial_end; > > - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); > > + is_uniform = 0; > > > > - if (check_type_overlap(&prev_type, &type)) > > + type = __mtrr_type_lookup(start, end, > > + &partial_end, &repeat, &dummy); > > + > > + if (check_type_overlap(&prev_type, &type)) { > > + *uniform = 0; > > return type; > > + } > > } > > > > + *uniform = is_uniform; > > return type; > > So the MTRR code is from hell, it would be nice to first clean up the > whole code and the MTRR data structures before extending it with more > complexity ... Good idea. I will clean up the code (no functional change) before making this change. Thanks, -Toshi From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753068AbbCLLDk (ORCPT ); Thu, 12 Mar 2015 07:03:40 -0400 Received: from mail-we0-f172.google.com ([74.125.82.172]:37109 "EHLO mail-we0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750973AbbCLLDi (ORCPT ); Thu, 12 Mar 2015 07:03:38 -0400 Date: Thu, 12 Mar 2015 12:03:33 +0100 From: Ingo Molnar To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, arnd@arndb.de, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl Subject: Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Message-ID: <20150312110333.GA6898@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-4-git-send-email-toshi.kani@hp.com> <20150311070216.GD29788@gmail.com> <1426092728.17007.380.camel@misato.fc.hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1426092728.17007.380.camel@misato.fc.hp.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Toshi Kani wrote: > > Did it perhaps want to be the other way around: > > > > if (mtrr_state.have_fixed && (start < 0x1000000)) { > > ... > > } else if (start < 0x100000) { > > ... > > > > or did it simply mess up the condition? > > I think it was just paranoid to test the same condition twice... Read the code again, it's _not_ the same condition ... Thanks, Ingo From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754809AbbCLN7L (ORCPT ); Thu, 12 Mar 2015 09:59:11 -0400 Received: from g4t3427.houston.hp.com ([15.201.208.55]:48579 "EHLO g4t3427.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754063AbbCLN7H (ORCPT ); Thu, 12 Mar 2015 09:59:07 -0400 Message-ID: <1426168698.17007.385.camel@misato.fc.hp.com> Subject: Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani To: Ingo Molnar Cc: "akpm@linux-foundation.org" , "hpa@zytor.com" , "tglx@linutronix.de" , "mingo@redhat.com" , "arnd@arndb.de" , "linux-mm@kvack.org" , "x86@kernel.org" , "linux-kernel@vger.kernel.org" , "dave.hansen@intel.com" , "Elliott, Robert (Server Storage)" , "pebolle@tiscali.nl" Date: Thu, 12 Mar 2015 07:58:18 -0600 In-Reply-To: <20150312110333.GA6898@gmail.com> References: <1426018997-12936-1-git-send-email-toshi.kani@hp.com> <1426018997-12936-4-git-send-email-toshi.kani@hp.com> <20150311070216.GD29788@gmail.com> <1426092728.17007.380.camel@misato.fc.hp.com> <20150312110333.GA6898@gmail.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4 (3.10.4-4.fc20) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 2015-03-12 at 11:03 +0000, Ingo Molnar wrote: > * Toshi Kani wrote: > > > > Did it perhaps want to be the other way around: > > > > > > if (mtrr_state.have_fixed && (start < 0x1000000)) { > > > ... > > > } else if (start < 0x100000) { > > > ... > > > > > > or did it simply mess up the condition? > > > > I think it was just paranoid to test the same condition twice... > > Read the code again, it's _not_ the same condition ... Oh, I see... It must be a typo. The fixed range is 0x0 to 0xFFFFF, so it only makes sense to check with (start < 0x100000). Thanks, -Toshi