* [patch 00/10] x86/mm/cpa: Improve large page preservation handling
@ 2018-09-07 15:01 Thomas Gleixner
2018-09-07 15:01 ` [patch 01/10] x86/mm/cpa: Split, rename and clean up try_preserve_large_page() Thomas Gleixner
` (9 more replies)
0 siblings, 10 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
Bin reported that try_preserve_large_page() in the page attribute code
consumes a large amount of CPU time. His initial attempts at addressing
this made me look deeper into the code.
The logic in this code is not really intelligent. It requires checking a
large page in 4k steps for conflicts. That's insane as most operations do
not conflict at all.
The code also lacks sanity checks which would allow detecting whether the
existing mapping is incorrect vs. the static protections.
Any form of debugging or statistics is missing as well.
The following series addresses this:
- Clean up the code so it becomes extensible
- Provide the ability to check a full range for conflicts
- Add debug output and statistics to quantify the changes and to allow
observation of the mechanism in the future.
- Add a sanity check for existing mappings with a fixup for the 2M case
and a warning for the 1G case. The 2M case is trivial to address, the
1G case requires larger changes and is just warned about for now.
- Avoid conflict checks for operations which clear the PRESENT bit
- Utilize the range checks to detect conflicts in one operation
- Drop the 4k wise checking which turned out to provide no extra large
page preservation in testing. There might be corner cases where a page
would be preserved, but that's overkill for the common cases.
Before:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 540
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 800770
4K pages set-checked: 7668
After:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 538
2M pages sameprot: 466
2M pages preserved: 47
4K pages set-checked: 7668
This gets rid of ~800000 checks whether a particular address is within a
static protection region. Each check tests against 4 different regions,
which adds up to several million instructions.
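To illustrate the difference, here is a small stand-alone sketch (plain user
space C, not the kernel code; the protected region and the addresses are made
up) of the inclusive range overlap check which patch 03 introduces. It
replaces 512 per-4k within() tests for a 2M page with a single comparison:

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE	4096UL

/* Inclusive range overlap check, as added in patch 03 */
static bool overlaps(unsigned long r1_start, unsigned long r1_end,
		     unsigned long r2_start, unsigned long r2_end)
{
	return (r1_start <= r2_end && r1_end >= r2_start) ||
	       (r2_start <= r1_end && r2_end >= r1_start);
}

int main(void)
{
	/* Hypothetical protected area, e.g. a rodata-like range */
	unsigned long prot_start = 0x01000000UL, prot_end = 0x011fffffUL;
	/* One 2M large page under evaluation, with inclusive end */
	unsigned long lp_start = 0x01400000UL;
	unsigned long lp_end = lp_start + 512 * PAGE_SIZE - 1;

	/* Old scheme: 512 per-4k checks. New scheme: one range check. */
	printf("conflict: %s\n",
	       overlaps(lp_start, lp_end, prot_start, prot_end) ? "yes" : "no");
	return 0;
}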
Thanks,
tglx
---
Kconfig | 8
mm/pageattr.c | 515 +++++++++++++++++++++++++++++++++++++++++++---------------
2 files changed, 391 insertions(+), 132 deletions(-)
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 01/10] x86/mm/cpa: Split, rename and clean up try_preserve_large_page()
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-10 2:03 ` Yang, Bin
2018-09-07 15:01 ` [patch 02/10] x86/mm/cpa: Rework static_protections() Thomas Gleixner
` (8 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Split-try_preserve_large_page--.patch --]
[-- Type: text/plain, Size: 7656 bytes --]
Avoid the extra variable and gotos by splitting the function into the
actual algorithm and a callable function which contains the lock
protection.
Rename it to should_split_large_page() while at it so the return values
actually make sense.
Clean up the code flow, comments and general whitespace damage as well. No
functional change.
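As a condensed illustration of the resulting structure (a user space stand-in
with a pthread mutex in place of pgd_lock; the parameters are made up and only
model the decision), the algorithm lives in a double underscore helper with
plain returns and the callable wrapper owns the lock:

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* The algorithm: 1 = split, 0 = preserve, negative = error */
static int __should_split(int level_ok, int prot_conflict)
{
	if (!level_ok)
		return -1;	/* was: do_split = -EINVAL; goto out_unlock */
	if (prot_conflict)
		return 1;	/* was: goto out_unlock with do_split = 1 */
	return 0;		/* large page can be preserved */
}

/* The wrapper: fast path plus lock handling only */
static int should_split(int force_split, int level_ok, int prot_conflict)
{
	int do_split;

	if (force_split)
		return 1;

	pthread_mutex_lock(&lock);
	do_split = __should_split(level_ok, prot_conflict);
	pthread_mutex_unlock(&lock);

	return do_split;
}

int main(void)
{
	/* Example: valid level, no conflict -> preserve (0) */
	return should_split(0, 1, 0);
}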
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 121 ++++++++++++++++++++++++-------------------------
1 file changed, 60 insertions(+), 61 deletions(-)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -421,18 +421,18 @@ pte_t *lookup_address_in_pgd(pgd_t *pgd,
*/
pte_t *lookup_address(unsigned long address, unsigned int *level)
{
- return lookup_address_in_pgd(pgd_offset_k(address), address, level);
+ return lookup_address_in_pgd(pgd_offset_k(address), address, level);
}
EXPORT_SYMBOL_GPL(lookup_address);
static pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
unsigned int *level)
{
- if (cpa->pgd)
+ if (cpa->pgd)
return lookup_address_in_pgd(cpa->pgd + pgd_index(address),
address, level);
- return lookup_address(address, level);
+ return lookup_address(address, level);
}
/*
@@ -549,27 +549,22 @@ static pgprot_t pgprot_clear_protnone_bi
return prot;
}
-static int
-try_preserve_large_page(pte_t *kpte, unsigned long address,
- struct cpa_data *cpa)
+static int __should_split_large_page(pte_t *kpte, unsigned long address,
+ struct cpa_data *cpa)
{
- unsigned long nextpage_addr, numpages, pmask, psize, addr, pfn, old_pfn;
- pte_t new_pte, old_pte, *tmp;
+ unsigned long numpages, pmask, psize, lpaddr, addr, pfn, old_pfn;
pgprot_t old_prot, new_prot, req_prot;
- int i, do_split = 1;
+ pte_t new_pte, old_pte, *tmp;
enum pg_level level;
+ int i;
- if (cpa->force_split)
- return 1;
-
- spin_lock(&pgd_lock);
/*
* Check for races, another CPU might have split this page
* up already:
*/
tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte)
- goto out_unlock;
+ return 1;
switch (level) {
case PG_LEVEL_2M:
@@ -581,8 +576,7 @@ try_preserve_large_page(pte_t *kpte, uns
old_pfn = pud_pfn(*(pud_t *)kpte);
break;
default:
- do_split = -EINVAL;
- goto out_unlock;
+ return -EINVAL;
}
psize = page_level_size(level);
@@ -592,8 +586,8 @@ try_preserve_large_page(pte_t *kpte, uns
* Calculate the number of pages, which fit into this large
* page starting at address:
*/
- nextpage_addr = (address + psize) & pmask;
- numpages = (nextpage_addr - address) >> PAGE_SHIFT;
+ lpaddr = (address + psize) & pmask;
+ numpages = (lpaddr - address) >> PAGE_SHIFT;
if (numpages < cpa->numpages)
cpa->numpages = numpages;
@@ -620,57 +614,62 @@ try_preserve_large_page(pte_t *kpte, uns
pgprot_val(req_prot) |= _PAGE_PSE;
/*
- * old_pfn points to the large page base pfn. So we need
- * to add the offset of the virtual address:
+ * old_pfn points to the large page base pfn. So we need to add the
+ * offset of the virtual address:
*/
pfn = old_pfn + ((address & (psize - 1)) >> PAGE_SHIFT);
cpa->pfn = pfn;
- new_prot = static_protections(req_prot, address, pfn);
+ /*
+ * Calculate the large page base address and the number of 4K pages
+ * in the large page
+ */
+ lpaddr = address & pmask;
+ numpages = psize >> PAGE_SHIFT;
/*
- * We need to check the full range, whether
- * static_protection() requires a different pgprot for one of
- * the pages in the range we try to preserve:
+ * Make sure that the requested pgprot does not violate the static
+ * protections. Check the full large page whether one of the pages
+ * in it results in a different pgprot than the first one of the
+ * requested range. If yes, then the page needs to be split.
*/
- addr = address & pmask;
+ new_prot = static_protections(req_prot, address, pfn, 1);
pfn = old_pfn;
- for (i = 0; i < (psize >> PAGE_SHIFT); i++, addr += PAGE_SIZE, pfn++) {
+ for (i = 0, addr = lpaddr; i < numpages; i++, addr += PAGE_SIZE, pfn++) {
pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
if (pgprot_val(chk_prot) != pgprot_val(new_prot))
- goto out_unlock;
+ return 1;
}
- /*
- * If there are no changes, return. maxpages has been updated
- * above:
- */
- if (pgprot_val(new_prot) == pgprot_val(old_prot)) {
- do_split = 0;
- goto out_unlock;
- }
+ /* If there are no changes, return. */
+ if (pgprot_val(new_prot) == pgprot_val(old_prot))
+ return 0;
/*
- * We need to change the attributes. Check, whether we can
- * change the large page in one go. We request a split, when
- * the address is not aligned and the number of pages is
- * smaller than the number of pages in the large page. Note
- * that we limited the number of possible pages already to
- * the number of pages in the large page.
+ * Verify that the address is aligned and the number of pages
+ * covers the full page.
*/
- if (address == (address & pmask) && cpa->numpages == (psize >> PAGE_SHIFT)) {
- /*
- * The address is aligned and the number of pages
- * covers the full page.
- */
- new_pte = pfn_pte(old_pfn, new_prot);
- __set_pmd_pte(kpte, address, new_pte);
- cpa->flags |= CPA_FLUSHTLB;
- do_split = 0;
- }
+ if (address != lpaddr || cpa->numpages != numpages)
+ return 1;
+
+ /* All checks passed. Update the large page mapping. */
+ new_pte = pfn_pte(old_pfn, new_prot);
+ __set_pmd_pte(kpte, address, new_pte);
+ cpa->flags |= CPA_FLUSHTLB;
+ return 0;
+}
-out_unlock:
+static int should_split_large_page(pte_t *kpte, unsigned long address,
+ struct cpa_data *cpa)
+{
+ int do_split;
+
+ if (cpa->force_split)
+ return 1;
+
+ spin_lock(&pgd_lock);
+ do_split = __should_split_large_page(kpte, address, cpa);
spin_unlock(&pgd_lock);
return do_split;
@@ -1273,7 +1272,7 @@ static int __change_page_attr(struct cpa
* Check, whether we can keep the large page intact
* and just change the pte:
*/
- do_split = try_preserve_large_page(kpte, address, cpa);
+ do_split = should_split_large_page(kpte, address, cpa);
/*
* When the range fits into the existing large page,
* return. cp->numpages and cpa->tlbflush have been updated in
@@ -1288,23 +1287,23 @@ static int __change_page_attr(struct cpa
err = split_large_page(cpa, kpte, address);
if (!err) {
/*
- * Do a global flush tlb after splitting the large page
- * and before we do the actual change page attribute in the PTE.
- *
- * With out this, we violate the TLB application note, that says
- * "The TLBs may contain both ordinary and large-page
+ * Do a global flush tlb after splitting the large page
+ * and before we do the actual change page attribute in the PTE.
+ *
+ * With out this, we violate the TLB application note, that says
+ * "The TLBs may contain both ordinary and large-page
* translations for a 4-KByte range of linear addresses. This
* may occur if software modifies the paging structures so that
* the page size used for the address range changes. If the two
* translations differ with respect to page frame or attributes
* (e.g., permissions), processor behavior is undefined and may
* be implementation-specific."
- *
- * We do this global tlb flush inside the cpa_lock, so that we
+ *
+ * We do this global tlb flush inside the cpa_lock, so that we
* don't allow any other cpu, with stale tlb entries change the
* page attribute in parallel, that also falls into the
* just split large page entry.
- */
+ */
flush_tlb_all();
goto repeat;
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 02/10] x86/mm/cpa: Rework static_protections()
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
2018-09-07 15:01 ` [patch 01/10] x86/mm/cpa: Split, rename and clean up try_preserve_large_page() Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-07 16:21 ` Dave Hansen
2018-09-07 15:01 ` [patch 03/10] x86/mm/cpa: Allow range check for static protections Thomas Gleixner
` (7 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Rework-static_protections--.patch --]
[-- Type: text/plain, Size: 6924 bytes --]
static_protections() is pretty unreadable. Split it up into separate checks
for each protection area.
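The rework composes per-area helpers which each return the protection bits
they forbid; the results are OR-ed together and cleared from the requested
pgprot. A tiny stand-alone sketch of that bit arithmetic (the bit values are
illustrative stand-ins, not the real _PAGE_* definitions):

#include <stdio.h>

#define EX_PAGE_RW	(1UL << 1)	/* stand-in for _PAGE_RW */
#define EX_PAGE_NX	(1UL << 63)	/* stand-in for _PAGE_NX */

static unsigned long protect_text(int in_text)
{
	return in_text ? EX_PAGE_NX : 0;	/* text must stay executable */
}

static unsigned long protect_rodata(int in_rodata)
{
	return in_rodata ? EX_PAGE_RW : 0;	/* rodata must stay read-only */
}

int main(void)
{
	unsigned long req = EX_PAGE_RW | EX_PAGE_NX;	/* requested pgprot */
	unsigned long forbidden;

	forbidden  = protect_text(1);	/* address lies in kernel text */
	forbidden |= protect_rodata(0);	/* pfn is not in .rodata */

	/* static_protections() style: clear all forbidden bits */
	printf("result: %#lx\n", req & ~forbidden);
	return 0;
}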
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 159 +++++++++++++++++++++++++++++--------------------
1 file changed, 95 insertions(+), 64 deletions(-)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -286,84 +286,115 @@ static void cpa_flush_array(unsigned lon
}
}
+#ifdef CONFIG_PCI_BIOS
/*
- * Certain areas of memory on x86 require very specific protection flags,
- * for example the BIOS area or kernel text. Callers don't always get this
- * right (again, ioremap() on BIOS memory is not uncommon) so this function
- * checks and fixes these known static required protection bits.
+ * The BIOS area between 640k and 1Mb needs to be executable for PCI BIOS
+ * based config access (CONFIG_PCI_GOBIOS) support.
*/
-static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
- unsigned long pfn)
-{
- pgprot_t forbidden = __pgprot(0);
+#define BIOS_PFN PFN_DOWN(BIOS_BEGIN)
+#define BIOS_PFN_END PFN_DOWN(BIOS_END)
- /*
- * The BIOS area between 640k and 1Mb needs to be executable for
- * PCI BIOS based config access (CONFIG_PCI_GOBIOS) support.
- */
-#ifdef CONFIG_PCI_BIOS
- if (pcibios_enabled && within(pfn, BIOS_BEGIN >> PAGE_SHIFT, BIOS_END >> PAGE_SHIFT))
- pgprot_val(forbidden) |= _PAGE_NX;
+static pgprotval_t protect_pci_bios(unsigned long pfn)
+{
+ if (pcibios_enabled && within(pfn, BIOS_PFN, BIOS_PFN_END))
+ return _PAGE_NX;
+ return 0;
+}
+#else
+static pgprotval_t protect_pci_bios(unsigned long pfn)
+{
+ return 0;
+}
#endif
- /*
- * The kernel text needs to be executable for obvious reasons
- * Does not cover __inittext since that is gone later on. On
- * 64bit we do not enforce !NX on the low mapping
- */
- if (within(address, (unsigned long)_text, (unsigned long)_etext))
- pgprot_val(forbidden) |= _PAGE_NX;
+/*
+ * The .rodata section needs to be read-only. Using the pfn catches all
+ * aliases. This also includes __ro_after_init, so do not enforce until
+ * kernel_set_to_readonly is true.
+ */
+static pgprotval_t protect_rodata(unsigned long pfn)
+{
+ unsigned long start_pfn = __pa_symbol(__start_rodata) >> PAGE_SHIFT;
+ unsigned long end_pfn = __pa_symbol(__end_rodata) >> PAGE_SHIFT;
- /*
- * The .rodata section needs to be read-only. Using the pfn
- * catches all aliases. This also includes __ro_after_init,
- * so do not enforce until kernel_set_to_readonly is true.
- */
- if (kernel_set_to_readonly &&
- within(pfn, __pa_symbol(__start_rodata) >> PAGE_SHIFT,
- __pa_symbol(__end_rodata) >> PAGE_SHIFT))
- pgprot_val(forbidden) |= _PAGE_RW;
+ if (kernel_set_to_readonly && within(pfn, start_pfn, end_pfn))
+ return _PAGE_RW;
+ return 0;
+}
+
+/*
+ * The kernel text needs to be executable for obvious reasons. This does
+ * not cover __inittext since that is gone after boot. On 64bit we do not
+ * enforce !NX on the low mapping
+ */
+static pgprotval_t protect_kernel_text(unsigned long address)
+{
+ if (within(address, (unsigned long)_text, (unsigned long)_etext))
+ return _PAGE_NX;
+ return 0;
+}
#if defined(CONFIG_X86_64)
+/*
+ * Once the kernel maps the text as RO (kernel_set_to_readonly is set),
+ * kernel text mappings for the large page aligned text, rodata sections
+ * will be always read-only. For the kernel identity mappings covering the
+ * holes caused by this alignment can be anything that user asks.
+ *
+ * This will preserve the large page mappings for kernel text/data at no
+ * extra cost.
+ */
+static pgprotval_t protect_kernel_text_ro(unsigned long address)
+{
+ unsigned long end = (unsigned long)__end_rodata_hpage_align;
+ unsigned long start = (unsigned long)_text;
+ unsigned int level;
+
+ if (!kernel_set_to_readonly || !within(address, start, end))
+ return 0;
/*
- * Once the kernel maps the text as RO (kernel_set_to_readonly is set),
- * kernel text mappings for the large page aligned text, rodata sections
- * will be always read-only. For the kernel identity mappings covering
- * the holes caused by this alignment can be anything that user asks.
+ * Don't enforce the !RW mapping for the kernel text mapping, if
+ * the current mapping is already using small page mapping. No
+ * need to work hard to preserve large page mappings in this case.
*
- * This will preserve the large page mappings for kernel text/data
- * at no extra cost.
+ * This also fixes the Linux Xen paravirt guest boot failure caused
+ * by unexpected read-only mappings for kernel identity
+ * mappings. In this paravirt guest case, the kernel text mapping
+ * and the kernel identity mapping share the same page-table pages,
+ * so the protections for kernel text and identity mappings have to
+ * be the same.
*/
- if (kernel_set_to_readonly &&
- within(address, (unsigned long)_text,
- (unsigned long)__end_rodata_hpage_align)) {
- unsigned int level;
-
- /*
- * Don't enforce the !RW mapping for the kernel text mapping,
- * if the current mapping is already using small page mapping.
- * No need to work hard to preserve large page mappings in this
- * case.
- *
- * This also fixes the Linux Xen paravirt guest boot failure
- * (because of unexpected read-only mappings for kernel identity
- * mappings). In this paravirt guest case, the kernel text
- * mapping and the kernel identity mapping share the same
- * page-table pages. Thus we can't really use different
- * protections for the kernel text and identity mappings. Also,
- * these shared mappings are made of small page mappings.
- * Thus this don't enforce !RW mapping for small page kernel
- * text mapping logic will help Linux Xen parvirt guest boot
- * as well.
- */
- if (lookup_address(address, &level) && (level != PG_LEVEL_4K))
- pgprot_val(forbidden) |= _PAGE_RW;
- }
+ if (lookup_address(address, &level) && (level != PG_LEVEL_4K))
+ return _PAGE_RW;
+ return 0;
+}
+#else
+static pgprotval_t protect_kernel_text_ro(unsigned long address)
+{
+ return 0;
+}
#endif
- prot = __pgprot(pgprot_val(prot) & ~pgprot_val(forbidden));
+/*
+ * Certain areas of memory on x86 require very specific protection flags,
+ * for example the BIOS area or kernel text. Callers don't always get this
+ * right (again, ioremap() on BIOS memory is not uncommon) so this function
+ * checks and fixes these known static required protection bits.
+ */
+static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
+ unsigned long pfn)
+{
+ pgprotval_t forbidden;
+
+ /* Operate on the virtual address */
+ forbidden = protect_kernel_text(address);
+ forbidden |= protect_kernel_text_ro(address);
+
+ /* Check the PFN directly */
+ forbidden |= protect_pci_bios(pfn);
+ forbidden |= protect_rodata(pfn);
- return prot;
+ return __pgprot(pgprot_val(prot) & ~forbidden);
}
/*
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 03/10] x86/mm/cpa: Allow range check for static protections
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
2018-09-07 15:01 ` [patch 01/10] x86/mm/cpa: Split, rename and clean up try_preserve_large_page() Thomas Gleixner
2018-09-07 15:01 ` [patch 02/10] x86/mm/cpa: Rework static_protections() Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-07 15:01 ` [patch 04/10] x86/mm/cpa: Add debug mechanism Thomas Gleixner
` (6 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Allow-range-check-for-static-protections.patch --]
[-- Type: text/plain, Size: 6112 bytes --]
Checking static protections only page by page is slow, especially for huge
pages. To allow quick checks over a complete range, add the ability to do
that.
Make the checks inclusive so the ranges can be directly used for debug output
later.
No functional change.
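A short stand-alone sketch of the inclusive range convention (the values are
hypothetical): a (start, npg) pair becomes an inclusive [start, end], and
exclusive section ends such as __end_rodata need a '- 1' to yield the last
enforced address or PFN:

#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

int main(void)
{
	unsigned long start = 0xffffffff81000000UL;	/* hypothetical vaddr */
	unsigned long pfn = 0x180000UL;			/* hypothetical base PFN */
	unsigned long npg = 512;			/* one 2M page */

	/* Inclusive ends, as computed in static_protections() */
	unsigned long end = start + npg * PAGE_SIZE - 1;
	unsigned long epfn = pfn + npg - 1;

	printf("vaddr range [%#lx, %#lx]\n", start, end);
	printf("PFN range   [%#lx, %#lx]\n", pfn, epfn);
	return 0;
}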
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 67 +++++++++++++++++++++++++++++++------------------
1 file changed, 43 insertions(+), 24 deletions(-)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -286,22 +286,29 @@ static void cpa_flush_array(unsigned lon
}
}
+static bool overlaps(unsigned long r1_start, unsigned long r1_end,
+ unsigned long r2_start, unsigned long r2_end)
+{
+ return (r1_start <= r2_end && r1_end >= r2_start) ||
+ (r2_start <= r1_end && r2_end >= r1_start);
+}
+
#ifdef CONFIG_PCI_BIOS
/*
* The BIOS area between 640k and 1Mb needs to be executable for PCI BIOS
* based config access (CONFIG_PCI_GOBIOS) support.
*/
#define BIOS_PFN PFN_DOWN(BIOS_BEGIN)
-#define BIOS_PFN_END PFN_DOWN(BIOS_END)
+#define BIOS_PFN_END PFN_DOWN(BIOS_END - 1)
-static pgprotval_t protect_pci_bios(unsigned long pfn)
+static pgprotval_t protect_pci_bios(unsigned long spfn, unsigned long epfn)
{
- if (pcibios_enabled && within(pfn, BIOS_PFN, BIOS_PFN_END))
+ if (pcibios_enabled && overlaps(spfn, epfn, BIOS_PFN, BIOS_PFN_END))
return _PAGE_NX;
return 0;
}
#else
-static pgprotval_t protect_pci_bios(unsigned long pfn)
+static pgprotval_t protect_pci_bios(unsigned long spfn, unsigned long epfn)
{
return 0;
}
@@ -312,12 +319,17 @@ static pgprotval_t protect_pci_bios(unsi
* aliases. This also includes __ro_after_init, so do not enforce until
* kernel_set_to_readonly is true.
*/
-static pgprotval_t protect_rodata(unsigned long pfn)
+static pgprotval_t protect_rodata(unsigned long spfn, unsigned long epfn)
{
- unsigned long start_pfn = __pa_symbol(__start_rodata) >> PAGE_SHIFT;
- unsigned long end_pfn = __pa_symbol(__end_rodata) >> PAGE_SHIFT;
+ unsigned long epfn_ro, spfn_ro = PFN_DOWN(__pa_symbol(__start_rodata));
+
+ /*
+ * Note: __end_rodata is at page aligned and not inclusive, so
+ * subtract 1 to get the last enforced PFN in the rodata area.
+ */
+ epfn_ro = PFN_DOWN(__pa_symbol(__end_rodata)) - 1;
- if (kernel_set_to_readonly && within(pfn, start_pfn, end_pfn))
+ if (kernel_set_to_readonly && overlaps(spfn, epfn, spfn_ro, epfn_ro))
return _PAGE_RW;
return 0;
}
@@ -327,9 +339,12 @@ static pgprotval_t protect_rodata(unsign
* not cover __inittext since that is gone after boot. On 64bit we do not
* enforce !NX on the low mapping
*/
-static pgprotval_t protect_kernel_text(unsigned long address)
+static pgprotval_t protect_kernel_text(unsigned long start, unsigned long end)
{
- if (within(address, (unsigned long)_text, (unsigned long)_etext))
+ unsigned long t_end = (unsigned long)_etext - 1;
+ unsigned long t_start = (unsigned long)_text;
+
+ if (overlaps(start, end, t_start, t_end))
return _PAGE_NX;
return 0;
}
@@ -344,13 +359,14 @@ static pgprotval_t protect_kernel_text(u
* This will preserve the large page mappings for kernel text/data at no
* extra cost.
*/
-static pgprotval_t protect_kernel_text_ro(unsigned long address)
+static pgprotval_t protect_kernel_text_ro(unsigned long start,
+ unsigned long end)
{
- unsigned long end = (unsigned long)__end_rodata_hpage_align;
- unsigned long start = (unsigned long)_text;
+ unsigned long t_end = (unsigned long)__end_rodata_hpage_align - 1;
+ unsigned long t_start = (unsigned long)_text;
unsigned int level;
- if (!kernel_set_to_readonly || !within(address, start, end))
+ if (!kernel_set_to_readonly || !overlaps(start, end, t_start, t_end))
return 0;
/*
* Don't enforce the !RW mapping for the kernel text mapping, if
@@ -364,12 +380,13 @@ static pgprotval_t protect_kernel_text_r
* so the protections for kernel text and identity mappings have to
* be the same.
*/
- if (lookup_address(address, &level) && (level != PG_LEVEL_4K))
+ if (lookup_address(start, &level) && (level != PG_LEVEL_4K))
return _PAGE_RW;
return 0;
}
#else
-static pgprotval_t protect_kernel_text_ro(unsigned long address)
+static pgprotval_t protect_kernel_text_ro(unsigned long start,
+ unsigned long end)
{
return 0;
}
@@ -381,18 +398,20 @@ static pgprotval_t protect_kernel_text_r
* right (again, ioremap() on BIOS memory is not uncommon) so this function
* checks and fixes these known static required protection bits.
*/
-static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
- unsigned long pfn)
+static inline pgprot_t static_protections(pgprot_t prot, unsigned long start,
+ unsigned long pfn, unsigned long npg)
{
pgprotval_t forbidden;
+ unsigned long end;
/* Operate on the virtual address */
- forbidden = protect_kernel_text(address);
- forbidden |= protect_kernel_text_ro(address);
+ end = start + npg * PAGE_SIZE - 1;
+ forbidden = protect_kernel_text(start, end);
+ forbidden |= protect_kernel_text_ro(start, end);
/* Check the PFN directly */
- forbidden |= protect_pci_bios(pfn);
- forbidden |= protect_rodata(pfn);
+ forbidden |= protect_pci_bios(pfn, pfn + npg - 1);
+ forbidden |= protect_rodata(pfn, pfn + npg - 1);
return __pgprot(pgprot_val(prot) & ~forbidden);
}
@@ -667,7 +686,7 @@ static int __should_split_large_page(pte
new_prot = static_protections(req_prot, address, pfn, 1);
pfn = old_pfn;
for (i = 0, addr = lpaddr; i < numpages; i++, addr += PAGE_SIZE, pfn++) {
- pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
+ pgprot_t chk_prot = static_protections(req_prot, addr, pfn, 1);
if (pgprot_val(chk_prot) != pgprot_val(new_prot))
return 1;
@@ -1277,7 +1296,7 @@ static int __change_page_attr(struct cpa
pgprot_val(new_prot) &= ~pgprot_val(cpa->mask_clr);
pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);
- new_prot = static_protections(new_prot, address, pfn);
+ new_prot = static_protections(new_prot, address, pfn, 1);
new_prot = pgprot_clear_protnone_bits(new_prot);
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 04/10] x86/mm/cpa: Add debug mechanism
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
` (2 preceding siblings ...)
2018-09-07 15:01 ` [patch 03/10] x86/mm/cpa: Allow range check for static protections Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-07 15:01 ` [patch 05/10] x86/mm/cpa: Add large page preservation statistics Thomas Gleixner
` (5 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Add-proper-debug-output.patch --]
[-- Type: text/plain, Size: 4135 bytes --]
The whole static protection magic is silently fixing up anything which is
handed in. That's just wrong. The offending call sites need to be fixed.
Add a debug mechanism which emits a warning if a requested mapping needs to be
fixed up. The DETECT debug mechanism is really not meant to be enabled except
for developers, so by default the output is limited to the protection fixups.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 60 +++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 51 insertions(+), 9 deletions(-)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -42,6 +42,13 @@ struct cpa_data {
struct page **pages;
};
+enum cpa_warn {
+ CPA_PROTECT,
+ CPA_DETECT,
+};
+
+static const int cpa_warn_level = CPA_PROTECT;
+
/*
* Serialize cpa() (for !DEBUG_PAGEALLOC which uses large identity mappings)
* using cpa_lock. So that we don't allow any other cpu, with stale large tlb
@@ -392,6 +399,27 @@ static pgprotval_t protect_kernel_text_r
}
#endif
+static inline bool conflicts(pgprot_t prot, pgprotval_t val)
+{
+ return (pgprot_val(prot) & ~val) != pgprot_val(prot);
+}
+
+static inline void check_conflict(int warnlvl, pgprot_t prot, pgprotval_t val,
+ unsigned long start, unsigned long end,
+ unsigned long pfn, const char *txt)
+{
+ static const char *lvltxt[] = {
+ [CPA_PROTECT] = "protect",
+ [CPA_DETECT] = "detect",
+ };
+
+ if (warnlvl > cpa_warn_level || !conflicts(prot, val))
+ return;
+
+ pr_warn("CPA %8s %10s: 0x%016lx - 0x%016lx PFN %lx req %016lx prevent %016lx\n",
+ lvltxt[warnlvl], txt, start, end, pfn, pgprot_val(prot), val);
+}
+
/*
* Certain areas of memory on x86 require very specific protection flags,
* for example the BIOS area or kernel text. Callers don't always get this
@@ -399,19 +427,31 @@ static pgprotval_t protect_kernel_text_r
* checks and fixes these known static required protection bits.
*/
static inline pgprot_t static_protections(pgprot_t prot, unsigned long start,
- unsigned long pfn, unsigned long npg)
+ unsigned long pfn, unsigned long npg,
+ int warnlvl)
{
- pgprotval_t forbidden;
+ pgprotval_t forbidden, res;
unsigned long end;
/* Operate on the virtual address */
end = start + npg * PAGE_SIZE - 1;
- forbidden = protect_kernel_text(start, end);
- forbidden |= protect_kernel_text_ro(start, end);
+
+ res = protect_kernel_text(start, end);
+ check_conflict(warnlvl, prot, res, start, end, pfn, "Text NX");
+ forbidden = res;
+
+ res = protect_kernel_text_ro(start, end);
+ check_conflict(warnlvl, prot, res, start, end, pfn, "Text RO");
+ forbidden |= res;
/* Check the PFN directly */
- forbidden |= protect_pci_bios(pfn, pfn + npg - 1);
- forbidden |= protect_rodata(pfn, pfn + npg - 1);
+ res = protect_pci_bios(pfn, pfn + npg - 1);
+ check_conflict(warnlvl, prot, res, start, end, pfn, "PCIBIOS NX");
+ forbidden |= res;
+
+ res = protect_rodata(pfn, pfn + npg - 1);
+ check_conflict(warnlvl, prot, res, start, end, pfn, "Rodata RO");
+ forbidden |= res;
return __pgprot(pgprot_val(prot) & ~forbidden);
}
@@ -683,10 +723,11 @@ static int __should_split_large_page(pte
* in it results in a different pgprot than the first one of the
* requested range. If yes, then the page needs to be split.
*/
- new_prot = static_protections(req_prot, address, pfn, 1);
+ new_prot = static_protections(req_prot, address, pfn, 1, CPA_DETECT);
pfn = old_pfn;
for (i = 0, addr = lpaddr; i < numpages; i++, addr += PAGE_SIZE, pfn++) {
- pgprot_t chk_prot = static_protections(req_prot, addr, pfn, 1);
+ pgprot_t chk_prot = static_protections(req_prot, addr, pfn, 1,
+ CPA_DETECT);
if (pgprot_val(chk_prot) != pgprot_val(new_prot))
return 1;
@@ -1296,7 +1337,8 @@ static int __change_page_attr(struct cpa
pgprot_val(new_prot) &= ~pgprot_val(cpa->mask_clr);
pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);
- new_prot = static_protections(new_prot, address, pfn, 1);
+ new_prot = static_protections(new_prot, address, pfn, 1,
+ CPA_PROTECT);
new_prot = pgprot_clear_protnone_bits(new_prot);
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 05/10] x86/mm/cpa: Add large page preservation statistics
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
` (3 preceding siblings ...)
2018-09-07 15:01 ` [patch 04/10] x86/mm/cpa: Add debug mechanism Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-07 15:01 ` [patch 06/10] x86/mm/cpa: Avoid static protection checks on unmap Thomas Gleixner
` (4 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Add-large-page-preservation-statistics.patch --]
[-- Type: text/plain, Size: 5338 bytes --]
The large page preservation mechanism is just magic and provides no
information at all. Add optional statistics output in debugfs so the magic can
be evaluated. The default is off.
Output:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 540
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 800770
4K pages set-checked: 7668
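As a usage note: the file is created in arch_debugfs_dir, i.e. the x86
directory of debugfs, with mode S_IRUSR, so it is readable by root only.
Assuming the default debugfs mount point and CONFIG_X86_CPA_STATISTICS=y, it
can be read like any other seq_file, for example with this minimal sketch:

#include <stdio.h>

int main(void)
{
	char line[128];
	/* Path assumes debugfs is mounted at /sys/kernel/debug */
	FILE *f = fopen("/sys/kernel/debug/x86/cpa_stats", "r");

	if (!f) {
		perror("cpa_stats");
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}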
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/Kconfig | 8 +++
arch/x86/mm/pageattr.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 105 insertions(+), 2 deletions(-)
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1495,6 +1495,14 @@ config X86_DIRECT_GBPAGES
supports them), so don't confuse the user by printing
that we have them enabled.
+config X86_CPA_STATISTICS
+ bool "Enable statistic for Change Page Attribute"
+ depends on DEBUG_FS
+ ---help---
+ Expose statistics about the Change Page Attribute mechanims, which
+ helps to determine the effectivness of preserving large and huge
+ page mappings when mapping protections are changed.
+
config ARCH_HAS_MEM_ENCRYPT
def_bool y
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -101,6 +101,95 @@ void arch_report_meminfo(struct seq_file
static inline void split_page_count(int level) { }
#endif
+#ifdef CONFIG_X86_CPA_STATISTICS
+
+static unsigned long cpa_1g_checked;
+static unsigned long cpa_1g_sameprot;
+static unsigned long cpa_1g_preserved;
+static unsigned long cpa_2m_checked;
+static unsigned long cpa_2m_sameprot;
+static unsigned long cpa_2m_preserved;
+static unsigned long cpa_4k_checked;
+static unsigned long cpa_4k_install;
+
+static inline void cpa_inc_1g_checked(void)
+{
+ cpa_1g_checked++;
+}
+
+static inline void cpa_inc_2m_checked(void)
+{
+ cpa_2m_checked++;
+}
+
+static inline void cpa_inc_4k_checked(void)
+{
+ cpa_4k_checked++;
+}
+
+static inline void cpa_inc_4k_install(void)
+{
+ cpa_4k_install++;
+}
+
+static inline void cpa_inc_lp_sameprot(int level)
+{
+ if (level == PG_LEVEL_1G)
+ cpa_1g_sameprot++;
+ else
+ cpa_2m_sameprot++;
+}
+
+static inline void cpa_inc_lp_preserved(int level)
+{
+ if (level == PG_LEVEL_1G)
+ cpa_1g_preserved++;
+ else
+ cpa_2m_preserved++;
+}
+
+static int cpastats_show(struct seq_file *m, void *p)
+{
+ seq_printf(m, "1G pages checked: %16lu\n", cpa_1g_checked);
+ seq_printf(m, "1G pages sameprot: %16lu\n", cpa_1g_sameprot);
+ seq_printf(m, "1G pages preserved: %16lu\n", cpa_1g_preserved);
+ seq_printf(m, "2M pages checked: %16lu\n", cpa_2m_checked);
+ seq_printf(m, "2M pages sameprot: %16lu\n", cpa_2m_sameprot);
+ seq_printf(m, "2M pages preserved: %16lu\n", cpa_2m_preserved);
+ seq_printf(m, "4K pages checked: %16lu\n", cpa_4k_checked);
+ seq_printf(m, "4K pages set-checked: %16lu\n", cpa_4k_install);
+ return 0;
+}
+
+static int cpastats_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, cpastats_show, NULL);
+}
+
+static const struct file_operations cpastats_fops = {
+ .open = cpastats_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static int __init cpa_stats_init(void)
+{
+ debugfs_create_file("cpa_stats", S_IRUSR, arch_debugfs_dir, NULL,
+ &cpastats_fops);
+ return 0;
+}
+late_initcall(cpa_stats_init);
+#else
+static inline void cpa_inc_1g_checked(void) { }
+static inline void cpa_inc_2m_checked(void) { }
+static inline void cpa_inc_4k_checked(void) { }
+static inline void cpa_inc_4k_install(void) { }
+static inline void cpa_inc_lp_sameprot(int level) { }
+static inline void cpa_inc_lp_preserved(int level) { }
+#endif
+
+
static inline int
within(unsigned long addr, unsigned long start, unsigned long end)
{
@@ -660,10 +749,12 @@ static int __should_split_large_page(pte
case PG_LEVEL_2M:
old_prot = pmd_pgprot(*(pmd_t *)kpte);
old_pfn = pmd_pfn(*(pmd_t *)kpte);
+ cpa_inc_2m_checked();
break;
case PG_LEVEL_1G:
old_prot = pud_pgprot(*(pud_t *)kpte);
old_pfn = pud_pfn(*(pud_t *)kpte);
+ cpa_inc_1g_checked();
break;
default:
return -EINVAL;
@@ -728,14 +819,16 @@ static int __should_split_large_page(pte
for (i = 0, addr = lpaddr; i < numpages; i++, addr += PAGE_SIZE, pfn++) {
pgprot_t chk_prot = static_protections(req_prot, addr, pfn, 1,
CPA_DETECT);
-
+ cpa_inc_4k_checked();
if (pgprot_val(chk_prot) != pgprot_val(new_prot))
return 1;
}
/* If there are no changes, return. */
- if (pgprot_val(new_prot) == pgprot_val(old_prot))
+ if (pgprot_val(new_prot) == pgprot_val(old_prot)) {
+ cpa_inc_lp_sameprot(level);
return 0;
+ }
/*
* Verify that the address is aligned and the number of pages
@@ -748,6 +841,7 @@ static int __should_split_large_page(pte
new_pte = pfn_pte(old_pfn, new_prot);
__set_pmd_pte(kpte, address, new_pte);
cpa->flags |= CPA_FLUSHTLB;
+ cpa_inc_lp_preserved(level);
return 0;
}
@@ -1337,6 +1431,7 @@ static int __change_page_attr(struct cpa
pgprot_val(new_prot) &= ~pgprot_val(cpa->mask_clr);
pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);
+ cpa_inc_4k_install();
new_prot = static_protections(new_prot, address, pfn, 1,
CPA_PROTECT);
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 06/10] x86/mm/cpa: Avoid static protection checks on unmap
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
` (4 preceding siblings ...)
2018-09-07 15:01 ` [patch 05/10] x86/mm/cpa: Add large page preservation statistics Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-07 15:01 ` [patch 07/10] x86/mm/cpa: Add sanity check for existing mappings Thomas Gleixner
` (3 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Avoid-static-protection-checks-on-unmap.patch --]
[-- Type: text/plain, Size: 1326 bytes --]
If the new pgprot has the PRESENT bit cleared, then conflicts vs. RW/NX are
completely irrelevant.
Before:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 540
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 800770
4K pages set-checked: 7668
After:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 540
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 800709
4K pages set-checked: 7668
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -522,6 +522,13 @@ static inline pgprot_t static_protection
pgprotval_t forbidden, res;
unsigned long end;
+ /*
+ * There is no point in checking RW/NX conflicts when the requested
+ * mapping is setting the page !PRESENT.
+ */
+ if (!(pgprot_val(prot) & _PAGE_PRESENT))
+ return prot;
+
/* Operate on the virtual address */
end = start + npg * PAGE_SIZE - 1;
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 07/10] x86/mm/cpa: Add sanity check for existing mappings
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
` (5 preceding siblings ...)
2018-09-07 15:01 ` [patch 06/10] x86/mm/cpa: Avoid static protection checks on unmap Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-07 15:01 ` [patch 08/10] x86/mm/cpa: Optimize same protection check Thomas Gleixner
` (2 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Add-sanity-check-for-existing-mappings.patch --]
[-- Type: text/plain, Size: 5446 bytes --]
With the range check it is possible to do a quick verification that the
current mapping is correct vs. the static protection areas.
In case an incorrect mapping is detected, a warning is emitted and the large
page is split up. If the large page is a 2M page, then the split code is
forced to check the static protections for the PTE entries to fix up the
incorrectness. For 1G pages this can't be done easily because that would
require finding the offending 2M areas either before or after the split. For
now just warn about that case and revisit it when reported.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 77 ++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 67 insertions(+), 10 deletions(-)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -37,12 +37,14 @@ struct cpa_data {
unsigned long numpages;
int flags;
unsigned long pfn;
- unsigned force_split : 1;
+ unsigned force_split : 1,
+ force_static_prot : 1;
int curpage;
struct page **pages;
};
enum cpa_warn {
+ CPA_CONFLICT,
CPA_PROTECT,
CPA_DETECT,
};
@@ -498,6 +500,7 @@ static inline void check_conflict(int wa
unsigned long pfn, const char *txt)
{
static const char *lvltxt[] = {
+ [CPA_CONFLICT] = "conflict",
[CPA_PROTECT] = "protect",
[CPA_DETECT] = "detect",
};
@@ -739,7 +742,7 @@ static int __should_split_large_page(pte
struct cpa_data *cpa)
{
unsigned long numpages, pmask, psize, lpaddr, addr, pfn, old_pfn;
- pgprot_t old_prot, new_prot, req_prot;
+ pgprot_t old_prot, new_prot, req_prot, chk_prot;
pte_t new_pte, old_pte, *tmp;
enum pg_level level;
int i;
@@ -816,6 +819,23 @@ static int __should_split_large_page(pte
numpages = psize >> PAGE_SHIFT;
/*
+ * Sanity check that the existing mapping is correct versus the static
+ * protections. static_protections() guards against !PRESENT, so no
+ * extra conditional required here.
+ */
+ chk_prot = static_protections(old_prot, lpaddr, old_pfn, numpages,
+ CPA_CONFLICT);
+
+ if (WARN_ON_ONCE(pgprot_val(chk_prot) != pgprot_val(old_prot))) {
+ /*
+ * Split the large page and tell the split code to
+ * enforce static protections.
+ */
+ cpa->force_static_prot = 1;
+ return 1;
+ }
+
+ /*
* Make sure that the requested pgprot does not violate the static
* protections. Check the full large page whether one of the pages
* in it results in a different pgprot than the first one of the
@@ -824,8 +844,8 @@ static int __should_split_large_page(pte
new_prot = static_protections(req_prot, address, pfn, 1, CPA_DETECT);
pfn = old_pfn;
for (i = 0, addr = lpaddr; i < numpages; i++, addr += PAGE_SIZE, pfn++) {
- pgprot_t chk_prot = static_protections(req_prot, addr, pfn, 1,
- CPA_DETECT);
+ chk_prot = static_protections(req_prot, addr, pfn, 1,
+ CPA_DETECT);
cpa_inc_4k_checked();
if (pgprot_val(chk_prot) != pgprot_val(new_prot))
return 1;
@@ -867,15 +887,50 @@ static int should_split_large_page(pte_t
return do_split;
}
+static void split_set_pte(struct cpa_data *cpa, pte_t *pte, unsigned long pfn,
+ pgprot_t ref_prot, unsigned long address,
+ unsigned long size)
+{
+ unsigned int npg = PFN_DOWN(size);
+ pgprot_t prot;
+
+ /*
+ * If try_preserve_large_page() discovered an inconsistent mapping,
+ * remove the invalid protection in the split mapping.
+ */
+ if (!cpa->force_static_prot)
+ goto set;
+
+ prot = static_protections(ref_prot, address, pfn, npg, CPA_PROTECT);
+
+ if (pgprot_val(prot) == pgprot_val(ref_prot))
+ goto set;
+
+ /*
+ * If this is splitting a PMD, fix it up. PUD splits cannot be
+ * fixed trivially as that would require to rescan the newly
+ * installed PMD mappings after returning from split_large_page()
+ * so an eventual further split can allocate the necessary PTE
+ * pages. Warn for now and revisit it in case this actually
+ * happens.
+ */
+ if (size == PAGE_SIZE)
+ ref_prot = prot;
+ else
+ pr_warn_once("CPA: Cannot fixup static protections for PUD split\n");
+set:
+ set_pte(pte, pfn_pte(pfn, ref_prot));
+}
+
static int
__split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
struct page *base)
{
+ unsigned long lpaddr, lpinc, ref_pfn, pfn, pfninc = 1;
pte_t *pbase = (pte_t *)page_address(base);
- unsigned long ref_pfn, pfn, pfninc = 1;
unsigned int i, level;
- pte_t *tmp;
pgprot_t ref_prot;
+ pte_t *tmp;
spin_lock(&pgd_lock);
/*
@@ -898,15 +953,17 @@ static int
* PAT bit to correct position.
*/
ref_prot = pgprot_large_2_4k(ref_prot);
-
ref_pfn = pmd_pfn(*(pmd_t *)kpte);
+ lpaddr = address & PMD_MASK;
+ lpinc = PAGE_SIZE;
break;
case PG_LEVEL_1G:
ref_prot = pud_pgprot(*(pud_t *)kpte);
ref_pfn = pud_pfn(*(pud_t *)kpte);
pfninc = PMD_PAGE_SIZE >> PAGE_SHIFT;
-
+ lpaddr = address & PUD_MASK;
+ lpinc = PMD_SIZE;
/*
* Clear the PSE flags if the PRESENT flag is not set
* otherwise pmd_present/pmd_huge will return true
@@ -927,8 +984,8 @@ static int
* Get the target pfn from the original entry:
*/
pfn = ref_pfn;
- for (i = 0; i < PTRS_PER_PTE; i++, pfn += pfninc)
- set_pte(&pbase[i], pfn_pte(pfn, ref_prot));
+ for (i = 0; i < PTRS_PER_PTE; i++, pfn += pfninc, lpaddr += lpinc)
+ split_set_pte(cpa, pbase + i, pfn, ref_prot, lpaddr, lpinc);
if (virt_addr_valid(address)) {
unsigned long pfn = PFN_DOWN(__pa(address));
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 08/10] x86/mm/cpa: Optimize same protection check
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
` (6 preceding siblings ...)
2018-09-07 15:01 ` [patch 07/10] x86/mm/cpa: Add sanity check for existing mappings Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-07 15:01 ` [patch 09/10] x86/mm/cpa: Do the range check early Thomas Gleixner
2018-09-07 15:01 ` [patch 10/10] x86/mm/cpa: Avoid the 4k pages check completely Thomas Gleixner
9 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Optimize-large-page-preservation-check.patch --]
[-- Type: text/plain, Size: 1959 bytes --]
When the existing mapping is correct and the new requested page protections
are the same as the existing ones, then further checks can be omitted and the
large page can be preserved. The slow-path 4k-wise check will not come up with
a different result.
Before:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 540
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 800709
4K pages set-checked: 7668
After:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 538
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 560642
4K pages set-checked: 7668
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -836,6 +836,20 @@ static int __should_split_large_page(pte
}
/*
+ * Optimization: If the requested pgprot is the same as the current
+ * pgprot, then the large page can be preserved and no updates are
+ * required independent of alignment and length of the requested
+ * range. The above already established that the current pgprot is
+ * correct, which in consequence makes the requested pgprot correct
+ * as well if it is the same. The static protection scan below will
+ * not come to a different conclusion.
+ */
+ if (pgprot_val(req_prot) == pgprot_val(old_prot)) {
+ cpa_inc_lp_sameprot(level);
+ return 0;
+ }
+
+ /*
* Make sure that the requested pgprot does not violate the static
* protections. Check the full large page whether one of the pages
* in it results in a different pgprot than the first one of the
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 09/10] x86/mm/cpa: Do the range check early
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
` (7 preceding siblings ...)
2018-09-07 15:01 ` [patch 08/10] x86/mm/cpa: Optimize same protection check Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
2018-09-07 15:01 ` [patch 10/10] x86/mm/cpa: Avoid the 4k pages check completely Thomas Gleixner
9 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Do-the-range-check-early.patch --]
[-- Type: text/plain, Size: 2931 bytes --]
To avoid excessive 4k-wise checks in the common case, do a quick check first
whether the requested new page protections conflict with a static
protection area in the large page. If there is no conflict then the
decision whether to preserve or to split the page can be made immediately.
If the requested range covers the full large page, preserve it. Otherwise
split it up. No point in doing a slow crawl in 4k steps.
Before:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 538
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 560642
4K pages set-checked: 7668
After:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 541
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 514
4K pages set-checked: 7668
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 27 +++++++++++++++++++++++----
1 file changed, 23 insertions(+), 4 deletions(-)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -850,10 +850,28 @@ static int __should_split_large_page(pte
}
/*
- * Make sure that the requested pgprot does not violate the static
- * protections. Check the full large page whether one of the pages
- * in it results in a different pgprot than the first one of the
- * requested range. If yes, then the page needs to be split.
+ * Optimization: Check whether the requested pgprot is conflicting
+ * with a static protection requirement in the large page. If not,
+ * then checking whether the requested range is fully covering the
+ * large page can be done right here.
+ */
+ new_prot = static_protections(req_prot, lpaddr, old_pfn, numpages,
+ CPA_DETECT);
+
+ if (pgprot_val(req_prot) == pgprot_val(new_prot)) {
+ if (address != lpaddr || cpa->numpages != numpages)
+ return 1;
+ goto setlp;
+ }
+
+ /*
+ * Slow path. The full large page check above established that the
+ * requested pgprot cannot be applied to the full large page due to
+ * conflicting requirements of static protection regions. It might
+ * turn out that the whole requested range is covered by the
+ * modified protection of the first 4k segment at @address. This
+ * might result in the ability to preserve the large page
+ * nevertheless.
*/
new_prot = static_protections(req_prot, address, pfn, 1, CPA_DETECT);
pfn = old_pfn;
@@ -878,6 +896,7 @@ static int __should_split_large_page(pte
if (address != lpaddr || cpa->numpages != numpages)
return 1;
+setlp:
/* All checks passed. Update the large page mapping. */
new_pte = pfn_pte(old_pfn, new_prot);
__set_pmd_pte(kpte, address, new_pte);
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 10/10] x86/mm/cpa: Avoid the 4k pages check completely
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
` (8 preceding siblings ...)
2018-09-07 15:01 ` [patch 09/10] x86/mm/cpa: Do the range check early Thomas Gleixner
@ 2018-09-07 15:01 ` Thomas Gleixner
9 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-07 15:01 UTC (permalink / raw)
To: LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Dave Hansen, Mark Gross
[-- Attachment #1: x86-mm-cpa--Avoid-the-4k-pages-check-completely.patch --]
[-- Type: text/plain, Size: 5632 bytes --]
The extra loop which tries hard to preserve large pages in case of conflicts
with static protection regions turns out not to preserve anything, at
least not in the experiments which have been conducted.
There might be corner cases in which the code would be able to preserve a
large page occasionally, but it's really not worth the extra code and the
cycles wasted in the common case.
Before:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 541
2M pages sameprot: 466
2M pages preserved: 47
4K pages checked: 514
4K pages set-checked: 7668
After:
1G pages checked: 2
1G pages sameprot: 0
1G pages preserved: 0
2M pages checked: 538
2M pages sameprot: 466
2M pages preserved: 47
4K pages set-checked: 7668
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/mm/pageattr.c | 66 ++++++++++++-------------------------------------
1 file changed, 17 insertions(+), 49 deletions(-)
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -111,7 +111,6 @@ static unsigned long cpa_1g_preserved;
static unsigned long cpa_2m_checked;
static unsigned long cpa_2m_sameprot;
static unsigned long cpa_2m_preserved;
-static unsigned long cpa_4k_checked;
static unsigned long cpa_4k_install;
static inline void cpa_inc_1g_checked(void)
@@ -124,11 +123,6 @@ static inline void cpa_inc_2m_checked(vo
cpa_2m_checked++;
}
-static inline void cpa_inc_4k_checked(void)
-{
- cpa_4k_checked++;
-}
-
static inline void cpa_inc_4k_install(void)
{
cpa_4k_install++;
@@ -158,7 +152,6 @@ static int cpastats_show(struct seq_file
seq_printf(m, "2M pages checked: %16lu\n", cpa_2m_checked);
seq_printf(m, "2M pages sameprot: %16lu\n", cpa_2m_sameprot);
seq_printf(m, "2M pages preserved: %16lu\n", cpa_2m_preserved);
- seq_printf(m, "4K pages checked: %16lu\n", cpa_4k_checked);
seq_printf(m, "4K pages set-checked: %16lu\n", cpa_4k_install);
return 0;
}
@@ -185,7 +178,6 @@ late_initcall(cpa_stats_init);
#else
static inline void cpa_inc_1g_checked(void) { }
static inline void cpa_inc_2m_checked(void) { }
-static inline void cpa_inc_4k_checked(void) { }
static inline void cpa_inc_4k_install(void) { }
static inline void cpa_inc_lp_sameprot(int level) { }
static inline void cpa_inc_lp_preserved(int level) { }
@@ -741,11 +733,10 @@ static pgprot_t pgprot_clear_protnone_bi
static int __should_split_large_page(pte_t *kpte, unsigned long address,
struct cpa_data *cpa)
{
- unsigned long numpages, pmask, psize, lpaddr, addr, pfn, old_pfn;
+ unsigned long numpages, pmask, psize, lpaddr, pfn, old_pfn;
pgprot_t old_prot, new_prot, req_prot, chk_prot;
pte_t new_pte, old_pte, *tmp;
enum pg_level level;
- int i;
/*
* Check for races, another CPU might have split this page
@@ -850,53 +841,30 @@ static int __should_split_large_page(pte
}
/*
- * Optimization: Check whether the requested pgprot is conflicting
- * with a static protection requirement in the large page. If not,
- * then checking whether the requested range is fully covering the
- * large page can be done right here.
+ * If the requested range does not cover the full page, split it up
*/
- new_prot = static_protections(req_prot, lpaddr, old_pfn, numpages,
- CPA_DETECT);
-
- if (pgprot_val(req_prot) == pgprot_val(new_prot)) {
- if (address != lpaddr || cpa->numpages != numpages)
- return 1;
- goto setlp;
- }
+ if (address != lpaddr || cpa->numpages != numpages)
+ return 1;
/*
- * Slow path. The full large page check above established that the
- * requested pgprot cannot be applied to the full large page due to
- * conflicting requirements of static protection regions. It might
- * turn out that the whole requested range is covered by the
- * modified protection of the first 4k segment at @address. This
- * might result in the ability to preserve the large page
- * nevertheless.
- */
- new_prot = static_protections(req_prot, address, pfn, 1, CPA_DETECT);
- pfn = old_pfn;
- for (i = 0, addr = lpaddr; i < numpages; i++, addr += PAGE_SIZE, pfn++) {
- chk_prot = static_protections(req_prot, addr, pfn, 1,
- CPA_DETECT);
- cpa_inc_4k_checked();
- if (pgprot_val(chk_prot) != pgprot_val(new_prot))
- return 1;
- }
-
- /* If there are no changes, return. */
- if (pgprot_val(new_prot) == pgprot_val(old_prot)) {
- cpa_inc_lp_sameprot(level);
- return 0;
- }
+ * Check whether the requested pgprot is conflicting with a static
+ * protection requirement in the large page.
+ */
+ new_prot = static_protections(req_prot, lpaddr, old_pfn, numpages,
+ CPA_DETECT);
/*
- * Verify that the address is aligned and the number of pages
- * covers the full page.
+ * If there is a conflict, split the large page.
+ *
+ * There used to be a 4k wise evaluation trying really hard to
+ * preserve the large pages, but experimentation has shown, that this
+ * does not help at all. There might be corner cases which would
+ * preserve one large page occasionally, but it's really not worth the
+ * extra code and cycles for the common case.
*/
- if (address != lpaddr || cpa->numpages != numpages)
+ if (pgprot_val(req_prot) != pgprot_val(new_prot))
return 1;
-setlp:
/* All checks passed. Update the large page mapping. */
new_pte = pfn_pte(old_pfn, new_prot);
__set_pmd_pte(kpte, address, new_pte);
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 02/10] x86/mm/cpa: Rework static_protections()
2018-09-07 15:01 ` [patch 02/10] x86/mm/cpa: Rework static_protections() Thomas Gleixner
@ 2018-09-07 16:21 ` Dave Hansen
0 siblings, 0 replies; 14+ messages in thread
From: Dave Hansen @ 2018-09-07 16:21 UTC (permalink / raw)
To: Thomas Gleixner, LKML; +Cc: x86, Peter Zijlstra, Bin Yang, Mark Gross
On 09/07/2018 08:01 AM, Thomas Gleixner wrote:
> +static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
> + unsigned long pfn)
> +{
> + pgprotval_t forbidden;
> +
> + /* Operate on the virtual address */
> + forbidden = protect_kernel_text(address);
> + forbidden |= protect_kernel_text_ro(address);
> +
> + /* Check the PFN directly */
> + forbidden |= protect_pci_bios(pfn);
> + forbidden |= protect_rodata(pfn);
>
> - return prot;
> + return __pgprot(pgprot_val(prot) & ~forbidden);
> }
Wow, that's quite an improvement, especially separating the vaddr vs.
paddr checks.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 01/10] x86/mm/cpa: Split, rename and clean up try_preserve_large_page()
2018-09-07 15:01 ` [patch 01/10] x86/mm/cpa: Split, rename and clean up try_preserve_large_page() Thomas Gleixner
@ 2018-09-10 2:03 ` Yang, Bin
2018-09-10 7:04 ` Thomas Gleixner
0 siblings, 1 reply; 14+ messages in thread
From: Yang, Bin @ 2018-09-10 2:03 UTC (permalink / raw)
To: tglx@linutronix.de, linux-kernel@vger.kernel.org
Cc: Gross, Mark, peterz@infradead.org, x86@kernel.org, Hansen, Dave
On Fri, 2018-09-07 at 17:01 +0200, Thomas Gleixner wrote:
> Avoid the extra variable and gotos by splitting the function into the
> actual algorithm and a callable function which contains the lock
> protection.
>
> Rename it to should_split_large_page() while at it so the return values make
> actually sense.
>
> Clean up the code flow, comments and general whitespace damage while at it. No
> functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> arch/x86/mm/pageattr.c | 121 ++++++++++++++++++++++++-------------------------
> 1 file changed, 60 insertions(+), 61 deletions(-)
>
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -421,18 +421,18 @@ pte_t *lookup_address_in_pgd(pgd_t *pgd,
> */
> pte_t *lookup_address(unsigned long address, unsigned int *level)
> {
> - return lookup_address_in_pgd(pgd_offset_k(address), address, level);
> + return lookup_address_in_pgd(pgd_offset_k(address), address, level);
> }
> EXPORT_SYMBOL_GPL(lookup_address);
>
> static pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
> unsigned int *level)
> {
> - if (cpa->pgd)
> + if (cpa->pgd)
> return lookup_address_in_pgd(cpa->pgd + pgd_index(address),
> address, level);
>
> - return lookup_address(address, level);
> + return lookup_address(address, level);
> }
>
> /*
> @@ -549,27 +549,22 @@ static pgprot_t pgprot_clear_protnone_bi
> return prot;
> }
>
> -static int
> -try_preserve_large_page(pte_t *kpte, unsigned long address,
> - struct cpa_data *cpa)
> +static int __should_split_large_page(pte_t *kpte, unsigned long address,
> + struct cpa_data *cpa)
> {
> - unsigned long nextpage_addr, numpages, pmask, psize, addr, pfn, old_pfn;
> - pte_t new_pte, old_pte, *tmp;
> + unsigned long numpages, pmask, psize, lpaddr, addr, pfn, old_pfn;
> pgprot_t old_prot, new_prot, req_prot;
> - int i, do_split = 1;
> + pte_t new_pte, old_pte, *tmp;
> enum pg_level level;
> + int i;
>
> - if (cpa->force_split)
> - return 1;
> -
> - spin_lock(&pgd_lock);
> /*
> * Check for races, another CPU might have split this page
> * up already:
> */
> tmp = _lookup_address_cpa(cpa, address, &level);
> if (tmp != kpte)
> - goto out_unlock;
> + return 1;
>
> switch (level) {
> case PG_LEVEL_2M:
> @@ -581,8 +576,7 @@ try_preserve_large_page(pte_t *kpte, uns
> old_pfn = pud_pfn(*(pud_t *)kpte);
> break;
> default:
> - do_split = -EINVAL;
> - goto out_unlock;
> + return -EINVAL;
> }
>
> psize = page_level_size(level);
> @@ -592,8 +586,8 @@ try_preserve_large_page(pte_t *kpte, uns
> * Calculate the number of pages, which fit into this large
> * page starting at address:
> */
> - nextpage_addr = (address + psize) & pmask;
> - numpages = (nextpage_addr - address) >> PAGE_SHIFT;
> + lpaddr = (address + psize) & pmask;
> + numpages = (lpaddr - address) >> PAGE_SHIFT;
> if (numpages < cpa->numpages)
> cpa->numpages = numpages;
>
> @@ -620,57 +614,62 @@ try_preserve_large_page(pte_t *kpte, uns
> pgprot_val(req_prot) |= _PAGE_PSE;
>
> /*
> - * old_pfn points to the large page base pfn. So we need
> - * to add the offset of the virtual address:
> + * old_pfn points to the large page base pfn. So we need to add the
> + * offset of the virtual address:
> */
> pfn = old_pfn + ((address & (psize - 1)) >> PAGE_SHIFT);
> cpa->pfn = pfn;
>
> - new_prot = static_protections(req_prot, address, pfn);
> + /*
> + * Calculate the large page base address and the number of 4K pages
> + * in the large page
> + */
> + lpaddr = address & pmask;
> + numpages = psize >> PAGE_SHIFT;
>
> /*
> - * We need to check the full range, whether
> - * static_protection() requires a different pgprot for one of
> - * the pages in the range we try to preserve:
> + * Make sure that the requested pgprot does not violate the static
> + * protections. Check the full large page whether one of the pages
> + * in it results in a different pgprot than the first one of the
> + * requested range. If yes, then the page needs to be split.
> */
> - addr = address & pmask;
> + new_prot = static_protections(req_prot, address, pfn, 1);
"npg" is introduced by patch #3. It might be better to keep old API in
this patch.
> pfn = old_pfn;
> - for (i = 0; i < (psize >> PAGE_SHIFT); i++, addr += PAGE_SIZE, pfn++) {
> + for (i = 0, addr = lpaddr; i < numpages; i++, addr += PAGE_SIZE, pfn++) {
> pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
>
> if (pgprot_val(chk_prot) != pgprot_val(new_prot))
> - goto out_unlock;
> + return 1;
> }
>
> - /*
> - * If there are no changes, return. maxpages has been updated
> - * above:
> - */
> - if (pgprot_val(new_prot) == pgprot_val(old_prot)) {
> - do_split = 0;
> - goto out_unlock;
> - }
> + /* If there are no changes, return. */
> + if (pgprot_val(new_prot) == pgprot_val(old_prot))
> + return 0;
>
> /*
> - * We need to change the attributes. Check, whether we can
> - * change the large page in one go. We request a split, when
> - * the address is not aligned and the number of pages is
> - * smaller than the number of pages in the large page. Note
> - * that we limited the number of possible pages already to
> - * the number of pages in the large page.
> + * Verify that the address is aligned and the number of pages
> + * covers the full page.
> */
> - if (address == (address & pmask) && cpa->numpages == (psize >> PAGE_SHIFT)) {
> - /*
> - * The address is aligned and the number of pages
> - * covers the full page.
> - */
> - new_pte = pfn_pte(old_pfn, new_prot);
> - __set_pmd_pte(kpte, address, new_pte);
> - cpa->flags |= CPA_FLUSHTLB;
> - do_split = 0;
> - }
> + if (address != lpaddr || cpa->numpages != numpages)
> + return 1;
> +
> + /* All checks passed. Update the large page mapping. */
> + new_pte = pfn_pte(old_pfn, new_prot);
> + __set_pmd_pte(kpte, address, new_pte);
> + cpa->flags |= CPA_FLUSHTLB;
> + return 0;
> +}
>
> -out_unlock:
> +static int should_split_large_page(pte_t *kpte, unsigned long address,
> + struct cpa_data *cpa)
> +{
> + int do_split;
> +
> + if (cpa->force_split)
> + return 1;
> +
> + spin_lock(&pgd_lock);
> + do_split = __should_split_large_page(kpte, address, cpa);
> spin_unlock(&pgd_lock);
>
> return do_split;
> @@ -1273,7 +1272,7 @@ static int __change_page_attr(struct cpa
> * Check, whether we can keep the large page intact
> * and just change the pte:
> */
> - do_split = try_preserve_large_page(kpte, address, cpa);
> + do_split = should_split_large_page(kpte, address, cpa);
> /*
> * When the range fits into the existing large page,
> * return. cp->numpages and cpa->tlbflush have been updated in
> @@ -1288,23 +1287,23 @@ static int __change_page_attr(struct cpa
> err = split_large_page(cpa, kpte, address);
> if (!err) {
> /*
> - * Do a global flush tlb after splitting the large page
> - * and before we do the actual change page attribute in the PTE.
> - *
> - * With out this, we violate the TLB application note, that says
> - * "The TLBs may contain both ordinary and large-page
> + * Do a global flush tlb after splitting the large page
> + * and before we do the actual change page attribute in the PTE.
> + *
> + * With out this, we violate the TLB application note, that says
> + * "The TLBs may contain both ordinary and large-page
> * translations for a 4-KByte range of linear addresses. This
> * may occur if software modifies the paging structures so that
> * the page size used for the address range changes. If the two
> * translations differ with respect to page frame or attributes
> * (e.g., permissions), processor behavior is undefined and may
> * be implementation-specific."
> - *
> - * We do this global tlb flush inside the cpa_lock, so that we
> + *
> + * We do this global tlb flush inside the cpa_lock, so that we
> * don't allow any other cpu, with stale tlb entries change the
> * page attribute in parallel, that also falls into the
> * just split large page entry.
> - */
> + */
> flush_tlb_all();
> goto repeat;
> }
>
>
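As an aside for readers of the archive, the locking split quoted above follows a
common refactoring pattern; a generic, self-contained sketch (a pthread mutex
standing in for pgd_lock, hypothetical function names) would be:

#include <pthread.h>

/* pthread mutex standing in for the kernel's pgd_lock spinlock */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Algorithm body: every exit path is a plain return, no goto out_unlock. */
static int __should_do_work(int arg)
{
	if (arg < 0)
		return -1;
	/* ... the actual checks go here ... */
	return 0;
}

/* Thin wrapper owns the lock, so the body above needs no unlock paths. */
static int should_do_work(int arg)
{
	int ret;

	pthread_mutex_lock(&lock);
	ret = __should_do_work(arg);
	pthread_mutex_unlock(&lock);

	return ret;
}

In the quoted patch the force_split early return likewise stays in the wrapper,
since it does not need pgd_lock at all.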
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 01/10] x86/mm/cpa: Split, rename and clean up try_preserve_large_page()
2018-09-10 2:03 ` Yang, Bin
@ 2018-09-10 7:04 ` Thomas Gleixner
0 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2018-09-10 7:04 UTC (permalink / raw)
To: Yang, Bin
Cc: linux-kernel@vger.kernel.org, Gross, Mark, peterz@infradead.org,
x86@kernel.org, Hansen, Dave
On Mon, 10 Sep 2018, Yang, Bin wrote:
Can you please trim your replies? It's a pain in the neck to find the
single line of information within a large pile of useless quoted text.
> On Fri, 2018-09-07 at 17:01 +0200, Thomas Gleixner wrote:
> > /*
> > - * We need to check the full range, whether
> > - * static_protection() requires a different pgprot for one of
> > - * the pages in the range we try to preserve:
> > + * Make sure that the requested pgprot does not violate the static
> > + * protections. Check the full large page whether one of the pages
> > + * in it results in a different pgprot than the first one of the
> > + * requested range. If yes, then the page needs to be split.
> > */
> > - addr = address & pmask;
> > + new_prot = static_protections(req_prot, address, pfn, 1);
>
> "npg" is introduced by patch #3. It might be better to keep old API in
> this patch.
Yes. That's bogus. Will fix.
Thanks,
tglx
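For the archive: the respin would presumably just keep the current three-argument
call in this patch and leave the range argument to patch #3, i.e. something along
these lines (a guess at v2, not posted code):

	new_prot = static_protections(req_prot, address, pfn);

	pfn = old_pfn;
	for (i = 0, addr = lpaddr; i < numpages; i++, addr += PAGE_SIZE, pfn++) {
		pgprot_t chk_prot = static_protections(req_prot, addr, pfn);

		if (pgprot_val(chk_prot) != pgprot_val(new_prot))
			return 1;
	}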
^ permalink raw reply [flat|nested] 14+ messages in thread
Thread overview: 14+ messages
2018-09-07 15:01 [patch 00/10] x86/mm/cpa: Improve large page preservation handling Thomas Gleixner
2018-09-07 15:01 ` [patch 01/10] x86/mm/cpa: Split, rename and clean up try_preserve_large_page() Thomas Gleixner
2018-09-10 2:03 ` Yang, Bin
2018-09-10 7:04 ` Thomas Gleixner
2018-09-07 15:01 ` [patch 02/10] x86/mm/cpa: Rework static_protections() Thomas Gleixner
2018-09-07 16:21 ` Dave Hansen
2018-09-07 15:01 ` [patch 03/10] x86/mm/cpa: Allow range check for static protections Thomas Gleixner
2018-09-07 15:01 ` [patch 04/10] x86/mm/cpa: Add debug mechanism Thomas Gleixner
2018-09-07 15:01 ` [patch 05/10] x86/mm/cpa: Add large page preservation statistics Thomas Gleixner
2018-09-07 15:01 ` [patch 06/10] x86/mm/cpa: Avoid static protection checks on unmap Thomas Gleixner
2018-09-07 15:01 ` [patch 07/10] x86/mm/cpa: Add sanity check for existing mappings Thomas Gleixner
2018-09-07 15:01 ` [patch 08/10] x86/mm/cpa: Optimize same protection check Thomas Gleixner
2018-09-07 15:01 ` [patch 09/10] x86/mm/cpa: Do the range check early Thomas Gleixner
2018-09-07 15:01 ` [patch 10/10] x86/mm/cpa: Avoid the 4k pages check completely Thomas Gleixner