* [PATCH 0/2] KPTI: get rid of cpu_entry_area mapping duplication
@ 2017-12-02 6:20 Andy Lutomirski
2017-12-02 6:20 ` [PATCH 1/2] Undo the split of setup_cpu_entry_area Andy Lutomirski
2017-12-02 6:20 ` [PATCH 2/2] x86/kpti: Reference all cpu_entry_area pagetables in the usermode tables Andy Lutomirski
0 siblings, 2 replies; 8+ messages in thread
From: Andy Lutomirski @ 2017-12-02 6:20 UTC (permalink / raw)
To: X86 ML
Cc: Borislav Petkov, linux-kernel@vger.kernel.org, Brian Gerst,
Dave Hansen, Linus Torvalds, Josh Poimboeuf, Andy Lutomirski
I like this variant much better. It might also fix the nasty bug tglx
and peterz were chasing.
Andy Lutomirski (2):
Undo the split of setup_cpu_entry_area
x86/kpti: Reference all cpu_entry_area pagetables in the usermode
tables
arch/x86/include/asm/fixmap.h | 14 +++++---
arch/x86/include/asm/kpti.h | 8 +++--
arch/x86/kernel/cpu/common.c | 3 --
arch/x86/kernel/smpboot.c | 2 ++
arch/x86/kernel/traps.c | 6 +++-
arch/x86/mm/kpti.c | 82 ++++++++++++++++++++++++++-----------------
6 files changed, 71 insertions(+), 44 deletions(-)
--
2.13.6
^ permalink raw reply [flat|nested] 8+ messages in thread* [PATCH 1/2] Undo the split of setup_cpu_entry_area 2017-12-02 6:20 [PATCH 0/2] KPTI: get rid of cpu_entry_area mapping duplication Andy Lutomirski @ 2017-12-02 6:20 ` Andy Lutomirski 2017-12-02 10:34 ` Thomas Gleixner 2017-12-02 6:20 ` [PATCH 2/2] x86/kpti: Reference all cpu_entry_area pagetables in the usermode tables Andy Lutomirski 1 sibling, 1 reply; 8+ messages in thread From: Andy Lutomirski @ 2017-12-02 6:20 UTC (permalink / raw) To: X86 ML Cc: Borislav Petkov, linux-kernel@vger.kernel.org, Brian Gerst, Dave Hansen, Linus Torvalds, Josh Poimboeuf, Andy Lutomirski This is obviously a hack. Either the patch should be adjusted back to the version I sent or trap_init should forcibly initialize all PMDs by something like __set_fixmap(..., __mkpte(0)); or however it's spelled. --- arch/x86/kernel/smpboot.c | 2 ++ arch/x86/kernel/traps.c | 6 +++++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 26317716559d..1325844b930a 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1232,8 +1232,10 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) * The boot CPU area has been set up in trap_init() * already. */ + /* if (i) setup_cpu_entry_area(i); + */ } /* diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index ff4e6b595ae4..0ad92f35a75b 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -945,8 +945,12 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code) void __init trap_init(void) { + int cpu; /* Init cpu_entry_area before IST entries are set up */ - setup_cpu_entry_area(smp_processor_id()); + for_each_possible_cpu(cpu) { + pr_err("setup_cpu_entry_area(%d)\n", cpu); + setup_cpu_entry_area(cpu); + } idt_setup_traps(); -- 2.13.6 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] Undo the split of setup_cpu_entry_area 2017-12-02 6:20 ` [PATCH 1/2] Undo the split of setup_cpu_entry_area Andy Lutomirski @ 2017-12-02 10:34 ` Thomas Gleixner 2017-12-02 10:47 ` Thomas Gleixner 0 siblings, 1 reply; 8+ messages in thread From: Thomas Gleixner @ 2017-12-02 10:34 UTC (permalink / raw) To: Andy Lutomirski Cc: X86 ML, Borislav Petkov, linux-kernel@vger.kernel.org, Brian Gerst, Dave Hansen, Linus Torvalds, Josh Poimboeuf On Fri, 1 Dec 2017, Andy Lutomirski wrote: > This is obviously a hack. Either the patch should be adjusted back to > the version I sent or trap_init should forcibly initialize all PMDs > by something like __set_fixmap(..., __mkpte(0)); or however it's spelled. I split it because the whole thing crashed when I kept the loop you had because it tried to allocate stuff. Had no time to figure out why, so I went the lazy way of making it "work". Simplifying the whole mess was on my list anyway. Thanks, tglx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] Undo the split of setup_cpu_entry_area 2017-12-02 10:34 ` Thomas Gleixner @ 2017-12-02 10:47 ` Thomas Gleixner 2017-12-02 13:34 ` Andy Lutomirski 0 siblings, 1 reply; 8+ messages in thread From: Thomas Gleixner @ 2017-12-02 10:47 UTC (permalink / raw) To: Andy Lutomirski Cc: X86 ML, Borislav Petkov, linux-kernel@vger.kernel.org, Brian Gerst, Dave Hansen, Linus Torvalds, Josh Poimboeuf On Sat, 2 Dec 2017, Thomas Gleixner wrote: > On Fri, 1 Dec 2017, Andy Lutomirski wrote: > > > This is obviously a hack. Either the patch should be adjusted back to > > the version I sent or trap_init should forcibly initialize all PMDs > > by something like __set_fixmap(..., __mkpte(0)); or however it's spelled. > > I split it because the whole thing crashed when I kept the loop you had > because it tried to allocate stuff. Had no time to figure out why, so I > went the lazy way of making it "work". [ 0.000000] dump_stack+0x85/0xc5 [ 0.000000] warn_alloc+0x114/0x1c0 [ 0.000000] __alloc_pages_slowpath+0x1089/0x10d0 [ 0.000000] __alloc_pages_nodemask+0x2e8/0x370 [ 0.000000] __get_free_pages+0x10/0x40 [ 0.000000] kpti_shadow_pagetable_walk+0x2b2/0x3e0 [ 0.000000] kpti_add_user_map+0xfe/0x330 [ 0.000000] kpti_add_mapping_cpu_entry+0x5a/0x100 [ 0.000000] trap_init+0x2c/0x7b [ 0.000000] start_kernel+0x24c/0x497 [ 0.000000] secondary_startup_64+0xa5/0xb0 Cute, isn't it? And then further down the line it triplefaults of course. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] Undo the split of setup_cpu_entry_area 2017-12-02 10:47 ` Thomas Gleixner @ 2017-12-02 13:34 ` Andy Lutomirski 2017-12-02 14:20 ` Thomas Gleixner 0 siblings, 1 reply; 8+ messages in thread From: Andy Lutomirski @ 2017-12-02 13:34 UTC (permalink / raw) To: Thomas Gleixner Cc: Andy Lutomirski, X86 ML, Borislav Petkov, linux-kernel@vger.kernel.org, Brian Gerst, Dave Hansen, Linus Torvalds, Josh Poimboeuf > On Dec 2, 2017, at 2:47 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > >> On Sat, 2 Dec 2017, Thomas Gleixner wrote: >> >>> On Fri, 1 Dec 2017, Andy Lutomirski wrote: >>> >>> This is obviously a hack. Either the patch should be adjusted back to >>> the version I sent or trap_init should forcibly initialize all PMDs >>> by something like __set_fixmap(..., __mkpte(0)); or however it's spelled. >> >> I split it because the whole thing crashed when I kept the loop you had >> because it tried to allocate stuff. Had no time to figure out why, so I >> went the lazy way of making it "work". > > [ 0.000000] dump_stack+0x85/0xc5 > [ 0.000000] warn_alloc+0x114/0x1c0 > [ 0.000000] __alloc_pages_slowpath+0x1089/0x10d0 > [ 0.000000] __alloc_pages_nodemask+0x2e8/0x370 > [ 0.000000] __get_free_pages+0x10/0x40 > [ 0.000000] kpti_shadow_pagetable_walk+0x2b2/0x3e0 > [ 0.000000] kpti_add_user_map+0xfe/0x330 > [ 0.000000] kpti_add_mapping_cpu_entry+0x5a/0x100 > [ 0.000000] trap_init+0x2c/0x7b > [ 0.000000] start_kernel+0x24c/0x497 > [ 0.000000] secondary_startup_64+0xa5/0xb0 > > Cute, isn't it? And then further down the line it triplefaults of course. Hmm. My second patch should make this go away, though. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] Undo the split of setup_cpu_entry_area 2017-12-02 13:34 ` Andy Lutomirski @ 2017-12-02 14:20 ` Thomas Gleixner 0 siblings, 0 replies; 8+ messages in thread From: Thomas Gleixner @ 2017-12-02 14:20 UTC (permalink / raw) To: Andy Lutomirski Cc: Andy Lutomirski, X86 ML, Borislav Petkov, linux-kernel@vger.kernel.org, Brian Gerst, Dave Hansen, Linus Torvalds, Josh Poimboeuf On Sat, 2 Dec 2017, Andy Lutomirski wrote: > > > > On Dec 2, 2017, at 2:47 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > > > >> On Sat, 2 Dec 2017, Thomas Gleixner wrote: > >> > >>> On Fri, 1 Dec 2017, Andy Lutomirski wrote: > >>> > >>> This is obviously a hack. Either the patch should be adjusted back to > >>> the version I sent or trap_init should forcibly initialize all PMDs > >>> by something like __set_fixmap(..., __mkpte(0)); or however it's spelled. > >> > >> I split it because the whole thing crashed when I kept the loop you had > >> because it tried to allocate stuff. Had no time to figure out why, so I > >> went the lazy way of making it "work". > > > > [ 0.000000] dump_stack+0x85/0xc5 > > [ 0.000000] warn_alloc+0x114/0x1c0 > > [ 0.000000] __alloc_pages_slowpath+0x1089/0x10d0 > > [ 0.000000] __alloc_pages_nodemask+0x2e8/0x370 > > [ 0.000000] __get_free_pages+0x10/0x40 > > [ 0.000000] kpti_shadow_pagetable_walk+0x2b2/0x3e0 > > [ 0.000000] kpti_add_user_map+0xfe/0x330 > > [ 0.000000] kpti_add_mapping_cpu_entry+0x5a/0x100 > > [ 0.000000] trap_init+0x2c/0x7b > > [ 0.000000] start_kernel+0x24c/0x497 > > [ 0.000000] secondary_startup_64+0xa5/0xb0 > > > > Cute, isn't it? And then further down the line it triplefaults of course. > > Hmm. My second patch should make this go away, though. Yes, it's all a mess and I'm cleaning it up. Fighting with the PEBS/BTS buffers right now. Thanks, tglx ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/2] x86/kpti: Reference all cpu_entry_area pagetables in the usermode tables 2017-12-02 6:20 [PATCH 0/2] KPTI: get rid of cpu_entry_area mapping duplication Andy Lutomirski 2017-12-02 6:20 ` [PATCH 1/2] Undo the split of setup_cpu_entry_area Andy Lutomirski @ 2017-12-02 6:20 ` Andy Lutomirski 2017-12-02 10:38 ` Thomas Gleixner 1 sibling, 1 reply; 8+ messages in thread From: Andy Lutomirski @ 2017-12-02 6:20 UTC (permalink / raw) To: X86 ML Cc: Borislav Petkov, linux-kernel@vger.kernel.org, Brian Gerst, Dave Hansen, Linus Torvalds, Josh Poimboeuf, Andy Lutomirski We were manually configuring cpu_entry_area in the usermode tables. This was error-prone and wasted memory. (Not much memory, but still.) Instead, just reference the same pagetables. This avoids needing to keep the KPTI code and the normal cpu_entry_area code in sync, since the KPTI code no longer cares what's in cpu_entry_area. [This does *not* work on the current KPTI series. It requires that all the kernelmode cpu_entry_tables are pre-allocated. That happens in the series as I submitted it, but tglx changed it for reasons that I haven't figured out.] Signed-off-by: Andy Lutomirski <luto@kernel.org> --- arch/x86/include/asm/fixmap.h | 14 +++++--- arch/x86/include/asm/kpti.h | 8 +++-- arch/x86/kernel/cpu/common.c | 3 -- arch/x86/mm/kpti.c | 82 ++++++++++++++++++++++++++----------------- 4 files changed, 64 insertions(+), 43 deletions(-) diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index 839addd1eaec..a630cd2861f7 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -142,16 +142,20 @@ enum fixed_addresses { #ifdef CONFIG_PARAVIRT FIX_PARAVIRT_BOOTMAP, #endif - FIX_TEXT_POKE1, /* reserve 2 pages for text_poke() */ - FIX_TEXT_POKE0, /* first page is last, because allocation is backward */ #ifdef CONFIG_X86_INTEL_MID FIX_LNW_VRTC, #endif - /* Fixmap entries to remap the GDTs, one per processor. */ - FIX_CPU_ENTRY_AREA_TOP, + FIX_TEXT_POKE1, /* reserve 2 pages for text_poke() */ + FIX_TEXT_POKE0, /* first page is last, because allocation is backward */ + + /* + * Fixmap entries to remap the GDTs, one per processor. Align + * to a PMD boundary. + */ + FIX_CPU_ENTRY_AREA_TOP = round_up(FIX_TEXT_POKE0 + 1, PTRS_PER_PMD), FIX_CPU_ENTRY_AREA_BOTTOM = FIX_CPU_ENTRY_AREA_TOP + (CPU_ENTRY_AREA_PAGES * NR_CPUS) - 1, - __end_of_permanent_fixed_addresses, + __end_of_permanent_fixed_addresses = round_up(FIX_CPU_ENTRY_AREA_BOTTOM + 1, PTRS_PER_PMD), /* * 512 temporary boot-time mappings, used by early_ioremap(), diff --git a/arch/x86/include/asm/kpti.h b/arch/x86/include/asm/kpti.h index 0c10e86ae3f8..df52cec2a53b 100644 --- a/arch/x86/include/asm/kpti.h +++ b/arch/x86/include/asm/kpti.h @@ -1,5 +1,8 @@ #ifndef _ASM_X86_KPTI_H #define _ASM_X86_KPTI_H + +#include <linux/init.h> + /* * Copyright(c) 2017 Intel Corporation. All rights reserved. * @@ -34,10 +37,9 @@ extern int kpti_add_mapping(unsigned long addr, unsigned long size, unsigned long flags); /** - * kpti_add_mapping_cpu_entry - map the cpu entry area - * @cpu: the CPU for which the entry area is being mapped + * kpti_clone_cpu_entry_areas - clone cpu_entry_areas to the usermode tables */ -extern void kpti_add_mapping_cpu_entry(int cpu); +extern void __init kpti_clone_cpu_entry_areas(void); /** * kpti_remove_mapping - remove a kernel mapping from the userpage tables diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 00697119f983..3dc814519c92 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -606,9 +606,6 @@ void __init setup_cpu_entry_area(int cpu) sizeof(struct debug_store) / PAGE_SIZE, PAGE_KERNEL); #endif - /* CPU 0's mapping is done in kpti_init() */ - if (cpu) - kpti_add_mapping_cpu_entry(cpu); } /* Load the original GDT from the per-cpu structure */ diff --git a/arch/x86/mm/kpti.c b/arch/x86/mm/kpti.c index 52fd833845ba..cd81a7432f49 100644 --- a/arch/x86/mm/kpti.c +++ b/arch/x86/mm/kpti.c @@ -240,7 +240,7 @@ static pmd_t *kpti_shadow_pagetable_walk_pmd(unsigned long address, * Returns a pointer to a PTE on success, or NULL on failure. */ static pte_t *kpti_shadow_pagetable_walk(unsigned long address, - unsigned long flags) + unsigned long flags) { pmd_t *pmd = kpti_shadow_pagetable_walk_pmd(address, flags); pte_t *pte; @@ -401,28 +401,55 @@ static void __init kpti_init_all_pgds(void) WARN_ON(__ret); \ } while (0) -void kpti_add_mapping_cpu_entry(int cpu) +void __init kpti_clone_cpu_entry_areas(void) { - kpti_add_user_map_early(get_cpu_gdt_ro(cpu), PAGE_SIZE, - __PAGE_KERNEL_RO); - - kpti_add_user_map_early(&get_cpu_entry_area(cpu)->tss, - sizeof(get_cpu_entry_area(cpu)->tss), - __PAGE_KERNEL | _PAGE_GLOBAL); - - /* entry stack */ - kpti_add_user_map_early(&get_cpu_entry_area(cpu)->SYSENTER_stack_page, - sizeof(get_cpu_entry_area(cpu)->SYSENTER_stack_page), - __PAGE_KERNEL | _PAGE_GLOBAL); - - /* Entry code, so needs to be EXEC */ - kpti_add_user_map_early(&get_cpu_entry_area(cpu)->entry_trampoline, - sizeof(get_cpu_entry_area(cpu)->entry_trampoline), - __PAGE_KERNEL_RX | _PAGE_GLOBAL); - - kpti_add_user_map_early(&get_cpu_entry_area(cpu)->exception_stacks, - sizeof(get_cpu_entry_area(cpu)->exception_stacks), - __PAGE_KERNEL | _PAGE_GLOBAL); + int cpu; + unsigned long last_pmd_addr = 0; + + /* The top of the cpu_entry_area block is meant to be PMD-aligned. */ + WARN_ON((unsigned long)(get_cpu_entry_area(NR_CPUS-1) + 1) & ~PMD_MASK); + + /* + * Iterate over possible CPUs, not addresses: it's possible that + * NR_CPUs is enough larger than the actual number of possible CPUs + * that we have unpopulated PMDs in the cpu_entry_area range. + */ + for_each_possible_cpu(cpu) { + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd, *target_pmd; + unsigned long addr = + (unsigned long)get_cpu_entry_area(cpu) & PMD_MASK; + + if (addr == last_pmd_addr) + continue; + last_pmd_addr = addr; + + pgd = pgd_offset_k(addr); + if (WARN_ON(pgd_none(*pgd))) + return; + p4d = p4d_offset(pgd, addr); + if (WARN_ON(p4d_none(*p4d))) + return; + pud = pud_offset(p4d, addr); + if (WARN_ON(pud_none(*pud))) + return; + pmd = pmd_offset(pud, addr); + if (WARN_ON(pmd_none(*pmd))) + return; + + target_pmd = kpti_shadow_pagetable_walk_pmd(addr, 0); + if (WARN_ON(!target_pmd)) + return; + + /* + * Copy the PMD. That is, the kernelmode and usermode tables + * will share all last-level page tables containing + * cpu_entry_area mappings. + */ + *target_pmd = *pmd; + } } /* @@ -459,16 +486,7 @@ void __init kpti_init(void) sizeof(gate_desc) * NR_VECTORS, __PAGE_KERNEL_RO | _PAGE_GLOBAL); - /* - * We delay CPU 0's mappings because these structures are created - * before the page allocator is up. Deferring it until here lets - * us use the plain page allocator unconditionally in the page - * table code above. - * - * This is OK because kpti_init() is called long before we ever run - * userspace and need the KERNEL_PAGE_TABLE_ISOLATION mappings. - */ - kpti_add_mapping_cpu_entry(0); + kpti_clone_cpu_entry_areas(); } int kpti_add_mapping(unsigned long addr, unsigned long size, -- 2.13.6 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] x86/kpti: Reference all cpu_entry_area pagetables in the usermode tables 2017-12-02 6:20 ` [PATCH 2/2] x86/kpti: Reference all cpu_entry_area pagetables in the usermode tables Andy Lutomirski @ 2017-12-02 10:38 ` Thomas Gleixner 0 siblings, 0 replies; 8+ messages in thread From: Thomas Gleixner @ 2017-12-02 10:38 UTC (permalink / raw) To: Andy Lutomirski Cc: X86 ML, Borislav Petkov, linux-kernel@vger.kernel.org, Brian Gerst, Dave Hansen, Linus Torvalds, Josh Poimboeuf On Fri, 1 Dec 2017, Andy Lutomirski wrote: > We were manually configuring cpu_entry_area in the usermode tables. > This was error-prone and wasted memory. (Not much memory, but > still.) Instead, just reference the same pagetables. > > This avoids needing to keep the KPTI code and the normal > cpu_entry_area code in sync, since the KPTI code no longer cares > what's in cpu_entry_area. > > [This does *not* work on the current KPTI series. It requires that > all the kernelmode cpu_entry_tables are pre-allocated. That > happens in the series as I submitted it, but tglx changed it for > reasons that I haven't figured out.] As I said it crashed and burned for yet unknown reasons. I'll dig into that. Thanks, tglx ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-12-02 14:20 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-12-02 6:20 [PATCH 0/2] KPTI: get rid of cpu_entry_area mapping duplication Andy Lutomirski 2017-12-02 6:20 ` [PATCH 1/2] Undo the split of setup_cpu_entry_area Andy Lutomirski 2017-12-02 10:34 ` Thomas Gleixner 2017-12-02 10:47 ` Thomas Gleixner 2017-12-02 13:34 ` Andy Lutomirski 2017-12-02 14:20 ` Thomas Gleixner 2017-12-02 6:20 ` [PATCH 2/2] x86/kpti: Reference all cpu_entry_area pagetables in the usermode tables Andy Lutomirski 2017-12-02 10:38 ` Thomas Gleixner
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.