* [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
@ 2019-04-08 19:11 Singh, Brijesh
2019-04-09 8:40 ` Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Singh, Brijesh @ 2019-04-08 19:11 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Singh, Brijesh, Peter Zijlstra, Dave Hansen, Dan Williams,
Kirill A . Shutemov, Andy Lutomirski, Borislav Petkov,
H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas
The following commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
PTE population") triggers the below warning in the SEV guest.
WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
Call Trace:
kernel_physical_mapping_init+0xce/0x259
early_set_memory_enc_dec+0x10f/0x160
kvm_smp_prepare_boot_cpu+0x71/0x9d
start_kernel+0x1c9/0x50b
secondary_startup_64+0xa4/0xb0
The SEV guest calls kernel_physical_mapping_init() to clear the encryption
mask from an existing mapping. While clearing the encryption mask,
kernel_physical_mapping_init() splits the large pages into smaller ones.
To split a page, kernel_physical_mapping_init() allocates a new page table
page and updates the existing entry. The set_{pud,pmd}_safe() variants
trigger a warning when updating an entry that is already present. We
should use set_{pud,pmd}() when replacing an existing entry with a new one.
Updating an entry also requires a TLB flush. Currently the caller
(early_set_memory_enc_dec()) takes care of issuing the TLB flushes.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Fixes: 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init() ...")
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
---
arch/x86/mm/init_64.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index bccff68e3267..0a26b64a99b9 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -536,7 +536,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
paddr_last = phys_pte_init(pte, paddr, paddr_end, new_prot);
spin_lock(&init_mm.page_table_lock);
- pmd_populate_kernel_safe(&init_mm, pmd, pte);
+ pmd_populate_kernel(&init_mm, pmd, pte);
spin_unlock(&init_mm.page_table_lock);
}
update_page_count(PG_LEVEL_2M, pages);
@@ -623,7 +623,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
page_size_mask, prot);
spin_lock(&init_mm.page_table_lock);
- pud_populate_safe(&init_mm, pud, pmd);
+ pud_populate(&init_mm, pud, pmd);
spin_unlock(&init_mm.page_table_lock);
}
--
2.17.1
^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
  2019-04-08 19:11 [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page Singh, Brijesh
@ 2019-04-09  8:40 ` Peter Zijlstra
  2019-04-09  9:39   ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2019-04-09 8:40 UTC (permalink / raw)
To: Singh, Brijesh
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Dave Hansen,
	Dan Williams, Kirill A . Shutemov, Andy Lutomirski,
	Borislav Petkov, H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas

On Mon, Apr 08, 2019 at 07:11:21PM +0000, Singh, Brijesh wrote:
> The following commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
> PTE population") triggers the below warning in the SEV guest.
>
> WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
> Call Trace:
>  kernel_physical_mapping_init+0xce/0x259
>  early_set_memory_enc_dec+0x10f/0x160
>  kvm_smp_prepare_boot_cpu+0x71/0x9d
>  start_kernel+0x1c9/0x50b
>  secondary_startup_64+0xa4/0xb0
>
> The SEV guest calls kernel_physical_mapping_init() to clear the encryption
> mask from an existing mapping. While clearing the encryption mask
> kernel_physical_mapping_init() splits the large pages into the smaller.
> To split the page, the kernel_physical_mapping_init() allocates a new page
> and updates the existing entry. The set_{pud,pmd}_safe triggers warning
> when updating the entry with page in the present state. We should use the
> set_{pud,pmd} when updating an existing entry with the new entry.
>
> Updating an entry will also requires a TLB flush. Currently the caller
> (early_set_memory_enc_dec()) is taking care of issuing the TLB flushes.

I'm not entirely sure I like this, this means all users of
kernel_physical_mapping_init() now need to be aware and careful.
That said; the alternative is adding an argument to the function and
propagating it through the callchain and dynamically switching between
_safe and not. Which doesn't sound ideal either.

Anybody else got clever ideas?

> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Fixes: 0a9fe8ca844d (x86/mm: Validate kernel_physical_mapping_init() ...)
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
> ---
>  arch/x86/mm/init_64.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index bccff68e3267..0a26b64a99b9 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -536,7 +536,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
>  		paddr_last = phys_pte_init(pte, paddr, paddr_end, new_prot);
>
>  		spin_lock(&init_mm.page_table_lock);
> -		pmd_populate_kernel_safe(&init_mm, pmd, pte);
> +		pmd_populate_kernel(&init_mm, pmd, pte);
>  		spin_unlock(&init_mm.page_table_lock);
>  	}
>  	update_page_count(PG_LEVEL_2M, pages);
> @@ -623,7 +623,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
>  					page_size_mask, prot);
>
>  		spin_lock(&init_mm.page_table_lock);
> -		pud_populate_safe(&init_mm, pud, pmd);
> +		pud_populate(&init_mm, pud, pmd);
>  		spin_unlock(&init_mm.page_table_lock);
>  	}
>
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
  2019-04-09  8:40 ` Peter Zijlstra
@ 2019-04-09  9:39   ` Peter Zijlstra
  2019-04-09 10:20     ` Borislav Petkov
  2019-04-09 19:09     ` Singh, Brijesh
  0 siblings, 2 replies; 5+ messages in thread
From: Peter Zijlstra @ 2019-04-09 9:39 UTC (permalink / raw)
To: Singh, Brijesh
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Dave Hansen,
	Dan Williams, Kirill A . Shutemov, Andy Lutomirski,
	Borislav Petkov, H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas

On Tue, Apr 09, 2019 at 10:40:31AM +0200, Peter Zijlstra wrote:
> On Mon, Apr 08, 2019 at 07:11:21PM +0000, Singh, Brijesh wrote:
> > The following commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
> > PTE population") triggers the below warning in the SEV guest.
> >
> > WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
> > Call Trace:
> >  kernel_physical_mapping_init+0xce/0x259
> >  early_set_memory_enc_dec+0x10f/0x160
> >  kvm_smp_prepare_boot_cpu+0x71/0x9d
> >  start_kernel+0x1c9/0x50b
> >  secondary_startup_64+0xa4/0xb0
> >
> > The SEV guest calls kernel_physical_mapping_init() to clear the encryption
> > mask from an existing mapping. While clearing the encryption mask
> > kernel_physical_mapping_init() splits the large pages into the smaller.
> > To split the page, the kernel_physical_mapping_init() allocates a new page
> > and updates the existing entry. The set_{pud,pmd}_safe triggers warning
> > when updating the entry with page in the present state. We should use the
> > set_{pud,pmd} when updating an existing entry with the new entry.
> >
> > Updating an entry will also requires a TLB flush. Currently the caller
> > (early_set_memory_enc_dec()) is taking care of issuing the TLB flushes.
>
> I'm not entirely sure I like this, this means all users of
> kernel_physical_mapping_init() now need to be aware and careful.
>
> That said; the alternative is adding an argument to the function and
> propagating it through the callchain and dynamically switching between
> _safe and not. Which doesn't sound ideal either.
>
> Anybody else got clever ideas?

The more I think about it, I think that is in fact the right thing to
do.

Rename kernel_physical_mapping_init() to __kernel_physical_mapping_init()
and add that flag, thread it down to all the set_{pud,pmd,pte}()
thingies. Then add:

unsigned long kernel_physical_mapping_init(unsigned long paddr_start,
					   unsigned long paddr_end,
					   unsigned long page_size_mask)
{
	return __kernel_physical_mapping_init(paddr_start, paddr_end,
					      page_size_mask, true);
}

unsigned long kernel_physical_mapping_change(unsigned long paddr_start,
					     unsigned long paddr_end,
					     unsigned long page_size_mask)
{
	unsigned long last;

	last = __kernel_physical_mapping_init(paddr_start, paddr_end,
					      page_size_mask, false);

	__flush_tlb_all();

	return last;
}

Or something along those lines.

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
  2019-04-09  9:39 ` Peter Zijlstra
@ 2019-04-09 10:20   ` Borislav Petkov
  2019-04-09 19:09   ` Singh, Brijesh
  1 sibling, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2019-04-09 10:20 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Singh, Brijesh, linux-kernel@vger.kernel.org, x86@kernel.org,
	Dave Hansen, Dan Williams, Kirill A . Shutemov, Andy Lutomirski,
	H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas

On Tue, Apr 09, 2019 at 11:39:35AM +0200, Peter Zijlstra wrote:
> unsigned long kernel_physical_mapping_change(unsigned long paddr_start,
>					      unsigned long paddr_end,
>					      unsigned long page_size_mask)

... and add a comment above it what the "_change" thing is supposed to
mean...

> 	unsigned long last;
>
> 	last = __kernel_physical_mapping_init(paddr_start, paddr_end,
>					      page_size_mask, false);
>
> 	__flush_tlb_all();

... and maybe not do the flushing here because if you look at
early_set_memory_enc_dec(), it iterates over a bunch of addresses and
when it is done, does the TLB flush once.

If you did it here, then you'd flush after each change. Which is costly.

So maybe the comment above should also say that _change callers are
responsible to flush the TLB after they're done.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page
  2019-04-09  9:39 ` Peter Zijlstra
  2019-04-09 10:20   ` Borislav Petkov
@ 2019-04-09 19:09   ` Singh, Brijesh
  1 sibling, 0 replies; 5+ messages in thread
From: Singh, Brijesh @ 2019-04-09 19:09 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Singh, Brijesh, linux-kernel@vger.kernel.org, x86@kernel.org,
	Dave Hansen, Dan Williams, Kirill A . Shutemov, Andy Lutomirski,
	Borislav Petkov, H . Peter Anvin, Thomas Gleixner, Lendacky, Thomas

On 4/9/19 4:39 AM, Peter Zijlstra wrote:
> On Tue, Apr 09, 2019 at 10:40:31AM +0200, Peter Zijlstra wrote:
>> On Mon, Apr 08, 2019 at 07:11:21PM +0000, Singh, Brijesh wrote:
>>> The following commit 0a9fe8ca844d ("x86/mm: Validate kernel_physical_mapping_init()
>>> PTE population") triggers the below warning in the SEV guest.
>>>
>>> WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgalloc.h:87 phys_pmd_init+0x30d/0x386
>>> Call Trace:
>>>  kernel_physical_mapping_init+0xce/0x259
>>>  early_set_memory_enc_dec+0x10f/0x160
>>>  kvm_smp_prepare_boot_cpu+0x71/0x9d
>>>  start_kernel+0x1c9/0x50b
>>>  secondary_startup_64+0xa4/0xb0
>>>
>>> The SEV guest calls kernel_physical_mapping_init() to clear the encryption
>>> mask from an existing mapping. While clearing the encryption mask
>>> kernel_physical_mapping_init() splits the large pages into the smaller.
>>> To split the page, the kernel_physical_mapping_init() allocates a new page
>>> and updates the existing entry. The set_{pud,pmd}_safe triggers warning
>>> when updating the entry with page in the present state. We should use the
>>> set_{pud,pmd} when updating an existing entry with the new entry.
>>>
>>> Updating an entry will also requires a TLB flush. Currently the caller
>>> (early_set_memory_enc_dec()) is taking care of issuing the TLB flushes.
>>
>> I'm not entirely sure I like this, this means all users of
>> kernel_physical_mapping_init() now need to be aware and careful.
>>
>> That said; the alternative is adding an argument to the function and
>> propagating it through the callchain and dynamically switching between
>> _safe and not. Which doesn't sound ideal either.
>>
>> Anybody else got clever ideas?
>
> The more I think about it, I think that is in fact the right thing to
> do.
>
> Rename kernel_physical_mapping_init() to __& and add that flag, thread
> it down to all the set_{pud,pmd,pte}() thingies. Then add:
>
> unsigned long kernel_physical_mapping_init(unsigned long paddr_start,
>					    unsigned long paddr_end,
>					    unsigned long page_size_mask)
> {
>	return __kernel_physical_mapping_init(paddr_start, paddr_end,
>					      page_size_mask, true);
> }
>
> unsigned long kernel_physical_mapping_change(unsigned long paddr_start,
>					     unsigned long paddr_end,
>					     unsigned long page_size_mask)
> {
>	unsigned long last;
>
>	last = __kernel_physical_mapping_init(paddr_start, paddr_end,
>					      page_size_mask, false);
>
>	__flush_tlb_all();
>
>	return last;
> }
>
> Or something along those lines.

thanks for the comment. So regarding changing all the set_{pud,pmd,pte}
thingies, I can go in one of the two options below, please let me know
which is preferred.

1) Update all the functions to check for the flag and use the safe vs
non-safe variants accordingly. Something similar to this:

	if (use_safe)
		set_pmd_safe(...)
	else
		set_pmd(..)
	....
	if (use_safe)
		pmd_populate_kernel_safe(...)
	else
		pmd_populate(...)

There are multiple places where we need to add the above check.

2) Define new __-prefixed variants of set_{pud,pmd,pte} and
populate_{pud,..} and update the callers to use them.
Something like this:

#define DEFINE_POPULATE(fname, type1, type2, safe)		\
static inline void __##fname(struct mm_struct *mm,		\
		type1##_t *arg1, type2##_t *arg2, bool safe)	\
{								\
	if (safe)						\
		fname##_safe(mm, arg1, arg2);			\
	else							\
		fname(mm, arg1, arg2);				\
}

DEFINE_POPULATE(p4d_populate, p4d, pud, safe)
DEFINE_POPULATE(pgd_populate, pgd, p4d, safe)
DEFINE_POPULATE(pud_populate, pud, pmd, safe)
DEFINE_POPULATE(pmd_populate_kernel, pmd, pte, safe)

I prefer #2 because it still keeps the code readable and minimizes the
changes in the patch file.

thanks

^ permalink raw reply	[flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-04-09 19:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-08 19:11 [PATCH] x86: mm: Do not use set_{pud,pmd}_safe when splitting the large page Singh, Brijesh
2019-04-09 8:40 ` Peter Zijlstra
2019-04-09 9:39 ` Peter Zijlstra
2019-04-09 10:20 ` Borislav Petkov
2019-04-09 19:09 ` Singh, Brijesh