From mboxrd@z Thu Jan 1 00:00:00 1970
From: mark.rutland@arm.com (Mark Rutland)
Date: Thu, 10 Dec 2015 15:51:11 +0000
Subject: [RFC PATCH 06/20] arm64: mm: place empty_zero_page in bss
In-Reply-To: <56699CD8.2010009@arm.com>
References: <1449665095-20774-1-git-send-email-mark.rutland@arm.com>
 <1449665095-20774-7-git-send-email-mark.rutland@arm.com>
 <20151210141107.GF21134@arm.com>
 <20151210152957.GD495@leverpostej>
 <56699CD8.2010009@arm.com>
Message-ID: <20151210155110.GK495@leverpostej>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Thu, Dec 10, 2015 at 03:40:08PM +0000, Marc Zyngier wrote:
> On 10/12/15 15:29, Mark Rutland wrote:
> > On Thu, Dec 10, 2015 at 02:11:08PM +0000, Will Deacon wrote:
> >> On Wed, Dec 09, 2015 at 12:44:41PM +0000, Mark Rutland wrote:
> >>> Currently the zero page is set up in paging_init, and thus we cannot use
> >>> the zero page earlier. We use the zero page as a reserved TTBR value
> >>> from which no TLB entries may be allocated (e.g. when uninstalling the
> >>> idmap). To enable such usage earlier (as may be required for invasive
> >>> changes to the kernel page tables), and to minimise the time that the
> >>> idmap is active, we need to be able to use the zero page before
> >>> paging_init.
> >>>
> >>> This patch follows the example set by x86, by allocating the zero page
> >>> at compile time, in .bss. This means that the zero page itself is
> >>> available immediately upon entry to start_kernel (as we zero .bss before
> >>> this), and also means that the zero page takes up no space in the raw
> >>> Image binary. The associated struct page is allocated in bootmem_init,
> >>> and remains unavailable until this time.
> >>>
> >>> Outside of arch code, the only users of empty_zero_page assume that the
> >>> empty_zero_page symbol refers to the zeroed memory itself, and that
> >>> ZERO_PAGE(x) must be used to acquire the associated struct page,
> >>> following the example of x86.
> >>> This patch also brings arm64 in line with
> >>> these assumptions.
> >>>
> >>> Signed-off-by: Mark Rutland
> >>> Cc: Ard Biesheuvel
> >>> Cc: Catalin Marinas
> >>> Cc: Jeremy Linton
> >>> Cc: Laura Abbott
> >>> Cc: Will Deacon
> >>> ---
> >>>  arch/arm64/include/asm/mmu_context.h | 2 +-
> >>>  arch/arm64/include/asm/pgtable.h     | 4 ++--
> >>>  arch/arm64/mm/mmu.c                  | 9 +--------
> >>>  3 files changed, 4 insertions(+), 11 deletions(-)
> >>
> >> [...]
> >>
> >>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> >>> index 304ff23..7559c22 100644
> >>> --- a/arch/arm64/mm/mmu.c
> >>> +++ b/arch/arm64/mm/mmu.c
> >>> @@ -48,7 +48,7 @@ u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
> >>>   * Empty_zero_page is a special page that is used for zero-initialized data
> >>>   * and COW.
> >>>   */
> >>> -struct page *empty_zero_page;
> >>> +unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
> >>>  EXPORT_SYMBOL(empty_zero_page);
> >>
> >> I've been looking at this, and it was making me feel uneasy because it's
> >> full of junk before the bss is zeroed. Working that through, it's no
> >> worse than what we currently have but I then realised that (a) we don't
> >> have a dsb after zeroing the zero page (which we need to make sure the
> >> zeroes are visible to the page table walker) and (b) the zero page is
> >> never explicitly cleaned to the PoC.
> >
> > Ouch; that's scary.
> >
> >> There may be cases where the zero-page is used to back read-only,
> >> non-cacheable mappings (something to do with KVM?), so I'd sleep better
> >> if we made sure that it was clean.
> >
> > From a grep around for uses of ZERO_PAGE, in most places the zero page
> > is simply used as an empty buffer for I/O. In these cases it's either
> > accessed coherently or the usual machinery for non-coherent DMA
> > kicks in.
> >
> > I don't believe that we usually give userspace the ability to create
> > non-cacheable mappings, and I couldn't spot any paths it could do so via
> > some driver-specific IOCTL applied to the zero page.
> >
> > Looking around, kvm_clear_guest_page seemed problematic, but isn't used
> > on arm64. I can imagine the zero page being mapped into guests in other
> > situations when mirroring the userspace mapping.
> >
> > Marc, Christoffer, I thought we cleaned pages to the PoC before mapping
> > them into a guest? Is that right? Or do we have potential issues there?
>
> I think we're OK. Looking at __coherent_cache_guest_page (which is
> called when transitioning from an invalid to valid mapping), we do flush
> things to PoC if the vcpu has its cache disabled (or if we know that the
> IPA shouldn't be cached - the whole NOR flash emulation horror story).

So we assume the guest never disables the MMU, and always uses consistent
attributes for a given IPA (e.g. it doesn't have a Device and Normal
Cacheable mapping)?

> Does it answer your question?

I think so. If those assumptions are true then I agree we're ok. If
those aren't we have other problems.

Thanks,
Mark.
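
For anyone following along, the ".bss placement" idea from the patch can be
tried in userspace. The following is only a rough sketch under stated
assumptions, not the kernel code: SKETCH_PAGE_SIZE, sketch_zero_page and both
helper functions are hypothetical names, the page size is assumed to be 4K,
and the GCC/Clang aligned attribute stands in for __page_aligned_bss. It
shows only that an uninitialised static array is page aligned and reads as
zero; it does not model the dsb or the clean to the PoC discussed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this userspace sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Static storage with no initializer, so the array is placed in .bss: it
 * takes no space in the binary image and reads as zero once .bss has been
 * cleared. The aligned attribute (a GCC/Clang extension) is a stand-in for
 * the kernel's __page_aligned_bss.
 */
static unsigned long sketch_zero_page[SKETCH_PAGE_SIZE / sizeof(unsigned long)]
	__attribute__((aligned(SKETCH_PAGE_SIZE)));

/* Returns true if the buffer starts on a page boundary. */
bool sketch_zero_page_is_aligned(void)
{
	return ((uintptr_t)sketch_zero_page % SKETCH_PAGE_SIZE) == 0;
}

/* Returns true if every word of the page reads as zero. */
bool sketch_zero_page_is_zero(void)
{
	for (size_t i = 0; i < SKETCH_PAGE_SIZE / sizeof(unsigned long); i++) {
		if (sketch_zero_page[i] != 0)
			return false;
	}
	return true;
}
```

In a hosted program both helpers should return true from the first
instruction of main(), which loosely mirrors the zero page being usable on
entry to start_kernel once .bss has been zeroed.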