From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linuxfoundation.org ([140.211.169.12]:39808 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752234AbdGCIKg (ORCPT ); Mon, 3 Jul 2017 04:10:36 -0400 Subject: Patch "x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds()" has been added to the 4.9-stable tree To: bhe@redhat.com, akpm@linux-foundation.org, bp@alien8.de, brgerst@gmail.com, dan.j.williams@intel.com, dave.hansen@linux.intel.com, dvlasenk@redhat.com, dyoung@redhat.com, gregkh@linuxfoundation.org, hpa@zytor.com, jinb.park7@gmail.com, jmoyer@redhat.com, jpoimboe@redhat.com, keescook@chromium.org, kirill.shutemov@linux.intel.com, luto@kernel.org, mingo@kernel.org, peterz@infradead.org, tglx@linutronix.de, thgarnie@google.com, torvalds@linux-foundation.org, yasu.isimatu@gmail.com, yinghai@kernel.org Cc: , From: Date: Mon, 03 Jul 2017 10:10:35 +0200 Message-ID: <1499069435146251@kroah.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ANSI_X3.4-1968 Content-Transfer-Encoding: 8bit Sender: stable-owner@vger.kernel.org List-ID: This is a note to let you know that I've just added the patch titled x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds() to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: x86-mm-fix-boot-crash-caused-by-incorrect-loop-count-calculation-in-sync_global_pgds.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let know about it. >>From fc5f9d5f151c9fff21d3d1d2907b888a5aec3ff7 Mon Sep 17 00:00:00 2001 From: Baoquan He Date: Thu, 4 May 2017 10:25:47 +0800 Subject: x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds() From: Baoquan He commit fc5f9d5f151c9fff21d3d1d2907b888a5aec3ff7 upstream. Jeff Moyer reported that on his system with two memory regions 0~64G and 1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR will make the system hang intermittently during boot. While adding 'nokaslr' won't. The back trace is: Oops: 0000 [#1] SMP RIP: memcpy_erms() [ .... ] Call Trace: pmem_rw_page() bdev_read_page() do_mpage_readpage() mpage_readpages() blkdev_readpages() __do_page_cache_readahead() force_page_cache_readahead() page_cache_sync_readahead() generic_file_read_iter() blkdev_read_iter() __vfs_read() vfs_read() SyS_read() entry_SYSCALL_64_fastpath() This crash happens because the for loop count calculation in sync_global_pgds() is not correct. When a mapping area crosses PGD entries, we should calculate the starting address of region which next PGD covers and assign it to next for loop count, but not add PGDIR_SIZE directly. The old code works right only if the mapping area is an exact multiple of PGDIR_SIZE, otherwize the end region could be skipped so that it can't be synchronized to all other processes from kernel PGD init_mm.pgd. In Jeff's system, emulated pmem area [1024G, 1216G) is smaller than PGDIR_SIZE. While 'nokaslr' works because PAGE_OFFSET is 1T aligned, it makes this area be mapped inside one PGD entry. With KASLR enabled, this area could cross two PGD entries, then the next PGD entry won't be synced to all other processes. That is why we saw empty PGD. Fix it. Reported-by: Jeff Moyer Signed-off-by: Baoquan He Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Dan Williams Cc: Dave Hansen Cc: Dave Young Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Jinbum Park Cc: Josh Poimboeuf Cc: Kees Cook Cc: Kirill A. Shutemov Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Garnier Cc: Thomas Gleixner Cc: Yasuaki Ishimatsu Cc: Yinghai Lu Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman --- arch/x86/mm/init_64.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -94,10 +94,10 @@ __setup("noexec32=", nonx32_setup); */ void sync_global_pgds(unsigned long start, unsigned long end, int removed) { - unsigned long address; + unsigned long addr; - for (address = start; address <= end; address += PGDIR_SIZE) { - const pgd_t *pgd_ref = pgd_offset_k(address); + for (addr = start; addr <= end; addr = ALIGN(addr + 1, PGDIR_SIZE)) { + const pgd_t *pgd_ref = pgd_offset_k(addr); struct page *page; /* @@ -113,7 +113,7 @@ void sync_global_pgds(unsigned long star pgd_t *pgd; spinlock_t *pgt_lock; - pgd = (pgd_t *)page_address(page) + pgd_index(address); + pgd = (pgd_t *)page_address(page) + pgd_index(addr); /* the pgt_lock only for Xen */ pgt_lock = &pgd_page_get_mm(page)->page_table_lock; spin_lock(pgt_lock); Patches currently in stable-queue which might be from bhe@redhat.com are queue-4.9/x86-mm-fix-boot-crash-caused-by-incorrect-loop-count-calculation-in-sync_global_pgds.patch