From: "Kirill A. Shutemov"
To: dave.hansen@intel.com
Cc: kirill.shutemov@linux.intel.com, aaron.lu@intel.com, ardb@google.com,
	bagasdotme@gmail.com, bp@alien8.de, keescook@chromium.org,
	kexec@lists.infradead.org, linux-kernel@vger.kernel.org,
	regressions@lists.linux.de, tglx@linutronix.de,
	thomas.lendacky@amd.com, x86@kernel.org
Subject: [PATCHv2] x86/boot/compressed: Reserve more memory for page tables
Date: Fri, 15 Sep 2023 10:02:21 +0300
Message-ID: <20230915070221.10266-1-kirill.shutemov@linux.intel.com>
In-Reply-To: <20230914170726.4am7xi36m4hdgiyk@box>
References: <20230914170726.4am7xi36m4hdgiyk@box>

The decompressor has a hard limit on the number of page tables it can
allocate. This limit is defined at compile-time and will cause boot
failure if it is reached.

The kernel is very strict and calculates the limit precisely for the
worst-case scenario based on the current configuration. However, it is
easy to forget to adjust the limit when a new use-case arises. The
worst-case scenario is rarely encountered during sanity checks.

In the case of enabling 5-level paging, a use-case was overlooked. The
limit needs to be increased by one to accommodate the additional level.
This oversight went unnoticed until Aaron attempted to run the kernel
via kexec with 5-level paging and unaccepted memory enabled.

Update worst-case calculations to include 5-level paging.

To address this issue, let's allocate some extra space for page tables.
128K should be sufficient for any use-case. The logic can be simplified
by using a single value for all kernel configurations.

Signed-off-by: Kirill A. Shutemov
Reported-by: Aaron Lu
Fixes: 34bbb0009f3b ("x86/boot/compressed: Enable 5-level paging during decompression stage")
---
 arch/x86/boot/compressed/ident_map_64.c |  8 +++++
 arch/x86/include/asm/boot.h             | 47 +++++++++++++++++--------
 2 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index bcc956c17872..08f93b0401bb 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -59,6 +59,14 @@ static void *alloc_pgt_page(void *context)
 		return NULL;
 	}
 
+	/* Consumed more tables than expected? */
+	if (pages->pgt_buf_offset == BOOT_PGT_SIZE_WARN) {
+		debug_putstr("pgt_buf running low in " __FILE__ "\n");
+		debug_putstr("Need to raise BOOT_PGT_SIZE?\n");
+		debug_putaddr(pages->pgt_buf_offset);
+		debug_putaddr(pages->pgt_buf_size);
+	}
+
 	entry = pages->pgt_buf + pages->pgt_buf_offset;
 	pages->pgt_buf_offset += PAGE_SIZE;
 
diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h
index 9191280d9ea3..215d37f7dde8 100644
--- a/arch/x86/include/asm/boot.h
+++ b/arch/x86/include/asm/boot.h
@@ -40,23 +40,40 @@
 #ifdef CONFIG_X86_64
 # define BOOT_STACK_SIZE	0x4000
 
-# define BOOT_INIT_PGT_SIZE	(6*4096)
-# ifdef CONFIG_RANDOMIZE_BASE
 /*
- * Assuming all cross the 512GB boundary:
- * 1 page for level4
- * (2+2)*4 pages for kernel, param, cmd_line, and randomized kernel
- * 2 pages for first 2M (video RAM: CONFIG_X86_VERBOSE_BOOTUP).
- * Total is 19 pages.
+ * Used by decompressor's startup_32() to allocate page tables for identity
+ * mapping of the 4G of RAM in 4-level paging mode:
+ * - 1 level4 table;
+ * - 1 level3 table;
+ * - 4 level2 table that maps everything with 2M pages;
+ *
+ * The additional level5 table needed for 5-level paging is allocated from
+ * trampoline_32bit memory.
  */
-#  ifdef CONFIG_X86_VERBOSE_BOOTUP
-#   define BOOT_PGT_SIZE	(19*4096)
-#  else /* !CONFIG_X86_VERBOSE_BOOTUP */
-#   define BOOT_PGT_SIZE	(17*4096)
-#  endif
-# else /* !CONFIG_RANDOMIZE_BASE */
-#  define BOOT_PGT_SIZE		BOOT_INIT_PGT_SIZE
-# endif
+# define BOOT_INIT_PGT_SIZE	(6*4096)
+
+/*
+ * Total number of page tables kernel_add_identity_map() can allocate,
+ * including page tables consumed by startup_32().
+ *
+ * Worst-case scenario:
+ *  - 5-level paging needs 1 level5 table;
+ *  - KASLR needs to map kernel, boot_params, cmdline and randomized kernel,
+ *    assuming all of them cross 256T boundary:
+ *    + 4*2 level4 table;
+ *    + 4*2 level3 table;
+ *    + 4*2 level2 table;
+ *  - X86_VERBOSE_BOOTUP needs to map the first 2M (video RAM):
+ *    + 1 level4 table;
+ *    + 1 level3 table;
+ *    + 1 level2 table;
+ * Total: 28 tables
+ *
+ * Add 4 spare table in case decompressor touches anything beyond what is
+ * accounted above. Warn if it happens.
+ */
+# define BOOT_PGT_SIZE_WARN	(28*4096)
+# define BOOT_PGT_SIZE		(32*4096)
 
 #else /* !CONFIG_X86_64 */
 # define BOOT_STACK_SIZE	0x1000
-- 
2.41.0
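
For anyone checking the arithmetic in the new boot.h comment, here is a
minimal userspace sketch, not kernel code. It recomputes the worst-case
table count and models the warn-then-exhaust behaviour that the
alloc_pgt_page() hunk adds. All identifiers below (struct pgt_pool,
pool_alloc(), the *_TABLES macros) are hypothetical names chosen for
illustration; only the counts and the order of the checks are taken from
the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE	4096UL

/* Counts copied from the boot.h comment; the macro names are made up. */
#define L5_TABLES	1UL	/* 5-level paging: one level5 table */
#define KASLR_REGIONS	4UL	/* kernel, boot_params, cmdline, randomized kernel */
#define TABLES_PER_LVL	2UL	/* a region crossing a boundary needs two tables per level */
#define LOWER_LEVELS	3UL	/* level4, level3, level2 */
#define VERBOSE_TABLES	3UL	/* first 2M: one table at each of level4/3/2 */
#define SPARE_TABLES	4UL

#define WORST_CASE_TABLES \
	(L5_TABLES + KASLR_REGIONS * TABLES_PER_LVL * LOWER_LEVELS + VERBOSE_TABLES)

#define PGT_SIZE_WARN	(WORST_CASE_TABLES * PAGE_SIZE)
#define PGT_SIZE	((WORST_CASE_TABLES + SPARE_TABLES) * PAGE_SIZE)

/* Toy stand-in for the decompressor's page table buffer bookkeeping. */
struct pgt_pool {
	unsigned long offset;	/* bytes handed out so far */
	unsigned long size;	/* total bytes available */
	int warned;		/* set once the worst-case budget is exceeded */
};

/*
 * Hand out one page from the pool. Returns the byte offset of the page,
 * or -1 when the pool is exhausted. As in the patch, the exhaustion check
 * comes first and the warning fires on the first allocation past the
 * calculated worst case.
 */
static long pool_alloc(struct pgt_pool *p)
{
	long entry;

	if (p->offset >= p->size)
		return -1;

	if (p->offset == PGT_SIZE_WARN)
		p->warned = 1;

	entry = (long)p->offset;
	p->offset += PAGE_SIZE;
	return entry;
}
```

With these numbers the worst case is 1 + 4*2*3 + 3 = 28 tables, and the
4 spare tables bring the buffer to 32 pages, which is the 128K mentioned
in the commit message. In this model the 29th allocation is the one that
sets the warning flag, and the 33rd fails.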