Date: Mon, 4 Oct 2021 11:49:58 +0100
From: Mark Rutland
To: Anshuman Khandual
Cc: linux-arm-kernel@lists.infradead.org, Catalin Marinas, Will Deacon,
	James Morse, Marc Zyngier, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2] arm64/mm: Fix idmap on [16K|36VA|48PA]
Message-ID: <20211004104947.GA4430@C02TD0UTHF1T.local>
In-Reply-To: <1632807225-20189-1-git-send-email-anshuman.khandual@arm.com>

Hi Anshuman,

On Tue, Sep 28, 2021 at 11:03:45AM +0530, Anshuman Khandual wrote:
> When creating the idmap, the kernel may add one extra level to idmap memory
> outside the VA range. But for [16K|36VA|48PA], we need two levels to reach
> 48 bits. If the bootloader places the kernel in memory above (1 << 46), the
> kernel will fail to enable the MMU. Although we are not aware of a platform
> where this happens, it is worth accommodating such scenarios to prevent a
> possible kernel crash.
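[Editor's note: to make the failure mode concrete, the [16K|36VA|48PA] arithmetic can be checked with a few lines of Python. This is an illustrative sketch, not kernel code; the constant names merely mirror the kernel macros.]

```python
# Illustrative sketch of the [16K|36VA|48PA] case; names mirror the
# kernel macros but this is not kernel code.
PAGE_SHIFT = 14                              # 16K pages
VA_BITS = 36
PHYS_MASK_SHIFT = 48
BITS_PER_LEVEL = PAGE_SHIFT - 3              # 11 index bits per table level

# With 16K/36VA every level is fully used: 36 = 14 + 11 + 11,
# so the top level (PGD) sits at:
PGDIR_SHIFT = PAGE_SHIFT + BITS_PER_LEVEL    # 25
EXTRA_SHIFT = PGDIR_SHIFT + PAGE_SHIFT - 3   # 36: shift of one extra level

# A single extra level on top of the PGD can only translate up to:
reachable_bits = EXTRA_SHIFT + BITS_PER_LEVEL    # 47, short of 48
assert reachable_bits < PHYS_MASK_SHIFT          # 48-bit PAs need two extra levels
```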
I think it's worth noting here that ARM64_VA_BITS_36 depends on EXPERT, so
very few people are likely to be using this configuration.

> Let's fix this problem by carefully analyzing the existing VA_BITS with
> respect to the maximum possible mapping with the existing PGDIR level,
> i.e. (PGDIR_SHIFT + PAGE_SHIFT - 3), and then evaluating how many extra
> page table levels are required to accommodate the reduced idmap_t0sz to
> map __idmap_text_end.
>
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: James Morse
> Cc: Marc Zyngier
> Cc: Mark Rutland
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Fixes: 215399392fe4 ("arm64: 36 bit VA")
> Signed-off-by: Anshuman Khandual
> ---
> This applies on v5.15-rc3.
>
> This is a different approach compared to V1, which still applies on the
> latest mainline. Besides, this enables all upcoming FEAT_LPA2 combinations
> as well. Please do suggest which approach would be preferred.
>
> - Anshuman
>
> V1: https://lore.kernel.org/all/1627879359-30303-1-git-send-email-anshuman.khandual@arm.com/
> RFC: https://lore.kernel.org/lkml/1627019894-14819-1-git-send-email-anshuman.khandual@arm.com/

If we need something to backport, I'm not opposed to taking one of these
patches (and as v1 is simpler, I'd prefer that), but I think either approach
is further bodging around the `map_memory` macro not being a great fit for
the idmap creation, and it would be better to rework the structure of the
pagetable creation code to do the right thing from the outset.

Catalin, Will, do you have any preference as to having a backportable fix
for this?

Ignoring backports, I'd prefer if we could refactor things such that we
decouple the `idmap_pg_dir` creation from the `init_pg_dir` creation, and
create the idmap in terms of the architectural levels rather than
pgd/p4d/pud/pmd/pte, so that we can consistently create the idmap with at
least 48 bits of VA.

Pseudo-code wise, I'd like something that looks like:

	create_idmap(...)
	{
		idmap_va_bits = 48;
		idmap_t0size = TCR_T0SZ(48);

		if (need_52_bit_va(__idmap_text_start)) {
			if (!supports_52bit_va()) {
				some_early_spin_loop();
			}
			idmap_va_bits = 52;
			idmap_t0size = TCR_T0SZ(52);
		}

		if (need_table_level(idmap_va_bits, -1))
			create_table_level(-1, ...);
		if (need_table_level(idmap_va_bits, 0))
			create_table_level(0, ...);
		if (need_table_level(idmap_va_bits, 1))
			create_table_level(1, ...);
		if (need_table_level(idmap_va_bits, 2))
			create_table_level(2, ...);
		create_table_level(3, ...);
	}

... which I think would be much easier to reason about consistently.

How does that sound to you?

I've pushed some preparatory rework out to my arm64/pgtable/idmap branch,
splitting out a __create_idmap_tables() function (and ensuring that
idmap_t0sz doesn't get silently overridden elsewhere):

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/pgtable/idmap
  git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/pgtable/idmap

... but I haven't had the chance to do the actual rework of the idmap
creation code. I can send that as a series if that's helpful.

Thanks,
Mark.

>  arch/arm64/include/asm/assembler.h |  9 ++++++++
>  arch/arm64/kernel/head.S           | 46 +++++++++++++++++++++++---------------
>  2 files changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index bfa5840..e5b5d3a 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -25,6 +25,15 @@
>  #include
>  #include
>
> +	.macro shift_to_ptrs, ptrs, shift, tmp, tmp1
> +	ldr_l	\tmp1, idmap_t0sz
> +	add	\tmp1, \tmp1, \shift
> +	mov	\tmp, #64
> +	sub	\tmp, \tmp, \tmp1
> +	mov	\ptrs, #1
> +	lsr	\ptrs, \ptrs, \tmp
> +	.endm
> +
>  /*
>   * Provide a wxN alias for each wN register so what we can paste a xN
>   * reference after a 'w' to obtain the 32-bit version.
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 1796245..b93d50d 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -328,30 +328,40 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
>  	dmb	sy
>  	dc	ivac, x6		// Invalidate potentially stale cache line
>
> -#if (VA_BITS < 48)
>  #define EXTRA_SHIFT	(PGDIR_SHIFT + PAGE_SHIFT - 3)
> -#define EXTRA_PTRS	(1 << (PHYS_MASK_SHIFT - EXTRA_SHIFT))
> -
> -	/*
> -	 * If VA_BITS < 48, we have to configure an additional table level.
> -	 * First, we have to verify our assumption that the current value of
> -	 * VA_BITS was chosen such that all translation levels are fully
> -	 * utilised, and that lowering T0SZ will always result in an additional
> -	 * translation level to be configured.
> -	 */
> -#if VA_BITS != EXTRA_SHIFT
> +#define EXTRA_SHIFT_1	(EXTRA_SHIFT + PAGE_SHIFT - 3)
> +#if (VA_BITS > EXTRA_SHIFT)
>  #error "Mismatch between VA_BITS and page size/number of translation levels"
>  #endif
>
> -	mov	x4, EXTRA_PTRS
> +#if (VA_BITS == EXTRA_SHIFT)
> +	mov	x6, #TCR_T0SZ(VA_BITS_MIN)
> +	sub	x6, x6, x5
> +	cmp	x6, #(PAGE_SHIFT - 3)
> +	b.gt	8f
> +
> +	shift_to_ptrs x4, EXTRA_SHIFT, x5, x6
>  	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
> -#else
> -	/*
> -	 * If VA_BITS == 48, we don't have to configure an additional
> -	 * translation level, but the top-level table has more entries.
> -	 */
> -	mov	x4, #1 << (PHYS_MASK_SHIFT - PGDIR_SHIFT)
> +	b	1f
> +8:
> +	shift_to_ptrs x4, EXTRA_SHIFT_1, x5, x6
> +	create_table_entry x0, x3, EXTRA_SHIFT_1, x4, x5, x6
> +
> +	mov	x4, PTRS_PER_PTE
> +	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
> +#elif (VA_BITS < EXTRA_SHIFT)
> +	mov	x6, #64
> +	sub	x6, x6, x5
> +	cmp	x6, EXTRA_SHIFT
> +	b.eq	1f
> +	b.gt	9f
> +
> +	shift_to_ptrs x4, PGDIR_SHIFT, x5, x6
>  	str_l	x4, idmap_ptrs_per_pgd, x5
> +	b	1f
> +9:
> +	shift_to_ptrs x4, EXTRA_SHIFT, x5, x6
> +	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
>  #endif
> 1:
>  	ldr_l	x4, idmap_ptrs_per_pgd
> --
> 2.7.4
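[Editor's note: the one-versus-two extra-level decision the patch encodes (how far idmap_t0sz shrinks versus the bits one table level can translate) can be modelled in a few lines of Python. This is an illustrative sketch, not kernel code; the helper name is made up.]

```python
def idmap_extra_levels(page_shift, va_bits_min, pa_bits_needed):
    """Model of the patch's extra-level decision (illustrative only).

    One table level resolves (page_shift - 3) bits of VA. Extra levels are
    added above the PGD when the idmap needs more VA bits than the kernel's
    configured range provides.
    """
    bits_per_level = page_shift - 3
    default_t0sz = 64 - va_bits_min          # TCR_T0SZ(VA_BITS_MIN)
    idmap_t0sz = 64 - pa_bits_needed         # reduced T0SZ for the idmap
    if idmap_t0sz >= default_t0sz:
        return 0                             # default VA range already covers it
    shrink = default_t0sz - idmap_t0sz       # extra VA bits the idmap needs
    return 1 if shrink <= bits_per_level else 2

# 16K pages, 36-bit VA, 48-bit PA: the case the patch fixes — two levels.
assert idmap_extra_levels(14, 36, 48) == 2
# 4K pages, 39-bit VA, 48-bit PA: the familiar single extra level.
assert idmap_extra_levels(12, 39, 48) == 1
# 4K pages, 48-bit VA: no extra level needed.
assert idmap_extra_levels(12, 48, 48) == 0
```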