From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 13 Dec 2023 14:39:30 +0000
From: Mark Rutland
To: Ard Biesheuvel
Cc: Ard Biesheuvel, linux-arm-kernel@lists.infradead.org,
    Catalin Marinas, Will Deacon, Marc Zyngier
Subject: Re: [PATCH v7 5/7] arm64: vmemmap: Avoid base2 order of struct page size to dimension region
References: <20231213084024.2367360-9-ardb@google.com> <20231213084024.2367360-14-ardb@google.com>

On Wed, Dec 13, 2023 at 03:09:54PM +0100, Ard Biesheuvel wrote:
> On Wed, 13 Dec 2023 at 14:49, Mark Rutland wrote:
> >
> > Hi Ard,
> >
> > On Wed, Dec 13, 2023 at 09:40:30AM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel
> > >
> > > The placement and size of the vmemmap region in the kernel virtual
> > > address space is currently derived from the base2 order of the size of a
> > > struct page. This makes for nicely aligned constants with lots of
> > > leading 0xf and trailing 0x0 digits, but given that the actual struct
> > > pages are indexed as an ordinary array, this resulting region is
> > > severely overdimensioned when the size of a struct page is just over a
> > > power of 2.
> > >
> > > This doesn't matter today, but once we enable 52-bit virtual addressing
> > > for 4k pages configurations, the vmemmap region may take up almost half
> > > of the upper VA region with the current struct page upper bound at 64
> > > bytes. And once we enable KMSAN or other features that push the size of
> > > a struct page over 64 bytes, we will run out of VMALLOC space entirely.
> > >
> > > So instead, let's derive the region size from the actual size of a
> > > struct page, and place the entire region 1 GB from the top of the VA
> > > space, where it still doesn't share any lower level translation table
> > > entries with the fixmap.
> > >
> > > Acked-by: Mark Rutland
> > > Signed-off-by: Ard Biesheuvel
> > > ---
> > >  arch/arm64/include/asm/memory.h | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> > > index 2745bed8ae5b..b49575a92afc 100644
> > > --- a/arch/arm64/include/asm/memory.h
> > > +++ b/arch/arm64/include/asm/memory.h
> > > @@ -30,8 +30,8 @@
> > >   * keep a constant PAGE_OFFSET and "fallback" to using the higher end
> > >   * of the VMEMMAP where 52-bit support is not available in hardware.
> > >   */
> > > -#define VMEMMAP_SHIFT   (PAGE_SHIFT - STRUCT_PAGE_MAX_SHIFT)
> > > -#define VMEMMAP_SIZE    ((_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET) >> VMEMMAP_SHIFT)
> > > +#define VMEMMAP_RANGE   (_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET)
> > > +#define VMEMMAP_SIZE    ((VMEMMAP_RANGE >> PAGE_SHIFT) * sizeof(struct page))
> > >
> > >  /*
> > >   * PAGE_OFFSET - the virtual address of the start of the linear map, at the
> > > @@ -47,8 +47,8 @@
> > >  #define MODULES_END     (MODULES_VADDR + MODULES_VSIZE)
> > >  #define MODULES_VADDR   (_PAGE_END(VA_BITS_MIN))
> > >  #define MODULES_VSIZE   (SZ_2G)
> > > -#define VMEMMAP_START   (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
> > > -#define VMEMMAP_END     (VMEMMAP_START + VMEMMAP_SIZE)
> > > +#define VMEMMAP_START   (VMEMMAP_END - VMEMMAP_SIZE)
> > > +#define VMEMMAP_END     (-UL(SZ_1G))
> > >  #define PCI_IO_START    (VMEMMAP_END + SZ_8M)
> > >  #define PCI_IO_END      (PCI_IO_START + PCI_IO_SIZE)
> > >  #define FIXADDR_TOP     (-UL(SZ_8M))
> >
> > I realise I've acked this already, but my big concern here is still that it's
> > hard to see why these don't overlap (though the assert in fixmap.c will save
> > us). Usually we try to make that clear by construction, and I think we can do
> > that here with something like:
> >
> > | #define GUARD_VA_SIZE       (UL(SZ_8M))
> > |
> > | #define FIXADDR_TOP         (-GUARD_VA_SIZE)
> > | #define FIXADDR_SIZE_MAX    SZ_8M
> > | #define FIXADDR_START_MIN   (FIXADDR_TOP - FIXADDR_SIZE_MAX)
> > |
> > | #define PCI_IO_END          (FIXADDR_START_MIN - GUARD_VA_SIZE)
> > | #define PCI_IO_START        (PCI_IO_END - PCI_IO_SIZE)
> > |
> > | #define VMEMMAP_END         (ALIGN_DOWN(PCI_IO_START - GUARD_VA_SIZE, SZ_1G))
> > | #define VMEMMAP_START       (VMEMMAP_END - VMEMMAP_SIZE)
> >
> > ... and in fixmap.h have:
> >
> > /* Ensure the estimate in memory.h was big enough */
> > static_assert(FIXADDR_SIZE_MAX > FIXADDR_SIZE);
> >
> > I might be missing some reason why we can't do that; I locally tried the above
> > atop this series with defconfig+4K and defconfig+64K, and both build and boot
> > without issue.
> >
>
> I am really struggling to understand what the issue is that you are
> trying to solve here.
>
> What I am proposing is
>
> #define VMEMMAP_END     (-UL(SZ_1G))
> #define PCI_IO_START    (VMEMMAP_END + SZ_8M)
> #define PCI_IO_END      (PCI_IO_START + PCI_IO_SIZE)
> #define FIXADDR_TOP     (-UL(SZ_8M))
>
> and in fixmap.c
>
> static_assert(FIXADDR_TOT_START > PCI_IO_END);
>
> (which sadly has to live outside of asm/memory.h for reasons of #include hell).
>
> This leaves the top 1G for PCI I/O plus fixmap, and ensures that the
> latter does not collide with the former.

Sure, I get that (and that #include hell is why I have FIXADDR_SIZE_MAX above).

What I'm trying to do is make the relationship between those regions clear *in
one place*, so that this is easier to follow, as that was one of the things I
found painful during review.

That said, it's not clear cut, and I'll happily defer to the judgement of Will
and Catalin.
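
As a rough illustration of the two points under discussion, here is a
stand-alone userspace sketch (not kernel code). It lays the regions out
top-down in the style of the GUARD_VA_SIZE proposal quoted above, and compares
sizing the vmemmap by power-of-2 order against sizing it by the actual struct
page size. The concrete values for PCI_IO_SIZE, FIXADDR_SIZE, PAGE_SHIFT,
STRUCT_PAGE_SIZE and VMEMMAP_RANGE are assumptions picked for the example, and
order_base2(), vmemmap_size_pow2() and vmemmap_size_exact() are hypothetical
helper names.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_8M   (UINT64_C(8) << 20)
#define SZ_16M  (UINT64_C(16) << 20)
#define SZ_1G   (UINT64_C(1) << 30)

/* Round x down to a multiple of a; a must be a power of two. */
#define ALIGN_DOWN(x, a)   ((x) & ~((a) - 1))

/* Assumed example values; the real ones depend on the kernel config. */
#define PCI_IO_SIZE        SZ_16M
#define FIXADDR_SIZE       (UINT64_C(4) << 20)
#define PAGE_SHIFT         12
#define STRUCT_PAGE_SIZE   UINT64_C(64)
#define VMEMMAP_RANGE      (UINT64_C(1) << 47)  /* linear map span, 48-bit VA */
#define VMEMMAP_SIZE       ((VMEMMAP_RANGE >> PAGE_SHIFT) * STRUCT_PAGE_SIZE)

/* Top-down layout: each region is defined relative to the one above it. */
#define GUARD_VA_SIZE      SZ_8M
#define FIXADDR_TOP        (-GUARD_VA_SIZE)
#define FIXADDR_SIZE_MAX   SZ_8M
#define FIXADDR_START_MIN  (FIXADDR_TOP - FIXADDR_SIZE_MAX)
#define PCI_IO_END         (FIXADDR_START_MIN - GUARD_VA_SIZE)
#define PCI_IO_START       (PCI_IO_END - PCI_IO_SIZE)
#define VMEMMAP_END        (ALIGN_DOWN(PCI_IO_START - GUARD_VA_SIZE, SZ_1G))
#define VMEMMAP_START      (VMEMMAP_END - VMEMMAP_SIZE)

/* The estimate for the fixmap must cover the (assumed) real size. */
static_assert(FIXADDR_SIZE_MAX > FIXADDR_SIZE, "fixmap estimate too small");
/* Neighbouring regions are separated by at least one guard gap. */
static_assert(FIXADDR_START_MIN - PCI_IO_END == GUARD_VA_SIZE, "no guard below fixmap");
static_assert(PCI_IO_START - VMEMMAP_END >= GUARD_VA_SIZE, "no guard below PCI I/O");
static_assert(VMEMMAP_START < VMEMMAP_END, "vmemmap wrapped around");

/* Smallest order such that (1 << order) >= n. */
static inline unsigned int order_base2(uint64_t n)
{
	unsigned int order = 0;

	while ((UINT64_C(1) << order) < n)
		order++;
	return order;
}

/* Old scheme: round the struct page size up to a power of two. */
static inline uint64_t vmemmap_size_pow2(uint64_t range, unsigned int page_shift,
					 uint64_t struct_page_size)
{
	return range >> (page_shift - order_base2(struct_page_size));
}

/* New scheme: one struct page of the actual size per page in the range. */
static inline uint64_t vmemmap_size_exact(uint64_t range, unsigned int page_shift,
					  uint64_t struct_page_size)
{
	return (range >> page_shift) * struct_page_size;
}
```

With these assumed sizes the top-down definitions put VMEMMAP_END at the same
address as the patch's -UL(SZ_1G), so the two schemes differ in how the
relationship is expressed rather than in where the regions land. For a
hypothetical 72-byte struct page, the power-of-2 sizing reserves 4 TiB where
the exact sizing needs 2.25 TiB.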

For the series as-is:

Acked-by: Mark Rutland

Will/Catalin, please let me know your preference on the above.

Thanks,
Mark.