From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 19 Jan 2026 15:15:22 +0000
From: Kiryl Shutsemau
To: Muchun Song
Cc: "David Hildenbrand (Red Hat)", Andrew Morton, Matthew Wilcox, Usama Arif, Frank van der Linden,
 Oscar Salvador, Mike Rapoport, Vlastimil Babka, Lorenzo Stoakes, Zi Yan,
 Baoquan He, Michal Hocko, Johannes Weiner, Jonathan Corbet,
 kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org
Subject: Re: [PATCHv3 10/15] mm/hugetlb: Remove fake head pages
References: <20260115144604.822702-1-kas@kernel.org>
 <20260115144604.822702-11-kas@kernel.org>
 <30ae1623-63f9-4729-9c19-9b0a9a0ae9f1@kernel.org>
 <53980726-C7F0-4648-99E9-89E10645F2E7@linux.dev>
 <0F1C93F3-9A1A-4929-9157-589CF8C0588D@linux.dev>
In-Reply-To: <0F1C93F3-9A1A-4929-9157-589CF8C0588D@linux.dev>

On Sat, Jan 17, 2026 at 10:38:48AM +0800, Muchun Song wrote:
> 
> 
> > On Jan 16, 2026, at 23:52, Kiryl Shutsemau wrote:
> > 
> > On Fri, Jan 16, 2026 at 10:38:02AM +0800, Muchun Song wrote:
> >> 
> >> 
> >>> On Jan 16, 2026, at 01:23, Kiryl Shutsemau wrote:
> >>> 
> >>> On Thu, Jan 15, 2026 at 05:49:43PM +0100, David Hildenbrand (Red Hat) wrote:
> >>>> On 1/15/26 15:45, Kiryl Shutsemau wrote:
> >>>>> HugeTLB Vmemmap Optimization (HVO) reduces memory usage by freeing most
> >>>>> vmemmap pages for huge pages and remapping the freed range to a single
> >>>>> page containing the struct page metadata.
> >>>>> 
> >>>>> With the new mask-based compound_info encoding (for power-of-2 struct
> >>>>> page sizes), all tail pages of the same order are now identical
> >>>>> regardless of which compound page they belong to. This means the tail
> >>>>> pages can be truly shared without fake heads.
> >>>>> 
> >>>>> Allocate a single page of initialized tail struct pages per NUMA node
> >>>>> per order in the vmemmap_tails[] array in pglist_data. All huge pages
> >>>>> of that order on the node share this tail page, mapped read-only into
> >>>>> their vmemmap.
> >>>>> The head page remains unique per huge page.
> >>>>> 
> >>>>> This eliminates fake heads while maintaining the same memory savings,
> >>>>> and simplifies compound_head() by removing fake head detection.
> >>>>> 
> >>>>> Signed-off-by: Kiryl Shutsemau
> >>>>> ---
> >>>>>  include/linux/mmzone.h | 16 ++++++++++++++-
> >>>>>  mm/hugetlb_vmemmap.c   | 44 ++++++++++++++++++++++++++++++++++++++++--
> >>>>>  mm/sparse-vmemmap.c    | 44 ++++++++++++++++++++++++++++++++++--------
> >>>>>  3 files changed, 93 insertions(+), 11 deletions(-)
> >>>>> 
> >>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >>>>> index 322ed4c42cfc..2ee3eb610291 100644
> >>>>> --- a/include/linux/mmzone.h
> >>>>> +++ b/include/linux/mmzone.h
> >>>>> @@ -82,7 +82,11 @@
> >>>>>   * currently expect (see CONFIG_HAVE_GIGANTIC_FOLIOS): with hugetlb, we expect
> >>>>>   * no folios larger than 16 GiB on 64bit and 1 GiB on 32bit.
> >>>>>   */
> >>>>> -#define MAX_FOLIO_ORDER get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G)
> >>>>> +#ifdef CONFIG_64BIT
> >>>>> +#define MAX_FOLIO_ORDER (34 - PAGE_SHIFT)
> >>>>> +#else
> >>>>> +#define MAX_FOLIO_ORDER (30 - PAGE_SHIFT)
> >>>>> +#endif
> >>>> 
> >>>> Where do these magic values stem from, and how do they relate to the
> >>>> comment above that clearly spells out 16G vs. 1G?
> >>> 
> >>> This doesn't change the resulting value: 1UL << 34 is 16GiB, 1UL << 30
> >>> is 1GiB. Subtract PAGE_SHIFT to get the order.
> >>> 
> >>> The change allows the value to be used to define NR_VMEMMAP_TAILS, which
> >>> is used to specify the size of the vmemmap_tails array.
> >> 
> >> How about allocating the ->vmemmap_tails array dynamically? If
> >> sizeof(struct page) is not a power of two, then we could optimize this
> >> array away. Besides, the original MAX_FOLIO_ORDER could work as well.
> > 
> > This is tricky.
> > 
> > We need the vmemmap_tails array to be around early, in
> > hugetlb_vmemmap_init_early(). By that time, slab is not functional yet.
> 
> I mean a zero-size array at the end of pg_data_t; no slab is needed.

For !NUMA, the struct is in BSS. See contig_page_data. A dynamic array
won't fly there.

-- 
Kiryl Shutsemau / Kirill A. Shutemov