From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 19 Feb 2026 15:50:19 +0000
From: Kiryl Shutsemau
To: Pedro Falcato
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
	x86@kernel.org, linux-kernel@vger.kernel.org, Andrew Morton,
	David Hildenbrand, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Lorenzo Stoakes, "Liam R. Howlett", Mike Rapoport,
	Matthew Wilcox, Johannes Weiner, Usama Arif
Subject: Re: [LSF/MM/BPF TOPIC] 64k (or 16k) base page size on x86

On Thu, Feb 19, 2026 at 03:33:47PM +0000, Pedro Falcato wrote:
> On Thu, Feb 19, 2026 at 03:08:51PM +0000, Kiryl Shutsemau wrote:
> > No, there's no new hardware (that I know of). I want to explore what
> > page size means.
> >
> > The kernel uses the same value - PAGE_SIZE - for two things:
> >
> >  - the order-0 buddy allocation size;
> >
> >  - the granularity of virtual address space mapping;
> >
> > I think we can benefit from separating these two meanings and allowing
> > order-0 allocations to be larger than the virtual address space covered
> > by a PTE entry.
> >
>
> Doesn't this idea make less sense these days, with mTHP? Simply by
> toggling one of the entries in /sys/kernel/mm/transparent_hugepage.

mTHP is still best-effort. This way, you don't need to care about
fragmentation: you will get your 64k page as long as you have free memory.

> > The main motivation is scalability. Managing memory on multi-terabyte
> > machines in 4k is suboptimal, to say the least.
> >
> > Potential benefits of the approach (assuming 64k pages):
> >
> >  - The order-0 page size cuts struct page overhead by a factor of 16.
> >    From ~1.6% of RAM to ~0.1%;
> >
> >  - TLB wins on machines with TLB coalescing as long as mapping is
> >    naturally aligned;
> >
> >  - Order-5 allocation is 2M, resulting in less pressure on the zone
> >    lock;
> >
> >  - 1G pages are within possibility for the buddy allocator - order-14
> >    allocation. It can open the road to 1G THPs.
> >
> >  - As with THP, fewer pages - less pressure on the LRU lock;
>
> We could perhaps add a way to enforce a min_order globally on the page
> cache, as a way to address it.

Raising min_order is not free: it puts more pressure on the page
allocator.

> There are some points there which aren't addressed by mTHP work in any
> way (1G THPs for one), others which are being addressed separately
> (memdesc work trying to cut down on struct page overhead).
>
> (I also don't understand your point about order-5 allocation, AFAIK pcp
> will cache up to COSTLY_ORDER (3) and PMD order, but I'm probably not
> seeing the full picture)

With a higher base page size, the page allocator doesn't need to do as
much work to merge/split buddy pages, so serving the same 2M as an
order-5 allocation is cheaper than as order-9.

-- 
Kiryl Shutsemau / Kirill A. Shutemov