Date: Tue, 2 Jun 2026 20:50:07 +0300
From: Mike Rapoport
To: Pratyush Yadav
Cc: Pasha Tatashin, Alexander Graf, Muchun Song, Oscar Salvador,
    David Hildenbrand, Andrew Morton, Jason Miu,
    kexec@lists.infradead.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 12/12] mm/hugetlb: make bootmem allocation work with KHO
References: <20260429133928.850721-1-pratyush@kernel.org>
    <20260429133928.850721-13-pratyush@kernel.org>
    <2vxzo6i37bs6.fsf@kernel.org>
    <2vxzpl29dpzj.fsf@kernel.org>
In-Reply-To: <2vxzpl29dpzj.fsf@kernel.org>

On Tue, Jun 02, 2026 at 03:35:44PM +0200, Pratyush Yadav wrote:
> On Sun, May 31 2026, Mike Rapoport wrote:
>
> > On Mon, May 25, 2026 at 05:24:09PM +0200, Pratyush Yadav wrote:
> >> On Sun, May 17 2026, Mike Rapoport wrote:
> >> > On Wed, Apr 29, 2026 at 03:39:14PM +0200, Pratyush Yadav wrote:
> >> >> From: "Pratyush Yadav (Google)"
> >>
> >> So, in summary, I would like to pursue option 1 and try to make it more
> >> appetizing. But I would like to at least know if you hate the "extended
> >> scratch" (ignore the name) as a concept or only the code it results in.
> >
> > Let's retry this one :)
> >
> > I looked more closely, and it seems that mixing SCRATCH and SCRATCH_EXT
> > should be a lesser headache than going with option 4.
>
> I also had some time to ruminate on this. I still think option 1 has the
> most promise, but my opinion on option 4 has improved a bit. While I
> still am not sure adding a 3rd phase to struct page/MM init (early ->
> deferred -> KHO reserved blocks) is a good idea, I think it might not be
> as bad as I first thought.

Dunno...
Until (if ever) we enlighten memblock_free_all()/deferred_init_pages()
about KHO, small scattered reservations make memblock slower.
It's hard to tell if delaying more complex initialization of the large
reserved chunks until SMP is worth the speedup in a few memblock
operations between kho_memory_init() and the end of deferred_init_pages().

> Anyway, for now I think I will try to make option 1 more appetizing.
>
> Here's an idea I want to try out: I get rid of SCRATCH_EXT and mark the
> free blocks as SCRATCH. For HugeTLB, I can teach the special
> memblock_alloc_hugetlb_something() function to exclude scratch areas
> when looking for free memory ranges. So core memblock does not get a new
> memory type, and the complexity of hugepage allocation does not leak
> into memblock.
>
> How does that sound?

It sounds fine, let's see how it looks :)

> > Tracking the changes in gigantic pages in hugetlb also does not seem
> > something we'd like to pursue especially considering that memory from freed
> > or demoted gigantic pages could be reserved.
> >
> > If we add a dedicated memblock_something to allocate gigantic pages, we
> > can reduce branching in alloc_bootmem() to
> >
> >         if (cma)
> >                 do_cma()
> >         else
> >                 do_memblock()
> >
> > For hugetlb_cma we might want to teach CMA to create pre-allocated areas
> > and then it could reuse the same memblock API. This seems useful even
> > regardless of KHO.
>
> Sorry, I don't get what you mean by this. What pre-allocated areas? When
> creating CMA areas it calls cma_alloc_mem() which calls into memblock.
> What would we change about this?

s/teach CMA to create/teach CMA to use/

I mean that we might want to be able to first allocate gigantic pages and
then feed them to empty CMA areas.

> --
> Regards,
> Pratyush Yadav

-- 
Sincerely yours,
Mike.
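
For illustration only, the "exclude scratch areas when looking for free
memory ranges" idea discussed above could be modelled in userspace roughly
as below. This is a sketch under loud assumptions, not memblock code:
struct range, align_up(), first_overlap() and find_range_excluding() are
hypothetical names invented for the example, the real memblock iterators
and flags are not used, and address overflow is ignored.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified range: [base, base + size). Not a memblock type. */
struct range {
	uint64_t base;
	uint64_t size;
};

/* Round x up to a multiple of align (align must be non-zero). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) / align * align;
}

/* Return the first scratch range intersecting [base, base + size), or NULL. */
static const struct range *first_overlap(uint64_t base, uint64_t size,
					 const struct range *scratch, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t s_end = scratch[i].base + scratch[i].size;

		if (scratch[i].base < base + size && s_end > base)
			return &scratch[i];
	}
	return NULL;
}

/*
 * Look for an aligned block of 'size' bytes that lies inside one of the
 * free ranges and does not touch any scratch range. On success, store the
 * block's base in *out and return true.
 */
static bool find_range_excluding(const struct range *free_ranges, size_t nr_free,
				 const struct range *scratch, size_t nr_scratch,
				 uint64_t size, uint64_t align, uint64_t *out)
{
	for (size_t i = 0; i < nr_free; i++) {
		uint64_t end = free_ranges[i].base + free_ranges[i].size;
		uint64_t cand = align_up(free_ranges[i].base, align);

		while (cand + size <= end) {
			const struct range *hit =
				first_overlap(cand, size, scratch, nr_scratch);

			if (!hit) {
				*out = cand;
				return true;
			}
			/* Skip past the scratch range and retry at the next aligned address. */
			cand = align_up(hit->base + hit->size, align);
		}
	}
	return false;
}
```

With a single 2 GiB free range starting at 0 and a scratch range covering
256 MiB to 512 MiB, a 1 GiB-aligned request for 1 GiB would be placed at
1 GiB in this model, since the first candidate at 0 overlaps the scratch
range.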