Date: Fri, 19 Dec 2025 11:19:34 +0200
From: Mike Rapoport
To: Pasha Tatashin
Cc: Evangelos Petrongonas, Pratyush Yadav, Alexander Graf, Andrew Morton,
	Jason Miu, linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	linux-mm@kvack.org, nh-open-source@amazon.com
Subject: Re: [PATCH] kho: add support for deferred struct page init
References: <20251216084913.86342-1-epetron@amazon.de>

On Tue, Dec 16, 2025 at 10:36:01AM -0500, Pasha Tatashin wrote:
> On Tue, Dec 16, 2025 at 10:19 AM Mike Rapoport wrote:
> >
> > On Tue, Dec 16, 2025 at 10:05:27AM -0500, Pasha Tatashin wrote:
> > > > > +static struct page *__init kho_get_preserved_page(phys_addr_t phys,
> > > > > +						  unsigned int order)
> > > > > +{
> > > > > +	unsigned long pfn = PHYS_PFN(phys);
> > > > > +	int nid = early_pfn_to_nid(pfn);
> > > > > +
> > > > > +	for (int i = 0; i < (1 << order); i++)
> > > > > +		init_deferred_page(pfn + i, nid);
> > > >
> > > > This will skip pages below node->first_deferred_pfn, we need to use
> > > > __init_page_from_nid() here.
> > >
> > > Mike, but those struct pages should be initialized early anyway.
> > > If they are not yet initialized we have a problem, as they are
> > > going to be re-initialized later.
> >
> > Can't say I understand your point. Which pages should be initialized
> > early?
>
> All pages below node->first_deferred_pfn.
>
> > And which pages will be reinitialized?
>
> kho_memory_init() is called after free_area_init() (which calls
> memmap_init_range() to initialize low-memory struct pages). So, if we
> use __init_page_from_nid() as suggested, we would be blindly running
> __init_single_page() again on those low-memory pages that
> memmap_init_range() already set up. This would cause double
> initialization and corruption due to losing the order information.
>
> > > > > +
> > > > > +	return pfn_to_page(pfn);
> > > > > +}
> > > > > +
> > > > >  static void __init deserialize_bitmap(unsigned int order,
> > > > >  				      struct khoser_mem_bitmap_ptr *elm)
> > > > >  {
> > > > > @@ -449,7 +466,7 @@ static void __init deserialize_bitmap(unsigned int order,
> > > > >  		int sz = 1 << (order + PAGE_SHIFT);
> > > > >  		phys_addr_t phys =
> > > > >  			elm->phys_start + (bit << (order + PAGE_SHIFT));
> > > > > -		struct page *page = phys_to_page(phys);
> > > > > +		struct page *page = kho_get_preserved_page(phys, order);
> > > >
> > > > I think it's better to initialize deferred struct pages later in
> > > > kho_restore_page. deserialize_bitmap() runs before SMP and it already does
> > >
> > > The KHO memory should still be accessible early in boot, right?
> >
> > The memory is accessible. And anyway, we should not use struct page
> > for preserved memory before kho_restore_{folio,pages}.
>
> This makes sense. What happens if someone calls kho_restore_folio()
> before deferred pages are initialized?

That's fine, because this memory is still memblock_reserve()ed and
deferred init skips reserved ranges.

There is, however, a problem with calls to kho_restore_{pages,folio}()
after memblock is gone, because we can't use early_pfn_to_nid() then.

I think we can start with Evangelos' approach that initializes struct
pages at deserialize time, and then we'll see how to optimize it.

-- 
Sincerely yours,
Mike.
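
For illustration only, a minimal consumer-side sketch of the
kho_restore_folio() call discussed in this thread. Only
kho_restore_folio() itself is the real API from the discussion; the
function name and the way the physical address is obtained here are
hypothetical.

	/*
	 * Sketch, not from the patch: a KHO consumer in the second kernel
	 * restoring a preserved folio.  phys is assumed to have been
	 * recovered from the subsystem's own KHO FDT node (hypothetical).
	 */
	#include <linux/kexec_handover.h>
	#include <linux/mm.h>

	static int __init example_restore_preserved_folio(phys_addr_t phys)
	{
		struct folio *folio;

		/*
		 * Per the discussion above, this is safe even before deferred
		 * struct page init completes, because the preserved range is
		 * still memblock_reserve()ed and deferred init skips reserved
		 * ranges.
		 */
		folio = kho_restore_folio(phys);
		if (!folio)
			return -ENOENT;

		/* The folio's struct pages are now initialized and usable. */
		pr_info("KHO example: restored folio of %zu bytes at %pa\n",
			folio_size(folio), &phys);
		return 0;
	}

The open question raised above is the same call issued after memblock is
torn down, when early_pfn_to_nid() is no longer available for the
deserialize-time initialization path.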