From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 25 Aug 2025 19:17:02 +0300
From: Mike Rapoport <rppt@kernel.org>
To: David Hildenbrand
Cc: Mika Penttilä, linux-kernel@vger.kernel.org, Alexander Potapenko,
 Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
 Dmitry Vyukov, dri-devel@lists.freedesktop.org,
 intel-gfx@lists.freedesktop.org, iommu@lists.linux.dev,
 io-uring@vger.kernel.org, Jason Gunthorpe, Jens Axboe, Johannes Weiner,
 John Hubbard, kasan-dev@googlegroups.com, kvm@vger.kernel.org,
 "Liam R. Howlett", Linus Torvalds, linux-arm-kernel@axis.com,
 linux-arm-kernel@lists.infradead.org, linux-crypto@vger.kernel.org,
 linux-ide@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-mips@vger.kernel.org, linux-mmc@vger.kernel.org, linux-mm@kvack.org,
 linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
 linux-scsi@vger.kernel.org, Lorenzo Stoakes, Marco Elver,
 Marek Szyprowski, Michal Hocko, Muchun Song, netdev@vger.kernel.org,
 Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
 virtualization@lists.linux.dev, Vlastimil Babka, wireguard@lists.zx2c4.com,
 x86@kernel.org, Zi Yan
Subject: Re: [PATCH RFC 10/35] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
References: <20250821200701.1329277-1-david@redhat.com>
 <20250821200701.1329277-11-david@redhat.com>
 <9156d191-9ec4-4422-bae9-2e8ce66f9d5e@redhat.com>
 <7077e09f-6ce9-43ba-8f87-47a290680141@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
On Mon, Aug 25, 2025 at 05:42:33PM +0200, David Hildenbrand wrote:
> On 25.08.25 16:59, Mike Rapoport wrote:
> > On Mon, Aug 25, 2025 at 04:38:03PM +0200, David Hildenbrand wrote:
> > > On 25.08.25 16:32, Mike Rapoport wrote:
> > > > On Mon, Aug 25, 2025 at 02:48:58PM +0200, David Hildenbrand wrote:
> > > > > On 23.08.25 10:59, Mike Rapoport wrote:
> > > > > > On Fri, Aug 22, 2025 at 08:24:31AM +0200, David Hildenbrand wrote:
> > > > > > > On 22.08.25 06:09, Mika Penttilä wrote:
> > > > > > > >
> > > > > > > > On 8/21/25 23:06, David Hildenbrand wrote:
> > > > > > > >
> > > > > > > > > All pages were already initialized and set to PageReserved() with a
> > > > > > > > > refcount of 1 by MM init code.
> > > > > > > >
> > > > > > > > Just to be sure, how is this working with MEMBLOCK_RSRV_NOINIT, where
> > > > > > > > MM is supposed not to initialize struct pages?
> > > > > > >
> > > > > > > Excellent point, I did not know about that one.
> > > > > > >
> > > > > > > Spotting that we don't do the same for the head page made me assume that
> > > > > > > it's just a misuse of __init_single_page().
> > > > > > >
> > > > > > > But the nasty thing is that we use memblock_reserved_mark_noinit() to only
> > > > > > > mark the tail pages ...
> > > > > >
> > > > > > And the even nastier thing is that when CONFIG_DEFERRED_STRUCT_PAGE_INIT
> > > > > > is disabled, struct pages are initialized regardless of
> > > > > > memblock_reserved_mark_noinit().
> > > > > >
> > > > > > I think this patch should go in before your updates:
> > > > >
> > > > > Shouldn't we fix this in memblock code?
> > > > >
> > > > > Hacking around that in the memblock_reserved_mark_noinit() user sounds wrong
> > > > > -- and nothing in the doc of memblock_reserved_mark_noinit() spells that
> > > > > behavior out.
> > > >
> > > > We can surely update the docs, but unfortunately I don't see how to avoid
> > > > hacking around it in hugetlb.
> > > > Since it's used to optimise HVO even further, to the point that hugetlb
> > > > open codes memmap initialization, I think it's fair that it should deal
> > > > with all possible configurations.
> > >
> > > Remind me, why can't we support memblock_reserved_mark_noinit() when
> > > CONFIG_DEFERRED_STRUCT_PAGE_INIT is disabled?
> >
> > When CONFIG_DEFERRED_STRUCT_PAGE_INIT is disabled we initialize the entire
> > memmap early (setup_arch()->free_area_init()), and we may have a bunch of
> > memblock_reserved_mark_noinit() calls afterwards.
> 
> Oh, you mean that we get effective memblock modifications after already
> initializing the memmap.
> 
> That sounds ... interesting :)

It's the memmap, not the free lists. Without deferred init, memblock is still
active for a while after the memmap is initialized and before the memory goes
to the free lists.

> So yeah, we have to document this for memblock_reserved_mark_noinit().
> 
> Is it also a problem for kexec_handover?

With KHO it's also interesting, but it does not support deferred struct page
init for now :)

> We should do something like:
> 
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 154f1d73b61f2..ed4c563d72c32 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1091,13 +1091,16 @@ int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
>  /**
>   * memblock_reserved_mark_noinit - Mark a reserved memory region with flag
> - * MEMBLOCK_RSRV_NOINIT which results in the struct pages not being initialized
> - * for this region.
> + * MEMBLOCK_RSRV_NOINIT which allows for the "struct pages" corresponding
> + * to this region not getting initialized, because the caller will take
> + * care of it.
>   * @base: the base phys addr of the region
>   * @size: the size of the region
>   *
> - * struct pages will not be initialized for reserved memory regions marked with
> - * %MEMBLOCK_RSRV_NOINIT.
> + * "struct pages" will not be initialized for reserved memory regions marked
> + * with %MEMBLOCK_RSRV_NOINIT if this function is called before initialization
> + * code runs. Without CONFIG_DEFERRED_STRUCT_PAGE_INIT, it is more likely
> + * that this function is not effective.
>   *
>   * Return: 0 on success, -errno on failure.
>   */

I have a different version :)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index b96746376e17..d20d091c6343 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -40,8 +40,9 @@ extern unsigned long long max_possible_pfn;
  * via a driver, and never indicated in the firmware-provided memory map as
  * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the
  * kernel resource tree.
- * @MEMBLOCK_RSRV_NOINIT: memory region for which struct pages are
- * not initialized (only for reserved regions).
+ * @MEMBLOCK_RSRV_NOINIT: memory region for which struct pages don't have
+ * PG_Reserved set and are completely not initialized when
+ * %CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled (only for reserved regions).
  * @MEMBLOCK_RSRV_KERN: memory region that is reserved for kernel use,
  * either explictitly with memblock_reserve_kern() or via memblock
  * allocation APIs. All memblock allocations set this flag.
diff --git a/mm/memblock.c b/mm/memblock.c
index 154f1d73b61f..02de5ffb085b 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1091,13 +1091,15 @@ int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
 /**
  * memblock_reserved_mark_noinit - Mark a reserved memory region with flag
- * MEMBLOCK_RSRV_NOINIT which results in the struct pages not being initialized
- * for this region.
+ * MEMBLOCK_RSRV_NOINIT
+ *
  * @base: the base phys addr of the region
  * @size: the size of the region
  *
- * struct pages will not be initialized for reserved memory regions marked with
- * %MEMBLOCK_RSRV_NOINIT.
+ * The struct pages for the reserved regions marked %MEMBLOCK_RSRV_NOINIT will
+ * not have the %PG_Reserved flag set.
+ * When %CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, setting this flag also
+ * completely bypasses the initialization of struct pages for this region.
  *
  * Return: 0 on success, -errno on failure.
  */

> Optimizing the hugetlb code could be done, but I am not sure how high
> the priority is (nobody complained so far about the double init).
> 
> -- 
> Cheers
> 
> David / dhildenb

-- 
Sincerely yours,
Mike.