From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Alex Shi <alexs@kernel.org>,
	 Alexander Gordeev <agordeev@linux.ibm.com>,
	Andreas Larsson <andreas@gaisler.com>,
	 Borislav Petkov <bp@alien8.de>, Brian Cain <bcain@kernel.org>,
	 "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	 "David S. Miller" <davem@davemloft.net>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	 David Hildenbrand <david@kernel.org>,
	Dinh Nguyen <dinguyen@kernel.org>,
	 Geert Uytterhoeven <geert@linux-m68k.org>,
	Guo Ren <guoren@kernel.org>,  Heiko Carstens <hca@linux.ibm.com>,
	Helge Deller <deller@gmx.de>, Huacai Chen <chenhuacai@kernel.org>,
	 Ingo Molnar <mingo@redhat.com>,
	Johannes Berg <johannes@sipsolutions.net>,
	 John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
	Jonathan Corbet <corbet@lwn.net>,
	 Klara Modin <klarasmodin@gmail.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	 Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Magnus Lindholm <linmag7@gmail.com>,
	 Matt Turner <mattst88@gmail.com>,
	Max Filippov <jcmvbkbc@gmail.com>,
	 Michael Ellerman <mpe@ellerman.id.au>,
	Michal Hocko <mhocko@suse.com>, Michal Simek <monstr@monstr.eu>,
	 Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>,
	 Palmer Dabbelt <palmer@dabbelt.com>,
	Pratyush Yadav <pratyush@kernel.org>,
	 Richard Weinberger <richard@nod.at>,
	Russell King <linux@armlinux.org.uk>,
	 Stafford Horne <shorne@gmail.com>,
	Suren Baghdasaryan <surenb@google.com>,
	 Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	 Vasily Gorbik <gor@linux.ibm.com>,
	Vineet Gupta <vgupta@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,  Will Deacon <will@kernel.org>,
	x86@kernel.org, linux-alpha@vger.kernel.org,
	 linux-arm-kernel@lists.infradead.org,
	linux-csky@vger.kernel.org,  linux-cxl@vger.kernel.org,
	linux-doc@vger.kernel.org,  linux-hexagon@vger.kernel.org,
	linux-kernel@vger.kernel.org,  linux-m68k@lists.linux-m68k.org,
	linux-mips@vger.kernel.org,  linux-mm@kvack.org,
	linux-openrisc@vger.kernel.org,  linux-parisc@vger.kernel.org,
	linux-riscv@lists.infradead.org,  linux-s390@vger.kernel.org,
	linux-sh@vger.kernel.org,  linux-snps-arc@lists.infradead.org,
	linux-um@lists.infradead.org,  linuxppc-dev@lists.ozlabs.org,
	loongarch@lists.linux.dev,  sparclinux@vger.kernel.org
Subject: Re: [PATCH v3 24/29] arch, mm: consolidate initialization of SPARSE memory model
Date: Wed, 25 Feb 2026 23:08:38 +0530
Message-ID: <87seaohgf5.ritesh.list@gmail.com>
In-Reply-To: <aZ8idANginXzhf0_@kernel.org>

Mike Rapoport <rppt@kernel.org> writes:

> Hello Ritesh,
>
> On Wed, Feb 25, 2026 at 09:00:35AM +0530, Ritesh Harjani wrote:
>> Mike Rapoport <rppt@kernel.org> writes:
>> 
>> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>> >
>> > Every architecture calls sparse_init() during setup_arch() although the
>> > data structures created by sparse_init() are not used until the
>> > initialization of the core MM.
>> >
>> > Besides the code duplication, calling sparse_init() from architecture
>> > specific code causes ordering differences of vmemmap and HVO initialization
>> > on different architectures.
>> >
>> > Move the call to sparse_init() from architecture specific code to
>> > free_area_init() to ensure that vmemmap and HVO initialization order is
>> > always the same.
>> >
>> 
>> Hello Mike,
>> 
>> [    0.000000][    T0] ------------[ cut here ]------------
>> [    0.000000][    T0] WARNING: arch/powerpc/include/asm/io.h:879 at virt_to_phys+0x44/0x1b8, CPU#0: swapper/0
>> [    0.000000][    T0] Modules linked in:
>> [    0.000000][    T0] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.19.0-12139-gc57b1c00145a #31 PREEMPT
>> [    0.000000][    T0] Hardware name: IBM pSeries (emulated by qemu) POWER10 (architected) 0x801200 0xf000006 of:SLOF,git-ee03ae pSeries
>> [    0.000000][    T0] NIP:  c000000000601584 LR: c000000004075de4 CTR: c000000000601548
>> [    0.000000][    T0] REGS: c000000004d1f870 TRAP: 0700   Not tainted  (6.19.0-12139-gc57b1c00145a)
>> [    0.000000][    T0] MSR:  8000000000021033 <SF,ME,IR,DR,RI,LE>  CR: 48022448  XER: 20040000
>> [    0.000000][    T0] CFAR: c0000000006016c4 IRQMASK: 1
>> [    0.000000][    T0] GPR00: c000000004075dd4 c000000004d1fb10 c00000000304bb00 c000000180000000
>> [    0.000000][    T0] GPR04: 0000000000000009 0000000000000009 c000000004ec94a0 0000000000000000
>> [    0.000000][    T0] GPR08: 0000000000018000 0000000000000001 c000000004921280 0000000048022448
>> [    0.000000][    T0] GPR12: c000000000601548 c000000004fe0000 0000000000000004 0000000000000004
>> [    0.000000][    T0] GPR16: 000000000287fb08 0000000000000060 0000000000000002 0000000002831750
>> [    0.000000][    T0] GPR20: 0000000002831778 fffffffffffffffd c000000004d78050 00000000051cbb00
>> [    0.000000][    T0] GPR24: 0000000005a40008 c000000000000000 c000000000400000 0000000000000100
>> [    0.000000][    T0] GPR28: c000000004d78050 0000000000000000 c000000004ecd4a8 0000000000000001
>> [    0.000000][    T0] NIP [c000000000601584] virt_to_phys+0x44/0x1b8
>> [    0.000000][    T0] LR [c000000004075de4] alloc_bootmem+0x144/0x1a8
>> [    0.000000][    T0] Call Trace:
>> [    0.000000][    T0] [c000000004d1fb50] [c000000004075dd4] alloc_bootmem+0x134/0x1a8
>> [    0.000000][    T0] [c000000004d1fba0] [c000000004075fac] __alloc_bootmem_huge_page+0x164/0x230
>> [    0.000000][    T0] [c000000004d1fbe0] [c000000004030bc4] alloc_bootmem_huge_page+0x44/0x138
>> [    0.000000][    T0] [c000000004d1fc10] [c000000004076e48] hugetlb_hstate_alloc_pages+0x350/0x5ac
>> [    0.000000][    T0] [c000000004d1fd30] [c0000000040782f0] hugetlb_bootmem_alloc+0x15c/0x19c
>> [    0.000000][    T0] [c000000004d1fd70] [c00000000406d7b4] mm_core_init_early+0x7c/0xdf4
>> [    0.000000][    T0] [c000000004d1ff30] [c000000004011d84] start_kernel+0xac/0xc58
>> [    0.000000][    T0] [c000000004d1ffe0] [c00000000000e99c] start_here_common+0x1c/0x20
>> [    0.000000][    T0] Code: 6129ffff 792907c6 6529ffff 6129ffff 7c234840 40810018 3d2201e8 3929a7a8 e9290000 7c291840 41810044 3be00001 <0b1f0000> 3d20bfff 6129ffff 792907c6
>> 
>> 
>> I think this is happening because, in mm_core_init_early(), the
>> order of initialization between hugetlb_bootmem_alloc() and
>> free_area_init() is now reversed, and free_area_init() -> sparse_init()
>> is responsible for setting up the sections and the vmemmap area.
>> 
>> Then in alloc_bootmem() (from hugetlb_bootmem_alloc() path), it uses virt_to_phys(m)...
>> 
>> 			/*
>> 			 * For pre-HVO to work correctly, pages need to be on
>> 			 * the list for the node they were actually allocated
>> 			 * from. That node may be different in the case of
>> 			 * fallback by memblock_alloc_try_nid_raw. So,
>> 			 * extract the actual node first.
>> 			 */
>> 			if (m)
>> 				listnode = early_pfn_to_nid(PHYS_PFN(virt_to_phys(m)));
>> 
>> 
>> ... virt_to_phys on powerpc uses:
>> 
>> static inline unsigned long virt_to_phys(const volatile void * address)
>> {
>> 	WARN_ON(IS_ENABLED(CONFIG_DEBUG_VIRTUAL) && !virt_addr_valid(address));
>> 
>> 	return __pa((unsigned long)address);
>> }
>> 
>> #define virt_addr_valid(vaddr)	({					\
>> 	unsigned long _addr = (unsigned long)vaddr;			\
>> 	_addr >= PAGE_OFFSET && _addr < (unsigned long)high_memory &&	\
>> 	pfn_valid(virt_to_pfn((void *)_addr));				\
>> })
>> 
>> 
>> I think the warning above is printed by this WARN_ON, i.e. because
>> pfn_valid() returns false, since sparse_init() has not run yet.
>
> Yes, I agree.
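
A minimal userspace sketch of the ordering issue described above, not kernel code. It assumes a linear map at a fixed offset and models sparse_init() as a single flag; every model_* name and MODEL_* constant is hypothetical, and the real powerpc __pa() and pfn_valid() are more involved. It only illustrates that the checked conversion warns before the section data exists, while the plain offset arithmetic returns the same physical address either way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not kernel APIs. */
#define MODEL_PAGE_OFFSET 0xc000000000000000ULL
#define MODEL_PAGE_SHIFT  16	/* assumption: 64K pages */
#define MODEL_HIGH_MEMORY (MODEL_PAGE_OFFSET + 0x200000000ULL) /* assumption: 8G of RAM */

static bool model_sections_ready;	/* true once the modeled sparse_init() has run */
static unsigned int model_warnings;	/* counts simulated WARN_ON() hits */

/* Stand-in for sparse_init(): afterwards section lookups succeed. */
static void model_sparse_init(void)
{
	model_sections_ready = true;
}

/* Stand-in for __pa(): plain linear-map arithmetic, no validity check. */
static uint64_t model_pa(uint64_t vaddr)
{
	return vaddr - MODEL_PAGE_OFFSET;
}

/* Stand-in for pfn_valid(): fails for every pfn until sections exist. */
static bool model_pfn_valid(uint64_t pfn)
{
	(void)pfn;
	return model_sections_ready;
}

/* Mirrors the shape of the virt_addr_valid() macro quoted above. */
static bool model_virt_addr_valid(uint64_t vaddr)
{
	return vaddr >= MODEL_PAGE_OFFSET && vaddr < MODEL_HIGH_MEMORY &&
	       model_pfn_valid(model_pa(vaddr) >> MODEL_PAGE_SHIFT);
}

/* Stand-in for virt_to_phys() with the debug check enabled. */
static uint64_t model_virt_to_phys(uint64_t vaddr)
{
	if (!model_virt_addr_valid(vaddr))
		model_warnings++;	/* the real code would WARN_ON() here */
	return model_pa(vaddr);
}
```

Called with m = 0xc000000180000000 before model_sparse_init(), model_virt_to_phys() counts one warning yet returns the same value as model_pa(m); once model_sparse_init() has run, no further warning is counted.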
>  
>> So, what I wanted to check was: do you think that, instead of
>> virt_to_phys(), we could directly use __pa() here in mm/hugetlb.c,
>> since these are memblock-allocated addresses? i.e.:
>> 
>> // alloc_bootmem():
>> -   listnode = early_pfn_to_nid(PHYS_PFN(virt_to_phys(m)));
>> +   listnode = early_pfn_to_nid(PHYS_PFN(__pa(m)));
>> 
>> // __alloc_bootmem_huge_page():
>> -   memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
>> +   memblock_reserved_mark_noinit(__pa((void *)m + PAGE_SIZE),
>
> It surely will work for powerpc :)
> I checked the definitions of __pa() on other architectures and it seems the
> safest and the easiest way to fix this.
>  
> Would you send a formal patch?
>

Thanks, Mike, for taking a look at the above and confirming. Sure, let me
prepare the patch and send it by tomorrow.

-ritesh
