All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Nathan Chancellor <nathan@kernel.org>
Cc: Usama Arif <usama.arif@bytedance.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, muchun.song@linux.dev,
	songmuchun@bytedance.com, fam.zheng@bytedance.com,
	liangma@liangbit.com, punit.agrawal@bytedance.com,
	Konrad Dybcio <konrad.dybcio@linaro.org>,
	llvm@lists.linux.dev
Subject: Re: [PATCH] mm: hugetlb: Only prep and add allocated folios for non-gigantic pages
Date: Thu, 12 Oct 2023 17:12:03 -0700	[thread overview]
Message-ID: <20231013001203.GA3812@monkey> (raw)
In-Reply-To: <20231012145318.GA5127@monkey>

[-- Attachment #1: Type: text/plain, Size: 2825 bytes --]

On 10/12/23 07:53, Mike Kravetz wrote:
> On 10/11/23 17:03, Nathan Chancellor wrote:
> > On Mon, Oct 09, 2023 at 06:23:45PM -0700, Mike Kravetz wrote:
> > > On 10/09/23 15:56, Usama Arif wrote:
> > 
> > I suspect the crash that our continuous integration spotted [1] is the
> > same issue that Konrad is seeing, as I have bisected that failure to
> > bfb41d6b2fe1 in next-20231009. However, neither the first half of your
> > diff (since the second half does not apply at bfb41d6b2fe1) nor the
> > original patch in this thread resolves the issue though, so maybe it is
> > entirely different from Konrad's?
> > 
> > For what it's worth, this issue is only visible for me when building for
> > arm64 using LLVM with CONFIG_INIT_STACK_NONE=y, instead of the default
> > CONFIG_INIT_STACK_ALL_ZERO=y (which appears to hide the problem?),
> > making it seem like it could be something with uninitialized memory... I
> > have not been able to reproduce it with GCC, which could also mean
> > something.
> 
> Thank you Nathan!  That is very helpful.
> 
> I will use this information to try and recreate.  If I can recreate, I
> should be able to get to root cause.

I could easily recreate the issue using the provided instructions.  First
thing I did was add a few printk's to check/verify state.  The beginning
of gather_bootmem_prealloc looked like this:

static void __init gather_bootmem_prealloc(void)
{
	LIST_HEAD(folio_list);
	struct huge_bootmem_page *m;
	struct hstate *h, *prev_h = NULL;

	if (list_empty(&huge_boot_pages))
		printk("gather_bootmem_prealloc: huge_boot_pages list empty\n");

	list_for_each_entry(m, &huge_boot_pages, list) {
		struct page *page = virt_to_page(m);
		struct folio *folio = (void *)page;

		printk("gather_bootmem_prealloc: loop entry m %lx\n",
							(unsigned long)m);

The STRANGE thing is that the printk after testing for list_empty would
print, then we would enter the 'list_for_each_entry()' loop as if the list
was not empty.  This is the cause of the addressing exception.  m pointed
to the list head as opposed to an entry on the list.

I have attached disassembly of gather_bootmem_prealloc with INIT_STACK_NONE
and INIT_STACK_ALL_ZERO.  disassembly listings are for code without
printks.

This is the first time I have looked at arm assembly, so I may be missing
something.  However, in the INIT_STACK_NONE case it looks like we get the
address of huge_boot_pages into a register but do not use it to determine
if we should execute the loop.  Code generated with INIT_STACK_ALL_ZERO seems
to show code checking the list before entering the loop.

Can someone with more arm assembly experience take a quick look?  Since
huge_boot_pages is a global variable rather than on the stack, I can't
see how INIT_STACK_ALL_ZERO/INIT_STACK_NONE could make a difference.
-- 
Mike Kravetz

[-- Attachment #2: disass_INIT_STACK_NONE --]
[-- Type: text/plain, Size: 9882 bytes --]

Dump of assembler code for function gather_bootmem_prealloc:
mm/hugetlb.c:
3292	{
   0xffff800081ae0f08 <+0>:	d503233f	paciasp
   0xffff800081ae0f0c <+4>:	d10203ff	sub	sp, sp, #0x80
   0xffff800081ae0f10 <+8>:	a9027bfd	stp	x29, x30, [sp, #32]
   0xffff800081ae0f14 <+12>:	a9036ffc	stp	x28, x27, [sp, #48]
   0xffff800081ae0f18 <+16>:	a90467fa	stp	x26, x25, [sp, #64]
   0xffff800081ae0f1c <+20>:	a9055ff8	stp	x24, x23, [sp, #80]
   0xffff800081ae0f20 <+24>:	a90657f6	stp	x22, x21, [sp, #96]
   0xffff800081ae0f24 <+28>:	a9074ff4	stp	x20, x19, [sp, #112]
   0xffff800081ae0f28 <+32>:	910083fd	add	x29, sp, #0x20
   0xffff800081ae0f2c <+36>:	d5384108	mrs	x8, sp_el0

3294		struct huge_bootmem_page *m;
3295		struct hstate *h, *prev_h = NULL;
3296	
3297		list_for_each_entry(m, &huge_boot_pages, list) {
   0xffff800081ae0f30 <+40>:	f00007a9	adrp	x9, 0xffff800081bd7000 <new_log_buf_len>
   0xffff800081ae0f34 <+44>:	f9423d08	ldr	x8, [x8, #1144]
   0xffff800081ae0f38 <+48>:	aa1f03e0	mov	x0, xzr
   0xffff800081ae0f3c <+52>:	910023f5	add	x21, sp, #0x8
   0xffff800081ae0f40 <+56>:	d2e00036	mov	x22, #0x1000000000000       	// #281474976710656
   0xffff800081ae0f44 <+60>:	b25657f7	mov	x23, #0xfffffc0000000000    	// #-4398046511104
   0xffff800081ae0f48 <+64>:	52802018	mov	w24, #0x100                 	// #256
   0xffff800081ae0f4c <+68>:	f81f83a8	stur	x8, [x29, #-8]
   0xffff800081ae0f50 <+72>:	5280003a	mov	w26, #0x1                   	// #1
   0xffff800081ae0f54 <+76>:	f946dd39	ldr	x25, [x9, #3512]
   0xffff800081ae0f58 <+80>:	d503201f	nop
   0xffff800081ae0f5c <+84>:	107b72fb	adr	x27, 0xffff800081bd7db8 <huge_boot_pages>

3293		LIST_HEAD(folio_list);
   0xffff800081ae0f60 <+88>:	a900d7f5	stp	x21, x21, [sp, #8]

3298			struct page *page = virt_to_page(m);
3299			struct folio *folio = (void *)page;
3300	
3301			h = m->hstate;
   0xffff800081ae0f64 <+92>:	f9400b33	ldr	x19, [x25, #16]

3302			/*
3303			 * It is possible to have multiple huge page sizes (hstates)
3304			 * in this list.  If so, process each size separately.
3305			 */
3306			if (h != prev_h && prev_h != NULL)
   0xffff800081ae0f68 <+96>:	b40000a0	cbz	x0, 0xffff800081ae0f7c <gather_bootmem_prealloc+116>
   0xffff800081ae0f6c <+100>:	eb00027f	cmp	x19, x0
   0xffff800081ae0f70 <+104>:	54000060	b.eq	0xffff800081ae0f7c <gather_bootmem_prealloc+116>  // b.none

3307				prep_and_add_allocated_folios(prev_h, &folio_list);
   0xffff800081ae0f74 <+108>:	910023e1	add	x1, sp, #0x8
   0xffff800081ae0f78 <+112>:	979ecd63	bl	0xffff800080294504 <prep_and_add_allocated_folios>
   0xffff800081ae0f7c <+116>:	8b160328	add	x8, x25, x22
   0xffff800081ae0f80 <+120>:	d34cfd08	lsr	x8, x8, #12
   0xffff800081ae0f84 <+124>:	8b081af4	add	x20, x23, x8, lsl #6

./include/linux/atomic/atomic-arch-fallback.h:
444		return arch_atomic_read(v);
   0xffff800081ae0f88 <+128>:	b9403688	ldr	w8, [x20, #52]

mm/hugetlb.c:
3311			WARN_ON(folio_ref_count(folio) != 1);
   0xffff800081ae0f8c <+132>:	7100051f	cmp	w8, #0x1
   0xffff800081ae0f90 <+136>:	54000581	b.ne	0xffff800081ae1040 <gather_bootmem_prealloc+312>  // b.any

3312	
3313			hugetlb_folio_init_vmemmap(folio, h,
   0xffff800081ae0f94 <+140>:	aa1403e0	mov	x0, x20
   0xffff800081ae0f98 <+144>:	aa1303e1	mov	x1, x19
   0xffff800081ae0f9c <+148>:	940001a2	bl	0xffff800081ae1624 <hugetlb_folio_init_vmemmap>

./arch/arm64/include/asm/alternative-macros.h:
232		asm_volatile_goto(
   0xffff800081ae0fa0 <+152>:	1400002a	b	0xffff800081ae1048 <gather_bootmem_prealloc+320>

./arch/arm64/include/asm/atomic_lse.h:
132	ATOMIC64_OP(or, stset)
   0xffff800081ae0fa4 <+156>:	91010288	add	x8, x20, #0x40
   0xffff800081ae0fa8 <+160>:	f838311f	stset	x24, [x8]

mm/hugetlb.c:
1969		INIT_LIST_HEAD(&folio->lru);
   0xffff800081ae0fac <+164>:	9100229c	add	x28, x20, #0x8

./include/linux/list.h:
37		WRITE_ONCE(list->next, list);
   0xffff800081ae0fb0 <+168>:	f900069c	str	x28, [x20, #8]

38		WRITE_ONCE(list->prev, list);
   0xffff800081ae0fb4 <+172>:	f9000a9c	str	x28, [x20, #16]

./include/linux/hugetlb.h:
753		folio->_hugetlb_subpool = subpool;
   0xffff800081ae0fb8 <+176>:	f9004a9f	str	xzr, [x20, #144]

./include/asm-generic/bitops/generic-non-atomic.h:
128		return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
   0xffff800081ae0fbc <+180>:	f9400288	ldr	x8, [x20]

./include/linux/mm.h:
1070		if (!folio_test_large(folio))
   0xffff800081ae0fc0 <+184>:	363000a8	tbz	w8, #6, 0xffff800081ae0fd4 <gather_bootmem_prealloc+204>

./include/linux/hugetlb_cgroup.h:
94		if (folio_order(folio) < HUGETLB_CGROUP_MIN_ORDER)
   0xffff800081ae0fc4 <+188>:	39410288	ldrb	w8, [x20, #64]
   0xffff800081ae0fc8 <+192>:	721f191f	tst	w8, #0xfe
   0xffff800081ae0fcc <+196>:	54000040	b.eq	0xffff800081ae0fd4 <gather_bootmem_prealloc+204>  // b.none

98		else
99			folio->_hugetlb_cgroup = h_cg;
   0xffff800081ae0fd0 <+200>:	f9004e9f	str	xzr, [x20, #152]

./include/asm-generic/bitops/generic-non-atomic.h:
128		return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
   0xffff800081ae0fd4 <+204>:	f9400288	ldr	x8, [x20]

./include/linux/mm.h:
1070		if (!folio_test_large(folio))
   0xffff800081ae0fd8 <+208>:	363000a8	tbz	w8, #6, 0xffff800081ae0fec <gather_bootmem_prealloc+228>

./include/linux/hugetlb_cgroup.h:
94		if (folio_order(folio) < HUGETLB_CGROUP_MIN_ORDER)
   0xffff800081ae0fdc <+212>:	39410288	ldrb	w8, [x20, #64]
   0xffff800081ae0fe0 <+216>:	721f191f	tst	w8, #0xfe
   0xffff800081ae0fe4 <+220>:	54000040	b.eq	0xffff800081ae0fec <gather_bootmem_prealloc+228>  // b.none

95			return;
96		if (rsvd)
97			folio->_hugetlb_cgroup_rsvd = h_cg;
   0xffff800081ae0fe8 <+224>:	f900529f	str	xzr, [x20, #160]

./include/asm-generic/bitops/generic-non-atomic.h:
128		return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
   0xffff800081ae0fec <+228>:	f9401688	ldr	x8, [x20, #40]

mm/hugetlb.c:
3317			if (!HPageVmemmapOptimized(&folio->page))
   0xffff800081ae0ff0 <+232>:	362001c8	tbz	w8, #4, 0xffff800081ae1028 <gather_bootmem_prealloc+288>

./include/linux/list.h:
169		__list_add(new, head, head->next);
   0xffff800081ae0ff4 <+236>:	f94007e8	ldr	x8, [sp, #8]

mm/hugetlb.c:
3328			adjust_managed_page_count(page, pages_per_huge_page(h));
   0xffff800081ae0ff8 <+240>:	aa1403e0	mov	x0, x20

./include/linux/list.h:
153		next->prev = new;
   0xffff800081ae0ffc <+244>:	f900051c	str	x28, [x8, #8]

154		new->next = next;
   0xffff800081ae1000 <+248>:	a900d688	stp	x8, x21, [x20, #8]

155		new->prev = prev;
156		WRITE_ONCE(prev->next, new);
   0xffff800081ae1004 <+252>:	f90007fc	str	x28, [sp, #8]

./include/linux/hugetlb.h:
808		return 1 << h->order;
   0xffff800081ae1008 <+256>:	b9402a68	ldr	w8, [x19, #40]
   0xffff800081ae100c <+260>:	9ac82341	lsl	x1, x26, x8

mm/hugetlb.c:
3328			adjust_managed_page_count(page, pages_per_huge_page(h));
   0xffff800081ae1010 <+264>:	979e566a	bl	0xffff8000802769b8 <adjust_managed_page_count>

3297		list_for_each_entry(m, &huge_boot_pages, list) {
   0xffff800081ae1014 <+268>:	f9400339	ldr	x25, [x25]
   0xffff800081ae1018 <+272>:	aa1303e0	mov	x0, x19
   0xffff800081ae101c <+276>:	eb1b033f	cmp	x25, x27
   0xffff800081ae1020 <+280>:	54fffa21	b.ne	0xffff800081ae0f64 <gather_bootmem_prealloc+92>  // b.any
   0xffff800081ae1024 <+284>:	14000011	b	0xffff800081ae1068 <gather_bootmem_prealloc+352>

./include/linux/hugetlb.h:
808		return 1 << h->order;
   0xffff800081ae1028 <+288>:	b9402a68	ldr	w8, [x19, #40]

mm/hugetlb.c:
3318				hugetlb_folio_init_tail_vmemmap(folio,
   0xffff800081ae102c <+292>:	aa1403e0	mov	x0, x20
   0xffff800081ae1030 <+296>:	52800801	mov	w1, #0x40                  	// #64

./include/linux/hugetlb.h:
808		return 1 << h->order;
   0xffff800081ae1034 <+300>:	9ac82342	lsl	x2, x26, x8

mm/hugetlb.c:
3318				hugetlb_folio_init_tail_vmemmap(folio,
   0xffff800081ae1038 <+304>:	940001ae	bl	0xffff800081ae16f0 <hugetlb_folio_init_tail_vmemmap>
   0xffff800081ae103c <+308>:	17ffffee	b	0xffff800081ae0ff4 <gather_bootmem_prealloc+236>
   0xffff800081ae1040 <+312>:	d4210000	brk	#0x800
   0xffff800081ae1044 <+316>:	17ffffd4	b	0xffff800081ae0f94 <gather_bootmem_prealloc+140>

./arch/arm64/include/asm/atomic_ll_sc.h:
203	ATOMIC64_OPS(or, orr, L)
   0xffff800081ae1048 <+320>:	d503249f	bti	j
   0xffff800081ae104c <+324>:	91010288	add	x8, x20, #0x40
   0xffff800081ae1050 <+328>:	f9800111	prfm	pstl1strm, [x8]
   0xffff800081ae1054 <+332>:	c85f7d09	ldxr	x9, [x8]
   0xffff800081ae1058 <+336>:	b2780129	orr	x9, x9, #0x100
   0xffff800081ae105c <+340>:	c80a7d09	stxr	w10, x9, [x8]
   0xffff800081ae1060 <+344>:	35ffffaa	cbnz	w10, 0xffff800081ae1054 <gather_bootmem_prealloc+332>
   0xffff800081ae1064 <+348>:	17ffffd2	b	0xffff800081ae0fac <gather_bootmem_prealloc+164>

mm/hugetlb.c:
3332		prep_and_add_allocated_folios(h, &folio_list);
   0xffff800081ae1068 <+352>:	910023e1	add	x1, sp, #0x8
   0xffff800081ae106c <+356>:	aa1303e0	mov	x0, x19
   0xffff800081ae1070 <+360>:	979ecd25	bl	0xffff800080294504 <prep_and_add_allocated_folios>
   0xffff800081ae1074 <+364>:	d5384108	mrs	x8, sp_el0
   0xffff800081ae1078 <+368>:	f9423d08	ldr	x8, [x8, #1144]
   0xffff800081ae107c <+372>:	f85f83a9	ldur	x9, [x29, #-8]
   0xffff800081ae1080 <+376>:	eb09011f	cmp	x8, x9
   0xffff800081ae1084 <+380>:	54000141	b.ne	0xffff800081ae10ac <gather_bootmem_prealloc+420>  // b.any

3333	}
   0xffff800081ae1088 <+384>:	a9474ff4	ldp	x20, x19, [sp, #112]
   0xffff800081ae108c <+388>:	a94657f6	ldp	x22, x21, [sp, #96]
   0xffff800081ae1090 <+392>:	a9455ff8	ldp	x24, x23, [sp, #80]
   0xffff800081ae1094 <+396>:	a94467fa	ldp	x26, x25, [sp, #64]
   0xffff800081ae1098 <+400>:	a9436ffc	ldp	x28, x27, [sp, #48]
   0xffff800081ae109c <+404>:	a9427bfd	ldp	x29, x30, [sp, #32]
   0xffff800081ae10a0 <+408>:	910203ff	add	sp, sp, #0x80
   0xffff800081ae10a4 <+412>:	d50323bf	autiasp
   0xffff800081ae10a8 <+416>:	d65f03c0	ret
   0xffff800081ae10ac <+420>:	97d6228a	bl	0xffff800081069ad4 <__stack_chk_fail>
End of assembler dump.

[-- Attachment #3: disass_INIT_STACK_ALL_ZERO --]
[-- Type: text/plain, Size: 10136 bytes --]

Dump of assembler code for function gather_bootmem_prealloc:
mm/hugetlb.c:
3292	{
   0xffff800081b0111c <+0>:	d503233f	paciasp
   0xffff800081b01120 <+4>:	d10203ff	sub	sp, sp, #0x80
   0xffff800081b01124 <+8>:	a9027bfd	stp	x29, x30, [sp, #32]
   0xffff800081b01128 <+12>:	a9036ffc	stp	x28, x27, [sp, #48]
   0xffff800081b0112c <+16>:	a90467fa	stp	x26, x25, [sp, #64]
   0xffff800081b01130 <+20>:	a9055ff8	stp	x24, x23, [sp, #80]
   0xffff800081b01134 <+24>:	a90657f6	stp	x22, x21, [sp, #96]
   0xffff800081b01138 <+28>:	a9074ff4	stp	x20, x19, [sp, #112]
   0xffff800081b0113c <+32>:	910083fd	add	x29, sp, #0x20
   0xffff800081b01140 <+36>:	d5384108	mrs	x8, sp_el0

3294		struct huge_bootmem_page *m;
3295		struct hstate *h, *prev_h = NULL;
3296	
3297		list_for_each_entry(m, &huge_boot_pages, list) {
   0xffff800081b01144 <+40>:	d503201f	nop
   0xffff800081b01148 <+44>:	107b6395	adr	x21, 0xffff800081bf7db8 <huge_boot_pages>
   0xffff800081b0114c <+48>:	f9423d08	ldr	x8, [x8, #1144]
   0xffff800081b01150 <+52>:	910023f6	add	x22, sp, #0x8
   0xffff800081b01154 <+56>:	f81f83a8	stur	x8, [x29, #-8]
   0xffff800081b01158 <+60>:	f94002b7	ldr	x23, [x21]

3293		LIST_HEAD(folio_list);
   0xffff800081b0115c <+64>:	a900dbf6	stp	x22, x22, [sp, #8]

3294		struct huge_bootmem_page *m;
3295		struct hstate *h, *prev_h = NULL;
3296	
3297		list_for_each_entry(m, &huge_boot_pages, list) {
   0xffff800081b01160 <+68>:	eb1502ff	cmp	x23, x21
   0xffff800081b01164 <+72>:	540008e0	b.eq	0xffff800081b01280 <gather_bootmem_prealloc+356>  // b.none
   0xffff800081b01168 <+76>:	aa1f03e0	mov	x0, xzr
   0xffff800081b0116c <+80>:	d2e00038	mov	x24, #0x1000000000000       	// #281474976710656
   0xffff800081b01170 <+84>:	b25657f9	mov	x25, #0xfffffc0000000000    	// #-4398046511104
   0xffff800081b01174 <+88>:	5280201a	mov	w26, #0x100                 	// #256
   0xffff800081b01178 <+92>:	5280003b	mov	w27, #0x1                   	// #1

3298			struct page *page = virt_to_page(m);
3299			struct folio *folio = (void *)page;
3300	
3301			h = m->hstate;
   0xffff800081b0117c <+96>:	f9400af3	ldr	x19, [x23, #16]

3302			/*
3303			 * It is possible to have multiple huge page sizes (hstates)
3304			 * in this list.  If so, process each size separately.
3305			 */
3306			if (h != prev_h && prev_h != NULL)
   0xffff800081b01180 <+100>:	b40000a0	cbz	x0, 0xffff800081b01194 <gather_bootmem_prealloc+120>
   0xffff800081b01184 <+104>:	eb00027f	cmp	x19, x0
   0xffff800081b01188 <+108>:	54000060	b.eq	0xffff800081b01194 <gather_bootmem_prealloc+120>  // b.none

3307				prep_and_add_allocated_folios(prev_h, &folio_list);
   0xffff800081b0118c <+112>:	910023e1	add	x1, sp, #0x8
   0xffff800081b01190 <+116>:	979e5a34	bl	0xffff800080297a60 <prep_and_add_allocated_folios>
   0xffff800081b01194 <+120>:	8b1802e8	add	x8, x23, x24
   0xffff800081b01198 <+124>:	d34cfd08	lsr	x8, x8, #12
   0xffff800081b0119c <+128>:	8b081b34	add	x20, x25, x8, lsl #6

./include/linux/atomic/atomic-arch-fallback.h:
444		return arch_atomic_read(v);
   0xffff800081b011a0 <+132>:	b9403688	ldr	w8, [x20, #52]

mm/hugetlb.c:
3311			WARN_ON(folio_ref_count(folio) != 1);
   0xffff800081b011a4 <+136>:	7100051f	cmp	w8, #0x1
   0xffff800081b011a8 <+140>:	54000581	b.ne	0xffff800081b01258 <gather_bootmem_prealloc+316>  // b.any

3312	
3313			hugetlb_folio_init_vmemmap(folio, h,
   0xffff800081b011ac <+144>:	aa1403e0	mov	x0, x20
   0xffff800081b011b0 <+148>:	aa1303e1	mov	x1, x19
   0xffff800081b011b4 <+152>:	940001a9	bl	0xffff800081b01858 <hugetlb_folio_init_vmemmap>

./arch/arm64/include/asm/alternative-macros.h:
232		asm_volatile_goto(
   0xffff800081b011b8 <+156>:	1400002a	b	0xffff800081b01260 <gather_bootmem_prealloc+324>

./arch/arm64/include/asm/atomic_lse.h:
132	ATOMIC64_OP(or, stset)
   0xffff800081b011bc <+160>:	91010288	add	x8, x20, #0x40
   0xffff800081b011c0 <+164>:	f83a311f	stset	x26, [x8]

mm/hugetlb.c:
1969		INIT_LIST_HEAD(&folio->lru);
   0xffff800081b011c4 <+168>:	9100229c	add	x28, x20, #0x8

./include/linux/list.h:
37		WRITE_ONCE(list->next, list);
   0xffff800081b011c8 <+172>:	f900069c	str	x28, [x20, #8]

38		WRITE_ONCE(list->prev, list);
   0xffff800081b011cc <+176>:	f9000a9c	str	x28, [x20, #16]

./include/linux/hugetlb.h:
753		folio->_hugetlb_subpool = subpool;
   0xffff800081b011d0 <+180>:	f9004a9f	str	xzr, [x20, #144]

./include/asm-generic/bitops/generic-non-atomic.h:
128		return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
   0xffff800081b011d4 <+184>:	f9400288	ldr	x8, [x20]

./include/linux/mm.h:
1070		if (!folio_test_large(folio))
   0xffff800081b011d8 <+188>:	363000a8	tbz	w8, #6, 0xffff800081b011ec <gather_bootmem_prealloc+208>

./include/linux/hugetlb_cgroup.h:
94		if (folio_order(folio) < HUGETLB_CGROUP_MIN_ORDER)
   0xffff800081b011dc <+192>:	39410288	ldrb	w8, [x20, #64]
   0xffff800081b011e0 <+196>:	721f191f	tst	w8, #0xfe
   0xffff800081b011e4 <+200>:	54000040	b.eq	0xffff800081b011ec <gather_bootmem_prealloc+208>  // b.none

98		else
99			folio->_hugetlb_cgroup = h_cg;
   0xffff800081b011e8 <+204>:	f9004e9f	str	xzr, [x20, #152]

./include/asm-generic/bitops/generic-non-atomic.h:
128		return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
   0xffff800081b011ec <+208>:	f9400288	ldr	x8, [x20]

./include/linux/mm.h:
1070		if (!folio_test_large(folio))
   0xffff800081b011f0 <+212>:	363000a8	tbz	w8, #6, 0xffff800081b01204 <gather_bootmem_prealloc+232>

./include/linux/hugetlb_cgroup.h:
94		if (folio_order(folio) < HUGETLB_CGROUP_MIN_ORDER)
   0xffff800081b011f4 <+216>:	39410288	ldrb	w8, [x20, #64]
   0xffff800081b011f8 <+220>:	721f191f	tst	w8, #0xfe
   0xffff800081b011fc <+224>:	54000040	b.eq	0xffff800081b01204 <gather_bootmem_prealloc+232>  // b.none

95			return;
96		if (rsvd)
97			folio->_hugetlb_cgroup_rsvd = h_cg;
   0xffff800081b01200 <+228>:	f900529f	str	xzr, [x20, #160]

./include/asm-generic/bitops/generic-non-atomic.h:
128		return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
   0xffff800081b01204 <+232>:	f9401688	ldr	x8, [x20, #40]

mm/hugetlb.c:
3317			if (!HPageVmemmapOptimized(&folio->page))
   0xffff800081b01208 <+236>:	362001c8	tbz	w8, #4, 0xffff800081b01240 <gather_bootmem_prealloc+292>

./include/linux/list.h:
169		__list_add(new, head, head->next);
   0xffff800081b0120c <+240>:	f94007e8	ldr	x8, [sp, #8]

mm/hugetlb.c:
3328			adjust_managed_page_count(page, pages_per_huge_page(h));
   0xffff800081b01210 <+244>:	aa1403e0	mov	x0, x20

./include/linux/list.h:
153		next->prev = new;
   0xffff800081b01214 <+248>:	f900051c	str	x28, [x8, #8]

154		new->next = next;
   0xffff800081b01218 <+252>:	a900da88	stp	x8, x22, [x20, #8]

155		new->prev = prev;
156		WRITE_ONCE(prev->next, new);
   0xffff800081b0121c <+256>:	f90007fc	str	x28, [sp, #8]

./include/linux/hugetlb.h:
808		return 1 << h->order;
   0xffff800081b01220 <+260>:	b9402a68	ldr	w8, [x19, #40]
   0xffff800081b01224 <+264>:	9ac82361	lsl	x1, x27, x8

mm/hugetlb.c:
3328			adjust_managed_page_count(page, pages_per_huge_page(h));
   0xffff800081b01228 <+268>:	979de2fd	bl	0xffff800080279e1c <adjust_managed_page_count>

3297		list_for_each_entry(m, &huge_boot_pages, list) {
   0xffff800081b0122c <+272>:	f94002f7	ldr	x23, [x23]
   0xffff800081b01230 <+276>:	aa1303e0	mov	x0, x19
   0xffff800081b01234 <+280>:	eb1502ff	cmp	x23, x21
   0xffff800081b01238 <+284>:	54fffa21	b.ne	0xffff800081b0117c <gather_bootmem_prealloc+96>  // b.any
   0xffff800081b0123c <+288>:	14000012	b	0xffff800081b01284 <gather_bootmem_prealloc+360>

./include/linux/hugetlb.h:
808		return 1 << h->order;
   0xffff800081b01240 <+292>:	b9402a68	ldr	w8, [x19, #40]

mm/hugetlb.c:
3318				hugetlb_folio_init_tail_vmemmap(folio,
   0xffff800081b01244 <+296>:	aa1403e0	mov	x0, x20
   0xffff800081b01248 <+300>:	52800801	mov	w1, #0x40                  	// #64

./include/linux/hugetlb.h:
808		return 1 << h->order;
   0xffff800081b0124c <+304>:	9ac82362	lsl	x2, x27, x8

mm/hugetlb.c:
3318				hugetlb_folio_init_tail_vmemmap(folio,
   0xffff800081b01250 <+308>:	940001b5	bl	0xffff800081b01924 <hugetlb_folio_init_tail_vmemmap>
   0xffff800081b01254 <+312>:	17ffffee	b	0xffff800081b0120c <gather_bootmem_prealloc+240>
   0xffff800081b01258 <+316>:	d4210000	brk	#0x800
   0xffff800081b0125c <+320>:	17ffffd4	b	0xffff800081b011ac <gather_bootmem_prealloc+144>

./arch/arm64/include/asm/atomic_ll_sc.h:
203	ATOMIC64_OPS(or, orr, L)
   0xffff800081b01260 <+324>:	d503249f	bti	j
   0xffff800081b01264 <+328>:	91010288	add	x8, x20, #0x40
   0xffff800081b01268 <+332>:	f9800111	prfm	pstl1strm, [x8]
   0xffff800081b0126c <+336>:	c85f7d09	ldxr	x9, [x8]
   0xffff800081b01270 <+340>:	b2780129	orr	x9, x9, #0x100
   0xffff800081b01274 <+344>:	c80a7d09	stxr	w10, x9, [x8]
   0xffff800081b01278 <+348>:	35ffffaa	cbnz	w10, 0xffff800081b0126c <gather_bootmem_prealloc+336>
   0xffff800081b0127c <+352>:	17ffffd2	b	0xffff800081b011c4 <gather_bootmem_prealloc+168>
   0xffff800081b01280 <+356>:	aa1f03f3	mov	x19, xzr

mm/hugetlb.c:
3332		prep_and_add_allocated_folios(h, &folio_list);
   0xffff800081b01284 <+360>:	910023e1	add	x1, sp, #0x8
   0xffff800081b01288 <+364>:	aa1303e0	mov	x0, x19
   0xffff800081b0128c <+368>:	979e59f5	bl	0xffff800080297a60 <prep_and_add_allocated_folios>
   0xffff800081b01290 <+372>:	d5384108	mrs	x8, sp_el0
   0xffff800081b01294 <+376>:	f9423d08	ldr	x8, [x8, #1144]
   0xffff800081b01298 <+380>:	f85f83a9	ldur	x9, [x29, #-8]
   0xffff800081b0129c <+384>:	eb09011f	cmp	x8, x9
   0xffff800081b012a0 <+388>:	54000141	b.ne	0xffff800081b012c8 <gather_bootmem_prealloc+428>  // b.any

3333	}
   0xffff800081b012a4 <+392>:	a9474ff4	ldp	x20, x19, [sp, #112]
   0xffff800081b012a8 <+396>:	a94657f6	ldp	x22, x21, [sp, #96]
   0xffff800081b012ac <+400>:	a9455ff8	ldp	x24, x23, [sp, #80]
   0xffff800081b012b0 <+404>:	a94467fa	ldp	x26, x25, [sp, #64]
   0xffff800081b012b4 <+408>:	a9436ffc	ldp	x28, x27, [sp, #48]
   0xffff800081b012b8 <+412>:	a9427bfd	ldp	x29, x30, [sp, #32]
   0xffff800081b012bc <+416>:	910203ff	add	sp, sp, #0x80
   0xffff800081b012c0 <+420>:	d50323bf	autiasp
   0xffff800081b012c4 <+424>:	d65f03c0	ret
   0xffff800081b012c8 <+428>:	97d5f73b	bl	0xffff80008107efb4 <__stack_chk_fail>
End of assembler dump.

  reply	other threads:[~2023-10-13  0:12 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-09 14:56 [PATCH] mm: hugetlb: Only prep and add allocated folios for non-gigantic pages Usama Arif
2023-10-10  1:23 ` Mike Kravetz
2023-10-10 17:01   ` [External] " Usama Arif
2023-10-10 21:30     ` Mike Kravetz
2023-10-10 21:31       ` Konrad Dybcio
2023-10-12  0:03   ` Nathan Chancellor
2023-10-12 14:53     ` Mike Kravetz
2023-10-13  0:12       ` Mike Kravetz [this message]
2023-10-14  0:04         ` Mike Kravetz
2023-10-18 20:54           ` Nick Desaulniers
2023-10-18 22:20             ` Mike Kravetz
2023-10-19  4:33               ` Sergey Senozhatsky
2023-10-19 14:20                 ` Nathan Chancellor
2023-10-19  2:38 ` Mike Kravetz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231013001203.GA3812@monkey \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=fam.zheng@bytedance.com \
    --cc=konrad.dybcio@linaro.org \
    --cc=liangma@liangbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=llvm@lists.linux.dev \
    --cc=muchun.song@linux.dev \
    --cc=nathan@kernel.org \
    --cc=punit.agrawal@bytedance.com \
    --cc=songmuchun@bytedance.com \
    --cc=usama.arif@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.