Re: [PATCH v2] powerpc/mm: Update default hugetlb size early

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>,
	linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au,
	Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH v2] powerpc/mm: Update default hugetlb size early
Date: Fri, 11 Feb 2022 20:10:51 +0530
Message-ID: <87r189a2ks.fsf@linux.ibm.com>
In-Reply-To: <87tud5a8x5.fsf@linux.ibm.com>

Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> writes:

> David Hildenbrand <david@redhat.com> writes:
>
>> On 11.02.22 10:16, Aneesh Kumar K V wrote:
>>> On 2/11/22 14:00, David Hildenbrand wrote:
>>>> On 11.02.22 07:52, Aneesh Kumar K.V wrote:
>>>>> commit: d9c234005227 ("Do not depend on MAX_ORDER when grouping pages by mobility")
....
....

> I could build a kernel with FORCE_MAX_ZONEORDER=8 and pageblock_order =
> 8. We need to disable THP for such a kernel to boot, because THP
> requires HPAGE_PMD_ORDER < MAX_ORDER. I was able to boot that kernel on a
> virtualized platform, but gigantic_page_runtime_supported() returns
> false for such a config with hash translation.
>
> On non virtualized platform I am hitting crashes like below during boot.
>
> [   47.637865][   C42] =============================================================================                                                                                                                                                                                                              
> [   47.637907][   C42] BUG pgtable-2^11 (Not tainted): Object already free                                                                                     
> [   47.637925][   C42] -----------------------------------------------------------------------------                                                           
> [   47.637925][   C42]                                                                                                                                         
> [   47.637945][   C42] Allocated in __pud_alloc+0x84/0x2a0 age=278 cpu=40 pid=1409                                                                             
> [   47.637974][   C42]  __slab_alloc.isra.0+0x40/0x60                                                                                                          
> [   47.637995][   C42]  kmem_cache_alloc+0x1a8/0x510                                                                                                           
> [   47.638010][   C42]  __pud_alloc+0x84/0x2a0                                                                                                                 
> [   47.638024][   C42]  copy_page_range+0x38c/0x1b90                                                                                                           
> [   47.638040][   C42]  dup_mm+0x548/0x880                                                                                                                     
> [   47.638058][   C42]  copy_process+0xdc0/0x1e90                                                                                                              
> [   47.638076][   C42]  kernel_clone+0xd4/0x9d0                                                                                                                
> [   47.638094][   C42]  __do_sys_clone+0x88/0xe0                                                                                                               
> [   47.638112][   C42]  system_call_exception+0x368/0x3a0                                                                                                      
> [   47.638128][   C42]  system_call_common+0xec/0x250                                                                                                          
> [   47.638147][   C42] Freed in __tlb_remove_table+0x1d4/0x200 age=263 cpu=57 pid=326                                                                          
> [   47.638172][   C42]  kmem_cache_free+0x44c/0x680                                                                                                            
> [   47.638187][   C42]  __tlb_remove_table+0x1d4/0x200                                                                                                         
> [   47.638204][   C42]  tlb_remove_table_rcu+0x54/0xa0                                                                                                         
> [   47.638222][   C42]  rcu_core+0xdd4/0x15d0                                                                                                                  
> [   47.638239][   C42]  __do_softirq+0x360/0x69c                                                                                                               
> [   47.638257][   C42]  run_ksoftirqd+0x54/0xc0                                                                                                                
> [   47.638273][   C42]  smpboot_thread_fn+0x28c/0x2f0                                                                                                          
> [   47.638290][   C42]  kthread+0x1a4/0x1b0                                                                                                                    
> [   47.638305][   C42]  ret_from_kernel_thread+0x5c/0x64                                                                                                       
> [   47.638320][   C42] Slab 0xc00c00000000d600 objects=10 used=9 fp=0xc0000000035a8000 flags=0x7ffff000010201(locked|slab|head|node=0|zone=0|lastcpupid=0x7ffff)                                                                                                                                                              
> [   47.638352][   C42] Object 0xc0000000035a8000 @offset=163840 fp=0x0000000000000000                                                                          
> [   47.638352][   C42]                                                                                                                                         
> [   47.638373][   C42] Redzone  c0000000035a4000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638394][   C42] Redzone  c0000000035a4010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638414][   C42] Redzone  c0000000035a4020: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638435][   C42] Redzone  c0000000035a4030: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638455][   C42] Redzone  c0000000035a4040: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638474][   C42] Redzone  c0000000035a4050: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638494][   C42] Redzone  c0000000035a4060: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638514][   C42] Redzone  c0000000035a4070: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638534][   C42] Redzone  c0000000035a4080: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            

OK, that turned out to be unrelated; I was using the wrong kernel. I can
boot a kernel with pageblock_order > MAX_ORDER - 1 and run the
hugetlb-related tests fine. I do get the warning below, which you had
already called out in your patch.

[    3.952124] WARNING: CPU: 16 PID: 719 at mm/vmstat.c:1103 __fragmentation_index+0x14/0x70                                                                   
[    3.952136] Modules linked in:                                                                                                                              
[    3.952141] CPU: 16 PID: 719 Comm: kswapd0 Tainted: G    B             5.17.0-rc3-00044-g69052ffa0e08 #68                                                   
[    3.952149] NIP:  c000000000465264 LR: c000000000468544 CTR: 0000000000000000                                                                               
[    3.952154] REGS: c000000014a4f7e0 TRAP: 0700   Tainted: G    B              (5.17.0-rc3-00044-g69052ffa0e08)
[    3.952161] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44042422  XER: 20000000
[    3.952174] CFAR: c000000000468540 IRQMASK: 0                  
               GPR00: c000000000468544 c000000014a4fa80 c000000001ea9500 0000000000000008 
               GPR04: c000000014a4faa0 00000000001fd700 0000000000004003 00000000001fd92d 
               GPR08: c000001fffd1c7a0 0000000000000008 0000000000000008 0000000000000000 
               GPR12: 0000000000002200 c000001fffff2880 0000000000000000 c000000013cfd240                                                                      
               GPR16: c000000011940600 c000001fffd21058 0000000000000d00 c000000001407d30                                                                      
               GPR20: ffffffffffffffaf c000001fffd21098 0000000000000000 c000000002ab7328                                                                      
               GPR24: c000000011940600 c000001fffd21300 0000000000000000 0000000000000008 
               GPR28: c000001fffd1c280 0000000000000008 0000000000000000 0000000000000004                                                                      
[    3.952231] NIP [c000000000465264] __fragmentation_index+0x14/0x70                                                                                          
[    3.952237] LR [c000000000468544] fragmentation_index+0xb4/0xe0                                                                                             
[    3.952244] Call Trace:                                        
[    3.952247] [c000000014a4fa80] [c00000000023e248] lock_release+0x138/0x470 (unreliable)
[    3.952256] [c000000014a4fac0] [c00000000047cd84] compaction_suitable+0x94/0x270
[    3.952263] [c000000014a4fb10] [c0000000004802b8] wakeup_kcompactd+0xc8/0x2a0
[    3.952270] [c000000014a4fb60] [c000000000457568] balance_pgdat+0x798/0x8d0
[    3.952277] [c000000014a4fca0] [c000000000457d14] kswapd+0x674/0x7b0                                                                                        
[    3.952283] [c000000014a4fdc0] [c0000000001d7e84] kthread+0x144/0x150                                                                                       
[    3.952290] [c000000014a4fe10] [c00000000000cd74] ret_from_kernel_thread+0x5c/0x64
[    3.952297] Instruction dump:                                      
[    3.952301] 7d2021ad 40c2fff4 e8ed0030 38a00000 7caa39ae 4e800020 60000000 7c0802a6 
[    3.952311] 60000000 28030007 7c6a1b78 40810010 <0fe00000> 60000000 60000000 e9040008 
[    3.952322] irq event stamp: 0                                        
[    3.952325] hardirqs last  enabled at (0): [<0000000000000000>] 0x0                                                                                         
[    3.952331] hardirqs last disabled at (0): [<c000000000196030>] copy_process+0x970/0x1de0                                                                   
[    3.952339] softirqs last  enabled at (0): [<c000000000196030>] copy_process+0x970/0x1de0                                                                   
[    3.952345] softirqs last disabled at (0): [<0000000000000000>] 0x0                                                                                         

I am not sure there is any value in allowing MAX_ORDER = 8 on ppc64.
If not, we could apply a patch like the one below for ppc64.

commit 09ed79c4fda92418914546f36c2750670503d7a0
Author: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Date:   Fri Feb 11 17:15:10 2022 +0530

    powerpc/mm: Disable MAX_ORDER value 8 on book3s64 with 64K pagesize
    
    With transparent hugepage support we expect HPAGE_PMD_ORDER < MAX_ORDER.
    Without this we BUG() during boot as below
    
    cpu 0x6: Vector: 700 (Program Check) at [c000000012143880]
        pc: c000000001b4ddbc: hugepage_init+0x108/0x2c4
        lr: c000000001b4dd98: hugepage_init+0xe4/0x2c4
        sp: c000000012143b20
       msr: 8000000002029033
      current = 0xc0000000120d0f80
      paca    = 0xc00000001ec7e900   irqmask: 0x03   irq_happened: 0x01
        pid   = 1, comm = swapper/0
    kernel BUG at mm/huge_memory.c:413!
    [c000000012143b20] c0000000022c0468 blacklisted_initcalls+0x120/0x1c8 (unreliable)
    [c000000012143bb0] c000000000012104 do_one_initcall+0x94/0x520
    [c000000012143c90] c000000001b04da0 kernel_init_freeable+0x444/0x508
    [c000000012143da0] c000000000012d8c kernel_init+0x44/0x188
    [c000000012143e10] c00000000000cbf4 ret_from_kernel_thread+0x5c/0x64
    
    Hence a FORCE_MAX_ZONEORDER value < 9 doesn't make sense with THP
    enabled. We also cannot have a value > 9 because we are limited by
    SECTION_SIZE_BITS:
    
     #if (MAX_ORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS
     #error Allocator MAX_ORDER exceeds SECTION_SIZE
     #endif
    
    We can select MAX_ORDER value 8 by disabling THP support but then that
    results in pageblock_order > MAX_ORDER - 1 which is not fully tested/supported.
    
    Cc: David Hildenbrand <david@redhat.com>
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b779603978e1..a050f5f46df3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -807,7 +807,7 @@ config DATA_SHIFT
 
 config FORCE_MAX_ZONEORDER
 	int "Maximum zone order"
-	range 8 9 if PPC64 && PPC_64K_PAGES
+	range 9 9 if PPC64 && PPC_64K_PAGES
 	default "9" if PPC64 && PPC_64K_PAGES
 	range 13 13 if PPC64 && !PPC_64K_PAGES
 	default "13" if PPC64 && !PPC_64K_PAGES
