From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <1441189221.7208.19.camel@neuling.org>
Subject: Re: [RESEND] Soft lockup after huge page offline attempt
From: Michael Neuling
To: Anshuman Khandual
Cc: Linux PPC, "Aneesh Kumar K.V", Michael Ellerman, Nishanth Aravamudan
Date: Wed, 02 Sep 2015 20:20:21 +1000
In-Reply-To: <55E6CA60.6030502@linux.vnet.ibm.com>
References: <55E6CA60.6030502@linux.vnet.ibm.com>
List-Id: Linux on PowerPC Developers Mail List

On Wed, 2015-09-02 at 15:37 +0530, Anshuman Khandual wrote:
> Hello,
>
> I have been working on a modified version of the following selftest
> in the kernel source:
>
>     tools/testing/selftests/vm/hugepage-mmap.c

Please send the test case and how to run it here.

Mikey

> The modified test tries to perform the following sequence of
> events, in this order:
>
> (1) Allocate 16 huge TLB pages (16MB each) through mmap()
>
> (2) mlock() the entire 256MB memory range
>
> (3) Fetch the PFN from /proc/self/pagemap for the starting address
>     of each huge TLB page in the 256MB range
>
> (4) Convert each PFN into a physical address as PFN << PAGE_SHIFT,
>     which is then written into /sys/devices/system/memory/soft_offline_page
>
> (5) munmap() the 256MB memory range
>
> (6) unlink() the file (inside the hugetlbfs mount path) used for the mmap()
>
> The soft offline page attempts fail as expected, since we don't support
> that on POWER yet.
>
> [ 145.359405] soft offline: 0xba100: migration failed 1, type 43ffff000004008
> [ 145.359572] soft offline: 0xba200: migration failed 1, type 43ffff000004008
> [ 145.359644] soft offline: 0xba300: migration failed 1, type 43ffff000004008
> [ 145.359716] soft offline: 0xba400: migration failed 1, type 43ffff000004008
> [ 145.359783] soft offline: 0xba500: migration failed 1, type 43ffff000004008
> [ 145.359853] soft offline: 0xba600: migration failed 1, type 43ffff000004008
> [ 145.359931] soft offline: 0xba700: migration failed 1, type 43ffff000004008
> [ 145.360008] soft offline: 0xba800: migration failed 1, type 43ffff000004008
> [ 145.360080] soft offline: 0xba900: migration failed 1, type 43ffff000004008
> [ 145.360148] soft offline: 0xbaa00: migration failed 1, type 43ffff000004008
> [ 145.360218] soft offline: 0xbab00: migration failed 1, type 43ffff000004008
> [ 145.360286] soft offline: 0xbac00: migration failed 1, type 43ffff000004008
> [ 145.360353] soft offline: 0xbad00: migration failed 1, type 43ffff000004008
> [ 145.360420] soft offline: 0xbae00: migration failed 1, type 43ffff000004008
> [ 145.360485] soft offline: 0xbaf00: migration failed 1, type 43ffff000004008
> [ 145.360552] soft offline: 0xbb000: migration failed 1, type 43ffff000004008
>
> But interestingly, the test thread hangs for good after that. We can see this
> soft lockup message in the console:
>
> [ 168.038353] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [hugepage-mmap:1238]
> [ 168.038384] Modules linked in: windfarm_smu_sat i2c_core pseries_rng windfarm_pid xfs libcrc32c ibmvscsi scsi_transport_srp ibmveth
> [ 168.038411] CPU: 3 PID: 1238 Comm: hugepage-mmap Not tainted 4.2.0+ #7
> [ 168.038416] task: c0000000ffa304b0 ti: c000000beff8c000 task.ti: c000000beff8c000
> [ 168.038422] NIP: c000000000228164 LR: c00000000022820c CTR: c0000000003ff060
> [ 168.038428] REGS: c000000beff8f8c0 TRAP: 0901   Not tainted  (4.2.0+)
> [ 168.038432] MSR: 8000000000009033 CR: 22000228  XER: 00000000
> [ 168.038446] CFAR: c000000000228180 SOFTE: 1
> GPR00: 0000000000000000 c000000beff8fb40 c000000001306000 f000000002e84000
> GPR04: c000000beff8fb60 0000000000000220 c000000bf98b5248 0000000000000000
> GPR08: 0000000000000044 0000000000000000 f000000002e8401c 0000000000000000
> GPR12: 0000000000000000 c00000000ea51c80
> [ 168.038483] NIP [c000000000228164] find_get_pages+0xa4/0x1e0
> [ 168.038489] LR [c00000000022820c] find_get_pages+0x14c/0x1e0
> [ 168.038493] Call Trace:
> [ 168.038496] [c000000beff8fb40] [c000000beff8fbc0] 0xc000000beff8fbc0 (unreliable)
> [ 168.038503] [c000000beff8fbc0] [c00000000023ba20] pagevec_lookup+0x50/0x80
> [ 168.038510] [c000000beff8fc00] [c0000000003fe630] truncate_hugepages+0xb0/0x280
> [ 168.038517] [c000000beff8fd10] [c0000000003ff090] hugetlbfs_evict_inode+0x30/0x70
> [ 168.038524] [c000000beff8fd40] [c0000000002f1d34] evict+0xf4/0x290
> [ 168.038531] [c000000beff8fd80] [c0000000002e3268] do_unlinkat+0x208/0x370
> [ 168.038538] [c000000beff8fe30] [c000000000009360] system_call+0x38/0xd0
> [ 168.038543] Instruction dump:
> [ 168.038546] 419e00c8 e87f0000 2fa30000 419e0080 786907a1 408200d8 813b000c 552902ee
> [ 168.038556] 0b090000 3943001c 7c2004ac 7d205028 <2c090000> 41c20014 31090001 7d00512d
>
> After a while, we see these messages as well on the console before it freezes.
>
> INFO: rcu_sched self-detected stall on CPU
> [ 205.368370] 3: (5999 ticks this GP) idle=b23/140000000000001/0 softirq=1329/1329 fqs=6000
> [ 205.368375] (t=6000 jiffies g=896 c=895 q=678)
> [ 205.368383] Task dump for CPU 3:
> [ 205.368387] hugepage-mmap   R  running task       0  1238   1233 0x00042004
> [ 205.368394] Call Trace:
> [ 205.368401] [c000000beff8f3e0] [c000000000106660] sched_show_task+0xe0/0x180 (unreliable)
> [ 205.368409] [c000000beff8f450] [c00000000014d9e4] rcu_dump_cpu_stacks+0xe4/0x150
> [ 205.368416] [c000000beff8f4a0] [c0000000001529c4] rcu_check_callbacks+0x6b4/0x9b0
> [ 205.368424] [c000000beff8f5d0] [c000000000159ee4] update_process_times+0x54/0xa0
> [ 205.368431] [c000000beff8f600] [c000000000170b48] tick_sched_handle.isra.6+0x48/0xe0
> [ 205.368437] [c000000beff8f640] [c000000000170c44] tick_sched_timer+0x64/0xd0
> [ 205.368444] [c000000beff8f680] [c00000000015a954] __hrtimer_run_queues+0x124/0x3c0
> [ 205.368450] [c000000beff8f710] [c00000000015b88c] hrtimer_interrupt+0xec/0x2c0
> [ 205.368457] [c000000beff8f7d0] [c00000000001f77c] __timer_interrupt+0x8c/0x230
> [ 205.368464] [c000000beff8f820] [c00000000001fe30] timer_interrupt+0xa0/0xe0
> [ 205.368470] [c000000beff8f850] [c000000000002768] decrementer_common+0x168/0x180
> [ 205.368478] --- interrupt: 901 at find_get_pages+0xa0/0x1e0
> [ 205.368478]     LR = find_get_pages+0x14c/0x1e0
> [ 205.368486] [c000000beff8fb40] [c000000beff8fbc0] 0xc000000beff8fbc0 (unreliable)
> [ 205.368493] [c000000beff8fbc0] [c00000000023ba20] pagevec_lookup+0x50/0x80
> [ 205.368500] [c000000beff8fc00] [c0000000003fe630] truncate_hugepages+0xb0/0x280
> [ 205.368507] [c000000beff8fd10] [c0000000003ff090] hugetlbfs_evict_inode+0x30/0x70
> [ 205.368513] [c000000beff8fd40] [c0000000002f1d34] evict+0xf4/0x290
> [ 205.368520] [c000000beff8fd80] [c0000000002e3268] do_unlinkat+0x208/0x370
> [ 205.368526] [c000000beff8fe30] [c000000000009360] system_call+0x38/0xd0
>
>
> Then the entire system hangs for good, requiring a reboot. The problem can be
> recreated without the unlink() step if we just try to unmount the hugetlbfs
> afterwards. I am still looking into this problem. Please let me know if you
> have any suggestions, thoughts, or comments about this issue.
>
> Regards
> Anshuman
>