* [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR
@ 2023-06-20 12:11 Sachin Sant
2023-06-20 19:34 ` Yu Zhao
2023-06-21 3:52 ` Michael Ellerman
0 siblings, 2 replies; 5+ messages in thread
From: Sachin Sant @ 2023-06-20 12:11 UTC (permalink / raw)
To: linux-mm; +Cc: linuxppc-dev
6.4.0-rc7-next-20230620 fails to boot on an IBM Power LPAR with the following oops:
[ 5.548368] BUG: Unable to handle kernel data access at 0x95bdcf954bc34e73
[ 5.548380] Faulting instruction address: 0xc000000000548090
[ 5.548384] Oops: Kernel access of bad area, sig: 11 [#1]
[ 5.548387] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 5.548391] Modules linked in: nf_tables(E) nfnetlink(E) sunrpc(E) binfmt_misc(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
[ 5.548413] CPU: 1 PID: 789 Comm: systemd-udevd Tainted: G E 6.4.0-rc7-next-20230620 #1
[ 5.548417] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
[ 5.548421] NIP: c000000000548090 LR: c000000000547fbc CTR: c0000000004206f0
[ 5.548424] REGS: c0000000afb536f0 TRAP: 0380 Tainted: G E (6.4.0-rc7-next-20230620)
[ 5.548427] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88028202 XER: 20040000
[ 5.548436] CFAR: c000000000547fc4 IRQMASK: 0
[ 5.548436] GPR00: c000000000547fbc c0000000afb53990 c0000000014b1600 0000000000000000
[ 5.548436] GPR04: 0000000000000cc0 00000000000034d8 0000000000000e6f ed5e02cab43c21e0
[ 5.548436] GPR08: 0000000000000e6e 0000000000000058 0000001356ea0000 0000000000002000
[ 5.548436] GPR12: c0000000004206f0 c0000013fffff300 0000000000000000 0000000000000000
[ 5.548436] GPR16: 0000000000000000 0000000000000000 0000000000000000 c000000092f43708
[ 5.548436] GPR20: c000000092f436b0 0000000000000000 fffffffffff7dfff c0000000afa80000
[ 5.548436] GPR24: c000000002b87aa0 00000000000000b8 c000000000159914 0000000000000cc0
[ 5.548436] GPR28: 95bdcf954bc34e1b c00000000a1fafc0 0000000000000000 c000000003019800
[ 5.548473] NIP [c000000000548090] kmem_cache_alloc+0x1a0/0x420
[ 5.548480] LR [c000000000547fbc] kmem_cache_alloc+0xcc/0x420
[ 5.548485] Call Trace:
[ 5.548487] [c0000000afb53990] [c000000000547fbc] kmem_cache_alloc+0xcc/0x420 (unreliable)
[ 5.548493] [c0000000afb53a00] [c000000000159914] vm_area_dup+0x44/0xf0
[ 5.548499] [c0000000afb53a40] [c00000000015a638] dup_mmap+0x298/0x8b0
[ 5.548504] [c0000000afb53bb0] [c00000000015acd0] dup_mm.constprop.0+0x80/0x180
[ 5.548509] [c0000000afb53bf0] [c00000000015bdc0] copy_process+0xc00/0x1510
[ 5.548514] [c0000000afb53cb0] [c00000000015c848] kernel_clone+0xb8/0x5a0
[ 5.548519] [c0000000afb53d30] [c00000000015ceb8] __do_sys_clone+0x88/0xd0
[ 5.548524] [c0000000afb53e10] [c000000000033bcc] system_call_exception+0x13c/0x340
[ 5.548529] [c0000000afb53e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[ 5.548534] --- interrupt: 3000 at 0x7fff87f0c178
[ 5.548538] NIP: 00007fff87f0c178 LR: 0000000000000000 CTR: 0000000000000000
[ 5.548540] REGS: c0000000afb53e80 TRAP: 3000 Tainted: G E (6.4.0-rc7-next-20230620)
[ 5.548544] MSR: 800000000000f033 <SF,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44004204 XER: 00000000
[ 5.548552] IRQMASK: 0
[ 5.548552] GPR00: 0000000000000078 00007ffffde8cb80 00007fff88637500 0000000001200011
[ 5.548552] GPR04: 0000000000000000 0000000000000000 0000000000000000 00007fff888bd490
[ 5.548552] GPR08: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
[ 5.548552] GPR12: 0000000000000000 00007fff888c4c00 0000000000000002 00007ffffde95698
[ 5.548552] GPR16: 00007ffffde95690 00007ffffde95688 00007ffffde956a0 0000000000000028
[ 5.548552] GPR20: 0000000132bca308 0000000000000001 0000000000000001 0000000000000315
[ 5.548552] GPR24: 0000000000000003 0000000000000040 0000000000000000 0000000000000003
[ 5.548552] GPR28: 0000000000000000 0000000000000000 00007ffffde8cf24 0000000000000045
[ 5.548586] NIP [00007fff87f0c178] 0x7fff87f0c178
[ 5.548589] LR [0000000000000000] 0x0
[ 5.548591] --- interrupt: 3000
[ 5.548593] Code: e93f0000 7ce95214 e9070008 7f89502a e9270010 2e3c0000 41920258 2c290000 41820250 813f0028 e8ff00b8 38c80001 <7fdc482a> 7d3c4a14 79250022 552ac03e
[ 5.548605] ---[ end trace 0000000000000000 ]---
[ 5.550849] pstore: backend (nvram) writing error (-1)
[ 5.550852]
Starting Network Manager...
[ 5.566384] BUG: Bad rss-counter state mm:00000000dc60f1c1 type:MM_ANONPAGES val:36
[ 5.568784] BUG: Bad rss-counter state mm:000000008eb9341b type:MM_ANONPAGES val:36
[ 5.689774] BUG: Bad rss-counter state mm:00000000edbda345 type:MM_ANONPAGES val:36
[ 5.692187] BUG: Bad rss-counter state mm:000000003f7ec21f type:MM_ANONPAGES val:36
[ 5.705947] BUG: Bad rss-counter state mm:00000000cdbb7cfd type:MM_ANONPAGES val:36
[ 6.550855] Kernel panic - not syncing: Fatal exception
[ 6.568226] Rebooting in 10 seconds..
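As a cross-check on the oops above, the faulting instruction word (the one in angle brackets on the Code: line) can be decoded by hand. The sketch below assumes the standard Power ISA X-form field layout; `decode_xform` is an illustrative helper and is not part of the original report. The register values are copied from the GPR dump.

```python
# Illustrative helper (not from the thread): split a 32-bit Power ISA
# X-form instruction word into its fields.
def decode_xform(word):
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode
        "rt": (word >> 21) & 0x1F,      # target register
        "ra": (word >> 16) & 0x1F,      # base register
        "rb": (word >> 11) & 0x1F,      # index register
        "xo": (word >> 1) & 0x3FF,      # extended opcode
    }

# The word marked <7fdc482a> in the Code: line of the oops.
fields = decode_xform(0x7FDC482A)

# Register values copied from the GPR dump (GPR09 and GPR28).
gpr = {9: 0x58, 28: 0x95BDCF954BC34E1B}

# Primary opcode 31 with extended opcode 21 is ldx (load doubleword indexed).
is_ldx = fields["opcode"] == 31 and fields["xo"] == 21

# ldx loads from RA + RB (RA is nonzero here, so no special case applies).
effective_address = (gpr[fields["ra"]] + gpr[fields["rb"]]) & 0xFFFFFFFFFFFFFFFF

print(fields)
print(is_ldx, hex(effective_address))
```

The computed address equals the data access address reported in the first BUG line, which suggests that r28 already held a bogus pointer when kmem_cache_alloc() dereferenced it. This only narrows down where the bad value was used; it does not by itself identify what corrupted it.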
The problem was introduced in 6.4.0-rc7-next-20230619. I tried git bisect, but I am unsure of
the result it reported. Bisect points to the following merge commit:
# git bisect bad
70c94cc2eefd4f98d222834cbe7512804977c2d4 is the first bad commit
commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
Merge: 48f5ee5c48c3 3fe08f7d5e80
Author: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Tue Jun 20 09:43:25 2023 +1000
Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
# Conflicts:
# mm/mmap.c
git bisect start
# status: waiting for both good and bad commits
# bad: [9dbf40840551df336c95ce2a3adbdd25ed53c0ef] Add linux-next specific files for 20230620
git bisect bad 9dbf40840551df336c95ce2a3adbdd25ed53c0ef
# status: waiting for good commit(s), bad commit known
# good: [45a3e24f65e90a047bef86f927ebdc4c710edaa1] Linux 6.4-rc7
git bisect good 45a3e24f65e90a047bef86f927ebdc4c710edaa1
# bad: [175cde0dcc05c0905adeb55dff5ac49da96552b3] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect bad 175cde0dcc05c0905adeb55dff5ac49da96552b3
# bad: [d16e40b24a7d258d166fbfe46f0f565a21204df7] Merge branch 'xtensa-for-next' of git://github.com/jcmvbkbc/linux-xtensa.git
git bisect bad d16e40b24a7d258d166fbfe46f0f565a21204df7
# bad: [2be5f21481bf5606654520c19bd016090522f5d4] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl.git
git bisect bad 2be5f21481bf5606654520c19bd016090522f5d4
# bad: [1dfd9944d721bef26f49d00220ce86efeb77711d] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git
git bisect bad 1dfd9944d721bef26f49d00220ce86efeb77711d
# good: [34fd86722257374f73bb6da13a60cc19b0344e99] mm: zswap: remove shrink from zpool interface
git bisect good 34fd86722257374f73bb6da13a60cc19b0344e99
# good: [48f5ee5c48c342bd82fa04eefc8a41048a6165fc] Merge branch 'mm-nonmm-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect good 48f5ee5c48c342bd82fa04eefc8a41048a6165fc
# good: [dfd058ab9bef3f6590fb349ae1a2dfa7fc3ee50e] mm/gup: do not return 0 from pin_user_pages_fast() for bad args
git bisect good dfd058ab9bef3f6590fb349ae1a2dfa7fc3ee50e
# good: [ec336aa83162fe0f3d554baed2d4e2589b69ec6e] scripts/mksysmap: Fix badly escaped '$'
git bisect good ec336aa83162fe0f3d554baed2d4e2589b69ec6e
# good: [b08e8297596bb6f80351dc50fc1b8c2250d3a318] modpost: show offset from symbol for section mismatch warnings
git bisect good b08e8297596bb6f80351dc50fc1b8c2250d3a318
# good: [14b17c0b28bbd853c43d1a815019091497b5b436] watchdog/hardlockup: sort hardlockup detector related config values a logical way
git bisect good 14b17c0b28bbd853c43d1a815019091497b5b436
# good: [1e5db612cc70f3137aa48978b267afff17eb222d] watchdog/hardlockup: define HARDLOCKUP_DETECTOR_ARCH
git bisect good 1e5db612cc70f3137aa48978b267afff17eb222d
# good: [3fe08f7d5e80b3f822673b70fcc6be8dbee58f76] Merge branch 'mm-nonmm-unstable' into mm-everything
git bisect good 3fe08f7d5e80b3f822673b70fcc6be8dbee58f76
# good: [9ac40f75debfcb20c93de71b434ae73add1f692d] linux/export.h: rename 'sec' argument to 'license'
git bisect good 9ac40f75debfcb20c93de71b434ae73add1f692d
# bad: [70c94cc2eefd4f98d222834cbe7512804977c2d4] Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect bad 70c94cc2eefd4f98d222834cbe7512804977c2d4
# first bad commit: [70c94cc2eefd4f98d222834cbe7512804977c2d4] Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
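The log above can also be checked mechanically: if both parents of the reported merge were marked good and the merge itself was marked bad, the bisect result is at least self-consistent. The sketch below is a minimal check under that assumption; the helper names `verdicts` and `verdict_for` are hypothetical, and only the three relevant verdict lines are copied from the log (subjects trimmed).

```python
import re

# Verdict lines copied from the bisect log above (subjects trimmed).
BISECT_LOG = """\
# good: [48f5ee5c48c342bd82fa04eefc8a41048a6165fc] Merge branch 'mm-nonmm-stable'
# good: [3fe08f7d5e80b3f822673b70fcc6be8dbee58f76] Merge branch 'mm-nonmm-unstable'
# bad: [70c94cc2eefd4f98d222834cbe7512804977c2d4] Merge branch 'mm-everything'
"""

# Taken from the "commit" and "Merge:" lines of the first bad commit.
MERGE = "70c94cc2eefd4f98d222834cbe7512804977c2d4"
PARENTS = ["48f5ee5c48c3", "3fe08f7d5e80"]


def verdicts(log):
    """Map each full commit hash in a bisect log to 'good' or 'bad'."""
    pattern = re.compile(r"^# (good|bad): \[([0-9a-f]{40})\]", re.M)
    return {sha: verdict for verdict, sha in pattern.findall(log)}


def verdict_for(prefix, table):
    """Look up a verdict by abbreviated hash; None if it was never tested."""
    matches = [v for sha, v in table.items() if sha.startswith(prefix)]
    return matches[0] if matches else None


table = verdicts(BISECT_LOG)
parent_verdicts = [verdict_for(p, table) for p in PARENTS]

# Both parents good and the merge bad points at the merge resolution itself.
merge_is_culprit = table.get(MERGE) == "bad" and all(
    v == "good" for v in parent_verdicts
)
print(parent_verdicts, merge_is_culprit)
```

Here both parents were tested good, so the log is consistent with the conflict resolution in the merge being the source of the regression rather than a mis-step during bisection.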
- Sachin
* Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR
2023-06-20 12:11 [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR Sachin Sant
@ 2023-06-20 19:34 ` Yu Zhao
2023-06-21 3:52 ` Michael Ellerman
1 sibling, 0 replies; 5+ messages in thread
From: Yu Zhao @ 2023-06-20 19:34 UTC (permalink / raw)
To: Sachin Sant; +Cc: linux-mm, linuxppc-dev
On Tue, Jun 20, 2023 at 05:41:57PM +0530, Sachin Sant wrote:
> 6.4.0-rc7-next-20230620 fails to boot on IBM Power LPAR with following
Sorry for hijacking this thread -- I've been seeing another crash on
PowerNV since -rc1, but I haven't had the time to bisect. Just FYI.
[ 0.814500] BUG: Unable to handle kernel data access on read at 0xc00a0000000003f9
[ 0.814814] Faulting instruction address: 0xc000000000c77324
[ 0.814988] Oops: Kernel access of bad area, sig: 11 [#1]
[ 0.815185] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[ 0.815487] Modules linked in:
[ 0.815653] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 6.4.0-rc7 #1
[ 0.815980] Hardware name: ZAIUS_FX_10 POWER9 (raw) 0x4e1202 opal:custom PowerNV
[ 0.816293] NIP: c000000000c77324 LR: c000000000c7c2c8 CTR: c000000000c77270
[ 0.816525] REGS: c00020000a416de0 TRAP: 0300 Not tainted (6.4.0-rc7)
[ 0.816778] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24008280 XER: 20040000
[ 0.817113] CFAR: c000000000c772c8 DAR: c00a0000000003f9 DSISR: 40000000 IRQMASK: 1
[ 0.817113] GPR00: c000000000c7c2c8 c00020000a417080 c000000001deea00 00000000000003f9
[ 0.817113] GPR04: 0000000000000001 0000000000000000 c0000000027f3b28 c0000000016e0610
[ 0.817113] GPR08: 0000000000000000 c000000002b9db10 c000000002b903e0 0000000084008282
[ 0.817113] GPR12: 0000000000000000 c000003ffffdba00 c0000000000128c8 0000000000000000
[ 0.817113] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
[ 0.817113] GPR20: c0000000027903f0 c0000000020034a4 c0000000029db000 0000000000000000
[ 0.817113] GPR24: c0000000029dad70 c0000000029dad50 0000000000000000 0000000000000001
[ 0.817113] GPR28: c000000002d57378 0000000000000000 c000000002d47378 c00a0000000003f9
[ 0.820185] NIP [c000000000c77324] io_serial_in+0xb4/0x130
[ 0.820383] LR [c000000000c7c2c8] serial8250_config_port+0x4b8/0x1680
[ 0.820610] Call Trace:
[ 0.820733] [c00020000a417080] [c000000000c768c8] serial8250_request_std_resource+0x88/0x200 (unreliable)
[ 0.821221] [c00020000a4170c0] [c000000000c7be50] serial8250_config_port+0x40/0x1680
[ 0.821623] [c00020000a417190] [c000000000c733ec] univ8250_config_port+0x11c/0x1e0
[ 0.821956] [c00020000a4171f0] [c000000000c71824] uart_add_one_port+0x244/0x750
[ 0.822244] [c00020000a417310] [c000000000c73958] serial8250_register_8250_port+0x3b8/0x780
[ 0.822504] [c00020000a4173c0] [c000000000c7411c] serial8250_probe+0x14c/0x1e0
[ 0.822833] [c00020000a417760] [c000000000cd87e8] platform_probe+0x98/0x1b0
[ 0.823157] [c00020000a4177e0] [c000000000cd2a50] really_probe+0x130/0x5b0
[ 0.823517] [c00020000a417870] [c000000000cd2f94] __driver_probe_device+0xc4/0x240
[ 0.823827] [c00020000a4178f0] [c000000000cd3164] driver_probe_device+0x54/0x180
[ 0.824096] [c00020000a417930] [c000000000cd3618] __driver_attach+0x168/0x300
[ 0.824330] [c00020000a4179b0] [c000000000ccf468] bus_for_each_dev+0xa8/0x130
[ 0.824650] [c00020000a417a10] [c000000000cd1ef4] driver_attach+0x34/0x50
[ 0.825094] [c00020000a417a30] [c000000000cd112c] bus_add_driver+0x16c/0x310
[ 0.825445] [c00020000a417ac0] [c000000000cd55d4] driver_register+0xa4/0x1b0
[ 0.825787] [c00020000a417b30] [c000000000cd7548] __platform_driver_register+0x38/0x50
[ 0.826037] [c00020000a417b50] [c00000000206be38] serial8250_init+0x1f8/0x270
[ 0.826267] [c00020000a417c00] [c000000000012260] do_one_initcall+0x60/0x300
[ 0.826529] [c00020000a417ce0] [c0000000020052c4] kernel_init_freeable+0x3c0/0x484
[ 0.826883] [c00020000a417de0] [c0000000000128f4] kernel_init+0x34/0x1e0
[ 0.827187] [c00020000a417e50] [c00000000000d014] ret_from_kernel_user_thread+0x14/0x1c
[ 0.827595] --- interrupt: 0 at 0x0
[ 0.827722] NIP: 0000000000000000 LR: 0000000000000000 CTR: 0000000000000000
[ 0.827951] REGS: c00020000a417e80 TRAP: 0000 Not tainted (6.4.0-rc7)
[ 0.828162] MSR: 0000000000000000 <> CR: 00000000 XER: 00000000
[ 0.828394] CFAR: 0000000000000000 IRQMASK: 0
[ 0.828394] GPR00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR12: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.831329] NIP [0000000000000000] 0x0
[ 0.831445] LR [0000000000000000] 0x0
[ 0.831554] --- interrupt: 0
[ 0.831664] Code: 7bc30020 eba1ffe8 ebc1fff0 ebe1fff8 4e800020 60000000 60420000 3d2200db 3929f110 ebe90000 7fe3fa14 7c0004ac <8bdf0000> 0c1e0000 4c00012c 57de063e
[ 0.832263] ---[ end trace 0000000000000000 ]---
* Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR
2023-06-20 12:11 [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR Sachin Sant
2023-06-20 19:34 ` Yu Zhao
@ 2023-06-21 3:52 ` Michael Ellerman
2023-06-22 8:01 ` Sachin Sant
1 sibling, 1 reply; 5+ messages in thread
From: Michael Ellerman @ 2023-06-21 3:52 UTC (permalink / raw)
To: Sachin Sant, linux-mm; +Cc: linuxppc-dev
Sachin Sant <sachinp@linux.ibm.com> writes:
> 6.4.0-rc7-next-20230620 fails to boot on IBM Power LPAR with following
>
> [ 5.548368] BUG: Unable to handle kernel data access at 0x95bdcf954bc34e73
> [ 5.548380] Faulting instruction address: 0xc000000000548090
> [ 5.548384] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 5.548387] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [ 5.548391] Modules linked in: nf_tables(E) nfnetlink(E) sunrpc(E) binfmt_misc(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
> [ 5.548413] CPU: 1 PID: 789 Comm: systemd-udevd Tainted: G E 6.4.0-rc7-next-20230620 #1
> [ 5.548417] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [ 5.548421] NIP: c000000000548090 LR: c000000000547fbc CTR: c0000000004206f0
> [ 5.548424] REGS: c0000000afb536f0 TRAP: 0380 Tainted: G E (6.4.0-rc7-next-20230620)
> [ 5.548427] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88028202 XER: 20040000
> [ 5.548436] CFAR: c000000000547fc4 IRQMASK: 0
> [ 5.548436] GPR00: c000000000547fbc c0000000afb53990 c0000000014b1600 0000000000000000
> [ 5.548436] GPR04: 0000000000000cc0 00000000000034d8 0000000000000e6f ed5e02cab43c21e0
> [ 5.548436] GPR08: 0000000000000e6e 0000000000000058 0000001356ea0000 0000000000002000
> [ 5.548436] GPR12: c0000000004206f0 c0000013fffff300 0000000000000000 0000000000000000
> [ 5.548436] GPR16: 0000000000000000 0000000000000000 0000000000000000 c000000092f43708
> [ 5.548436] GPR20: c000000092f436b0 0000000000000000 fffffffffff7dfff c0000000afa80000
> [ 5.548436] GPR24: c000000002b87aa0 00000000000000b8 c000000000159914 0000000000000cc0
> [ 5.548436] GPR28: 95bdcf954bc34e1b c00000000a1fafc0 0000000000000000 c000000003019800
> [ 5.548473] NIP [c000000000548090] kmem_cache_alloc+0x1a0/0x420
> [ 5.548480] LR [c000000000547fbc] kmem_cache_alloc+0xcc/0x420
> [ 5.548485] Call Trace:
> [ 5.548487] [c0000000afb53990] [c000000000547fbc] kmem_cache_alloc+0xcc/0x420 (unreliable)
> [ 5.548493] [c0000000afb53a00] [c000000000159914] vm_area_dup+0x44/0xf0
> [ 5.548499] [c0000000afb53a40] [c00000000015a638] dup_mmap+0x298/0x8b0
> [ 5.548504] [c0000000afb53bb0] [c00000000015acd0] dup_mm.constprop.0+0x80/0x180
> [ 5.548509] [c0000000afb53bf0] [c00000000015bdc0] copy_process+0xc00/0x1510
> [ 5.548514] [c0000000afb53cb0] [c00000000015c848] kernel_clone+0xb8/0x5a0
> [ 5.548519] [c0000000afb53d30] [c00000000015ceb8] __do_sys_clone+0x88/0xd0
> [ 5.548524] [c0000000afb53e10] [c000000000033bcc] system_call_exception+0x13c/0x340
> [ 5.548529] [c0000000afb53e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
> [ 5.548534] --- interrupt: 3000 at 0x7fff87f0c178
> [ 5.548538] NIP: 00007fff87f0c178 LR: 0000000000000000 CTR: 0000000000000000
> [ 5.548540] REGS: c0000000afb53e80 TRAP: 3000 Tainted: G E (6.4.0-rc7-next-20230620)
> [ 5.548544] MSR: 800000000000f033 <SF,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44004204 XER: 00000000
> [ 5.548552] IRQMASK: 0
> [ 5.548552] GPR00: 0000000000000078 00007ffffde8cb80 00007fff88637500 0000000001200011
> [ 5.548552] GPR04: 0000000000000000 0000000000000000 0000000000000000 00007fff888bd490
> [ 5.548552] GPR08: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
> [ 5.548552] GPR12: 0000000000000000 00007fff888c4c00 0000000000000002 00007ffffde95698
> [ 5.548552] GPR16: 00007ffffde95690 00007ffffde95688 00007ffffde956a0 0000000000000028
> [ 5.548552] GPR20: 0000000132bca308 0000000000000001 0000000000000001 0000000000000315
> [ 5.548552] GPR24: 0000000000000003 0000000000000040 0000000000000000 0000000000000003
> [ 5.548552] GPR28: 0000000000000000 0000000000000000 00007ffffde8cf24 0000000000000045
> [ 5.548586] NIP [00007fff87f0c178] 0x7fff87f0c178
> [ 5.548589] LR [0000000000000000] 0x0
> [ 5.548591] --- interrupt: 3000
> [ 5.548593] Code: e93f0000 7ce95214 e9070008 7f89502a e9270010 2e3c0000 41920258 2c290000 41820250 813f0028 e8ff00b8 38c80001 <7fdc482a> 7d3c4a14 79250022 552ac03e
> [ 5.548605] ---[ end trace 0000000000000000 ]---
> [ 5.550849] pstore: backend (nvram) writing error (-1)
> [ 5.550852]
> Starting Network Manager...
> [ 5.566384] BUG: Bad rss-counter state mm:00000000dc60f1c1 type:MM_ANONPAGES val:36
> [ 5.568784] BUG: Bad rss-counter state mm:000000008eb9341b type:MM_ANONPAGES val:36
> [ 5.689774] BUG: Bad rss-counter state mm:00000000edbda345 type:MM_ANONPAGES val:36
> [ 5.692187] BUG: Bad rss-counter state mm:000000003f7ec21f type:MM_ANONPAGES val:36
> [ 5.705947] BUG: Bad rss-counter state mm:00000000cdbb7cfd type:MM_ANONPAGES val:36
> [ 6.550855] Kernel panic - not syncing: Fatal exception
> [ 6.568226] Rebooting in 10 seconds..
>
> The problem was introduced in 6.4.0-rc7-next-20230619. I tried git bisect, but unsure of the
> result reported by it. Bisect points to following patch
>
> # git bisect bad
> 70c94cc2eefd4f98d222834cbe7512804977c2d4 is the first bad commit
> commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
> Merge: 48f5ee5c48c3 3fe08f7d5e80
> Author: Stephen Rothwell <sfr@canb.auug.org.au>
> Date: Tue Jun 20 09:43:25 2023 +1000
>
> Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> # Conflicts:
> # mm/mmap.c
Usually bisect pointing to a merge means something has gone wrong with
the bisect. It's not impossible for a merge to be the cause of a bug,
but IME it's rare.
In this case though the merge itself has a reasonably large diff, so
it's more likely that the merge itself has introduced a bug.
commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
Merge: 48f5ee5c48c3 3fe08f7d5e80
Author: Stephen Rothwell <sfr@canb.auug.org.au>
AuthorDate: Tue Jun 20 09:43:25 2023 +1000
Commit: Stephen Rothwell <sfr@canb.auug.org.au>
CommitDate: Tue Jun 20 09:43:25 2023 +1000
Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
# Conflicts:
# mm/mmap.c
diff --cc mm/mmap.c
index 98cda6f72605,474a0d856622..9a93b054148a
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@@ -2398,15 -2409,27 +2396,29 @@@ do_vmi_align_munmap(struct vma_iterato
if (error)
goto end_split_failed;
}
- mas_set(&mas_detach, count);
- error = munmap_sidetree(next, &mas_detach);
- if (error)
- goto munmap_sidetree_failed;
+ vma_start_write(next);
- mas_set_range(&mas_detach, next->vm_start, next->vm_end - 1);
+ if (mas_store_gfp(&mas_detach, next, GFP_KERNEL))
+ goto munmap_gather_failed;
+ vma_mark_detached(next, true);
+ if (next->vm_flags & VM_LOCKED)
+ locked_vm += vma_pages(next);
count++;
+ if (unlikely(uf)) {
+ /*
+ * If userfaultfd_unmap_prep returns an error the vmas
+ * will remain split, but userland will get a
+ * highly unexpected error anyway. This is no
+ * different than the case where the first of the two
+ * __split_vma fails, but we don't undo the first
+ * split, despite we could. This is unlikely enough
+ * failure that it's not worth optimizing it for.
+ */
+ error = userfaultfd_unmap_prep(next, start, end, uf);
+
+ if (error)
+ goto userfaultfd_error;
+ }
#ifdef CONFIG_DEBUG_VM_MAPLE_TREE
BUG_ON(next->vm_start < start);
BUG_ON(next->vm_start > end);
@@@ -2454,14 -2455,18 +2444,20 @@@
BUG_ON(count != test_count);
}
#endif
- /* Point of no return */
+ error = -ENOMEM;
- vma_iter_set(vmi, start);
+ while (vma_iter_addr(vmi) > start)
+ vma_iter_prev_range(vmi);
+
if (vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL))
- return -ENOMEM;
+ goto clear_tree_failed;
+ mm->locked_vm -= locked_vm;
mm->map_count -= count;
+ prev = vma_iter_prev_range(vmi);
+ next = vma_next(vmi);
+ if (next)
+ vma_iter_prev_range(vmi);
+
/*
* Do not downgrade mmap_lock if we are next to VM_GROWSDOWN or
* VM_GROWSUP VMA. Such VMAs can change their size under
cheers
* Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR
2023-06-21 3:52 ` Michael Ellerman
@ 2023-06-22 8:01 ` Sachin Sant
2023-06-22 12:54 ` Bad linux-next merge? (was Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR) Michael Ellerman
0 siblings, 1 reply; 5+ messages in thread
From: Sachin Sant @ 2023-06-22 8:01 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linux-mm, linuxppc-dev
>> The problem was introduced in 6.4.0-rc7-next-20230619. I tried git bisect, but unsure of the
>> result reported by it. Bisect points to following patch
>>
>> # git bisect bad
>> 70c94cc2eefd4f98d222834cbe7512804977c2d4 is the first bad commit
>> commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
>> Merge: 48f5ee5c48c3 3fe08f7d5e80
>> Author: Stephen Rothwell <sfr@canb.auug.org.au>
>> Date: Tue Jun 20 09:43:25 2023 +1000
>>
>> Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>> # Conflicts:
>> # mm/mmap.c
>
> Usually bisect pointing to a merge means something has gone wrong with
> the bisect. It's not impossible for a merge to be the cause of a bug,
> but IME it's rare.
>
I have run the bisect 3 times and the result was the same each time. It always
points to this merge commit.
Is there anything else I can try to help debug this issue?
-Sachin
* Bad linux-next merge? (was Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR)
2023-06-22 8:01 ` Sachin Sant
@ 2023-06-22 12:54 ` Michael Ellerman
0 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2023-06-22 12:54 UTC (permalink / raw)
To: Sachin Sant; +Cc: linux-mm, linuxppc-dev, Stephen Rothwell
Sachin Sant <sachinp@linux.ibm.com> writes:
>>> The problem was introduced in 6.4.0-rc7-next-20230619. I tried git bisect, but unsure of the
>>> result reported by it. Bisect points to following patch
>>>
>>> # git bisect bad
>>> 70c94cc2eefd4f98d222834cbe7512804977c2d4 is the first bad commit
>>> commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
>>> Merge: 48f5ee5c48c3 3fe08f7d5e80
>>> Author: Stephen Rothwell <sfr@canb.auug.org.au>
>>> Date: Tue Jun 20 09:43:25 2023 +1000
>>>
>>> Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>> # Conflicts:
>>> # mm/mmap.c
>>
>> Usually bisect pointing to a merge means something has gone wrong with
>> the bisect. It's not impossible for a merge to be the cause of a bug,
>> but IME it's rare.
>
> I have tried the bisect 3 times and the result was same. It always
> points to this merge commit.
>
> Is there anything else I can try to help debug this issue?
Looks like it has already been reported, debugged and fixed over here:
https://lore.kernel.org/linux-next/20230619204309.GA13937@willie-the-truck/
So hopefully it will be resolved in today's or tomorrow's linux-next.
cheers