* [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR
From: Sachin Sant @ 2023-06-20 12:11 UTC
To: linux-mm; +Cc: linuxppc-dev
6.4.0-rc7-next-20230620 fails to boot on an IBM Power LPAR with the following oops:
[ 5.548368] BUG: Unable to handle kernel data access at 0x95bdcf954bc34e73
[ 5.548380] Faulting instruction address: 0xc000000000548090
[ 5.548384] Oops: Kernel access of bad area, sig: 11 [#1]
[ 5.548387] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 5.548391] Modules linked in: nf_tables(E) nfnetlink(E) sunrpc(E) binfmt_misc(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
[ 5.548413] CPU: 1 PID: 789 Comm: systemd-udevd Tainted: G E 6.4.0-rc7-next-20230620 #1
[ 5.548417] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
[ 5.548421] NIP: c000000000548090 LR: c000000000547fbc CTR: c0000000004206f0
[ 5.548424] REGS: c0000000afb536f0 TRAP: 0380 Tainted: G E (6.4.0-rc7-next-20230620)
[ 5.548427] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88028202 XER: 20040000
[ 5.548436] CFAR: c000000000547fc4 IRQMASK: 0
[ 5.548436] GPR00: c000000000547fbc c0000000afb53990 c0000000014b1600 0000000000000000
[ 5.548436] GPR04: 0000000000000cc0 00000000000034d8 0000000000000e6f ed5e02cab43c21e0
[ 5.548436] GPR08: 0000000000000e6e 0000000000000058 0000001356ea0000 0000000000002000
[ 5.548436] GPR12: c0000000004206f0 c0000013fffff300 0000000000000000 0000000000000000
[ 5.548436] GPR16: 0000000000000000 0000000000000000 0000000000000000 c000000092f43708
[ 5.548436] GPR20: c000000092f436b0 0000000000000000 fffffffffff7dfff c0000000afa80000
[ 5.548436] GPR24: c000000002b87aa0 00000000000000b8 c000000000159914 0000000000000cc0
[ 5.548436] GPR28: 95bdcf954bc34e1b c00000000a1fafc0 0000000000000000 c000000003019800
[ 5.548473] NIP [c000000000548090] kmem_cache_alloc+0x1a0/0x420
[ 5.548480] LR [c000000000547fbc] kmem_cache_alloc+0xcc/0x420
[ 5.548485] Call Trace:
[ 5.548487] [c0000000afb53990] [c000000000547fbc] kmem_cache_alloc+0xcc/0x420 (unreliable)
[ 5.548493] [c0000000afb53a00] [c000000000159914] vm_area_dup+0x44/0xf0
[ 5.548499] [c0000000afb53a40] [c00000000015a638] dup_mmap+0x298/0x8b0
[ 5.548504] [c0000000afb53bb0] [c00000000015acd0] dup_mm.constprop.0+0x80/0x180
[ 5.548509] [c0000000afb53bf0] [c00000000015bdc0] copy_process+0xc00/0x1510
[ 5.548514] [c0000000afb53cb0] [c00000000015c848] kernel_clone+0xb8/0x5a0
[ 5.548519] [c0000000afb53d30] [c00000000015ceb8] __do_sys_clone+0x88/0xd0
[ 5.548524] [c0000000afb53e10] [c000000000033bcc] system_call_exception+0x13c/0x340
[ 5.548529] [c0000000afb53e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[ 5.548534] --- interrupt: 3000 at 0x7fff87f0c178
[ 5.548538] NIP: 00007fff87f0c178 LR: 0000000000000000 CTR: 0000000000000000
[ 5.548540] REGS: c0000000afb53e80 TRAP: 3000 Tainted: G E (6.4.0-rc7-next-20230620)
[ 5.548544] MSR: 800000000000f033 <SF,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44004204 XER: 00000000
[ 5.548552] IRQMASK: 0
[ 5.548552] GPR00: 0000000000000078 00007ffffde8cb80 00007fff88637500 0000000001200011
[ 5.548552] GPR04: 0000000000000000 0000000000000000 0000000000000000 00007fff888bd490
[ 5.548552] GPR08: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
[ 5.548552] GPR12: 0000000000000000 00007fff888c4c00 0000000000000002 00007ffffde95698
[ 5.548552] GPR16: 00007ffffde95690 00007ffffde95688 00007ffffde956a0 0000000000000028
[ 5.548552] GPR20: 0000000132bca308 0000000000000001 0000000000000001 0000000000000315
[ 5.548552] GPR24: 0000000000000003 0000000000000040 0000000000000000 0000000000000003
[ 5.548552] GPR28: 0000000000000000 0000000000000000 00007ffffde8cf24 0000000000000045
[ 5.548586] NIP [00007fff87f0c178] 0x7fff87f0c178
[ 5.548589] LR [0000000000000000] 0x0
[ 5.548591] --- interrupt: 3000
[ 5.548593] Code: e93f0000 7ce95214 e9070008 7f89502a e9270010 2e3c0000 41920258 2c290000 41820250 813f0028 e8ff00b8 38c80001 <7fdc482a> 7d3c4a14 79250022 552ac03e
[ 5.548605] ---[ end trace 0000000000000000 ]---
[ 5.550849] pstore: backend (nvram) writing error (-1)
[ 5.550852]
Starting Network Manager...
[ 5.566384] BUG: Bad rss-counter state mm:00000000dc60f1c1 type:MM_ANONPAGES val:36
[ 5.568784] BUG: Bad rss-counter state mm:000000008eb9341b type:MM_ANONPAGES val:36
[ 5.689774] BUG: Bad rss-counter state mm:00000000edbda345 type:MM_ANONPAGES val:36
[ 5.692187] BUG: Bad rss-counter state mm:000000003f7ec21f type:MM_ANONPAGES val:36
[ 5.705947] BUG: Bad rss-counter state mm:00000000cdbb7cfd type:MM_ANONPAGES val:36
[ 6.550855] Kernel panic - not syncing: Fatal exception
[ 6.568226] Rebooting in 10 seconds..
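One way to read the oops above, offered as a sketch rather than a diagnosis: the word marked `<7fdc482a>` in the Code: line is the faulting instruction. Decoded as a Power ISA X-form instruction it is `ldx r30,r28,r9`, and GPR28 + GPR09 from the register dump reproduces the faulting address exactly. The helper name `decode_x_form` is hypothetical; the field layout and the opcode pair 31/21 for `ldx` are taken from the Power ISA, and the register values are copied from the dump.

```python
# Values copied from the oops above.
INSN = 0x7fdc482a                     # instruction marked <...> in "Code:"
GPR = {9: 0x0000000000000058, 28: 0x95bdcf954bc34e1b}
FAULT_ADDR = 0x95bdcf954bc34e73       # "Unable to handle kernel data access at"


def decode_x_form(insn):
    """Split a 32-bit Power ISA X-form instruction into its fields."""
    return {
        "opcode": (insn >> 26) & 0x3f,  # primary opcode
        "rt": (insn >> 21) & 0x1f,      # target register
        "ra": (insn >> 16) & 0x1f,      # base register
        "rb": (insn >> 11) & 0x1f,      # index register
        "xo": (insn >> 1) & 0x3ff,      # extended opcode
    }


fields = decode_x_form(INSN)

# Primary opcode 31 with extended opcode 21 is ldx (Load Doubleword Indexed).
is_ldx = fields["opcode"] == 31 and fields["xo"] == 21

# For ldx with RA != 0 the effective address is (RA) + (RB), modulo 2**64.
effective_addr = (GPR[fields["ra"]] + GPR[fields["rb"]]) & 0xffffffffffffffff

print(fields, is_ldx, hex(effective_addr))
```

The computed address equals the reported fault address, so the bad value is the pointer in GPR28, not the 0x58 offset in GPR09. That is consistent with kmem_cache_alloc following a corrupted pointer, but it does not by itself say what corrupted it.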
The problem was introduced in 6.4.0-rc7-next-20230619. I tried git bisect, but I am unsure of the
result it reported. Bisect points to the following merge commit:
# git bisect bad
70c94cc2eefd4f98d222834cbe7512804977c2d4 is the first bad commit
commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
Merge: 48f5ee5c48c3 3fe08f7d5e80
Author: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Tue Jun 20 09:43:25 2023 +1000
Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
# Conflicts:
# mm/mmap.c
git bisect start
# status: waiting for both good and bad commits
# bad: [9dbf40840551df336c95ce2a3adbdd25ed53c0ef] Add linux-next specific files for 20230620
git bisect bad 9dbf40840551df336c95ce2a3adbdd25ed53c0ef
# status: waiting for good commit(s), bad commit known
# good: [45a3e24f65e90a047bef86f927ebdc4c710edaa1] Linux 6.4-rc7
git bisect good 45a3e24f65e90a047bef86f927ebdc4c710edaa1
# bad: [175cde0dcc05c0905adeb55dff5ac49da96552b3] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect bad 175cde0dcc05c0905adeb55dff5ac49da96552b3
# bad: [d16e40b24a7d258d166fbfe46f0f565a21204df7] Merge branch 'xtensa-for-next' of git://github.com/jcmvbkbc/linux-xtensa.git
git bisect bad d16e40b24a7d258d166fbfe46f0f565a21204df7
# bad: [2be5f21481bf5606654520c19bd016090522f5d4] Merge branch 'for-next' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl.git
git bisect bad 2be5f21481bf5606654520c19bd016090522f5d4
# bad: [1dfd9944d721bef26f49d00220ce86efeb77711d] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git
git bisect bad 1dfd9944d721bef26f49d00220ce86efeb77711d
# good: [34fd86722257374f73bb6da13a60cc19b0344e99] mm: zswap: remove shrink from zpool interface
git bisect good 34fd86722257374f73bb6da13a60cc19b0344e99
# good: [48f5ee5c48c342bd82fa04eefc8a41048a6165fc] Merge branch 'mm-nonmm-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect good 48f5ee5c48c342bd82fa04eefc8a41048a6165fc
# good: [dfd058ab9bef3f6590fb349ae1a2dfa7fc3ee50e] mm/gup: do not return 0 from pin_user_pages_fast() for bad args
git bisect good dfd058ab9bef3f6590fb349ae1a2dfa7fc3ee50e
# good: [ec336aa83162fe0f3d554baed2d4e2589b69ec6e] scripts/mksysmap: Fix badly escaped '$'
git bisect good ec336aa83162fe0f3d554baed2d4e2589b69ec6e
# good: [b08e8297596bb6f80351dc50fc1b8c2250d3a318] modpost: show offset from symbol for section mismatch warnings
git bisect good b08e8297596bb6f80351dc50fc1b8c2250d3a318
# good: [14b17c0b28bbd853c43d1a815019091497b5b436] watchdog/hardlockup: sort hardlockup detector related config values a logical way
git bisect good 14b17c0b28bbd853c43d1a815019091497b5b436
# good: [1e5db612cc70f3137aa48978b267afff17eb222d] watchdog/hardlockup: define HARDLOCKUP_DETECTOR_ARCH
git bisect good 1e5db612cc70f3137aa48978b267afff17eb222d
# good: [3fe08f7d5e80b3f822673b70fcc6be8dbee58f76] Merge branch 'mm-nonmm-unstable' into mm-everything
git bisect good 3fe08f7d5e80b3f822673b70fcc6be8dbee58f76
# good: [9ac40f75debfcb20c93de71b434ae73add1f692d] linux/export.h: rename 'sec' argument to 'license'
git bisect good 9ac40f75debfcb20c93de71b434ae73add1f692d
# bad: [70c94cc2eefd4f98d222834cbe7512804977c2d4] Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect bad 70c94cc2eefd4f98d222834cbe7512804977c2d4
# first bad commit: [70c94cc2eefd4f98d222834cbe7512804977c2d4] Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
- Sachin
* Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR
From: Yu Zhao @ 2023-06-20 19:34 UTC
To: Sachin Sant; +Cc: linux-mm, linuxppc-dev
On Tue, Jun 20, 2023 at 05:41:57PM +0530, Sachin Sant wrote:
> 6.4.0-rc7-next-20230620 fails to boot on an IBM Power LPAR with the following oops:
Sorry for hijacking this thread -- I've been seeing another crash on
PowerNV since -rc1, but I haven't had time to bisect it. Just FYI.
[ 0.814500] BUG: Unable to handle kernel data access on read at 0xc00a0000000003f9
[ 0.814814] Faulting instruction address: 0xc000000000c77324
[ 0.814988] Oops: Kernel access of bad area, sig: 11 [#1]
[ 0.815185] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[ 0.815487] Modules linked in:
[ 0.815653] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 6.4.0-rc7 #1
[ 0.815980] Hardware name: ZAIUS_FX_10 POWER9 (raw) 0x4e1202 opal:custom PowerNV
[ 0.816293] NIP: c000000000c77324 LR: c000000000c7c2c8 CTR: c000000000c77270
[ 0.816525] REGS: c00020000a416de0 TRAP: 0300 Not tainted (6.4.0-rc7)
[ 0.816778] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24008280 XER: 20040000
[ 0.817113] CFAR: c000000000c772c8 DAR: c00a0000000003f9 DSISR: 40000000 IRQMASK: 1
[ 0.817113] GPR00: c000000000c7c2c8 c00020000a417080 c000000001deea00 00000000000003f9
[ 0.817113] GPR04: 0000000000000001 0000000000000000 c0000000027f3b28 c0000000016e0610
[ 0.817113] GPR08: 0000000000000000 c000000002b9db10 c000000002b903e0 0000000084008282
[ 0.817113] GPR12: 0000000000000000 c000003ffffdba00 c0000000000128c8 0000000000000000
[ 0.817113] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
[ 0.817113] GPR20: c0000000027903f0 c0000000020034a4 c0000000029db000 0000000000000000
[ 0.817113] GPR24: c0000000029dad70 c0000000029dad50 0000000000000000 0000000000000001
[ 0.817113] GPR28: c000000002d57378 0000000000000000 c000000002d47378 c00a0000000003f9
[ 0.820185] NIP [c000000000c77324] io_serial_in+0xb4/0x130
[ 0.820383] LR [c000000000c7c2c8] serial8250_config_port+0x4b8/0x1680
[ 0.820610] Call Trace:
[ 0.820733] [c00020000a417080] [c000000000c768c8] serial8250_request_std_resource+0x88/0x200 (unreliable)
[ 0.821221] [c00020000a4170c0] [c000000000c7be50] serial8250_config_port+0x40/0x1680
[ 0.821623] [c00020000a417190] [c000000000c733ec] univ8250_config_port+0x11c/0x1e0
[ 0.821956] [c00020000a4171f0] [c000000000c71824] uart_add_one_port+0x244/0x750
[ 0.822244] [c00020000a417310] [c000000000c73958] serial8250_register_8250_port+0x3b8/0x780
[ 0.822504] [c00020000a4173c0] [c000000000c7411c] serial8250_probe+0x14c/0x1e0
[ 0.822833] [c00020000a417760] [c000000000cd87e8] platform_probe+0x98/0x1b0
[ 0.823157] [c00020000a4177e0] [c000000000cd2a50] really_probe+0x130/0x5b0
[ 0.823517] [c00020000a417870] [c000000000cd2f94] __driver_probe_device+0xc4/0x240
[ 0.823827] [c00020000a4178f0] [c000000000cd3164] driver_probe_device+0x54/0x180
[ 0.824096] [c00020000a417930] [c000000000cd3618] __driver_attach+0x168/0x300
[ 0.824330] [c00020000a4179b0] [c000000000ccf468] bus_for_each_dev+0xa8/0x130
[ 0.824650] [c00020000a417a10] [c000000000cd1ef4] driver_attach+0x34/0x50
[ 0.825094] [c00020000a417a30] [c000000000cd112c] bus_add_driver+0x16c/0x310
[ 0.825445] [c00020000a417ac0] [c000000000cd55d4] driver_register+0xa4/0x1b0
[ 0.825787] [c00020000a417b30] [c000000000cd7548] __platform_driver_register+0x38/0x50
[ 0.826037] [c00020000a417b50] [c00000000206be38] serial8250_init+0x1f8/0x270
[ 0.826267] [c00020000a417c00] [c000000000012260] do_one_initcall+0x60/0x300
[ 0.826529] [c00020000a417ce0] [c0000000020052c4] kernel_init_freeable+0x3c0/0x484
[ 0.826883] [c00020000a417de0] [c0000000000128f4] kernel_init+0x34/0x1e0
[ 0.827187] [c00020000a417e50] [c00000000000d014] ret_from_kernel_user_thread+0x14/0x1c
[ 0.827595] --- interrupt: 0 at 0x0
[ 0.827722] NIP: 0000000000000000 LR: 0000000000000000 CTR: 0000000000000000
[ 0.827951] REGS: c00020000a417e80 TRAP: 0000 Not tainted (6.4.0-rc7)
[ 0.828162] MSR: 0000000000000000 <> CR: 00000000 XER: 00000000
[ 0.828394] CFAR: 0000000000000000 IRQMASK: 0
[ 0.828394] GPR00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR12: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.828394] GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.831329] NIP [0000000000000000] 0x0
[ 0.831445] LR [0000000000000000] 0x0
[ 0.831554] --- interrupt: 0
[ 0.831664] Code: 7bc30020 eba1ffe8 ebc1fff0 ebe1fff8 4e800020 60000000 60420000 3d2200db 3929f110 ebe90000 7fe3fa14 7c0004ac <8bdf0000> 0c1e0000 4c00012c 57de063e
[ 0.832263] ---[ end trace 0000000000000000 ]---
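A similar reading of this second oops, again only a sketch: the marked word `<8bdf0000>` decodes as the D-form instruction `lbz r30,0(r31)`, and GPR31 holds exactly the DAR value. GPR03 holds 0x3f9, which looks like the I/O port number passed down from io_serial_in. Treating 0x3f8 as the port base (the conventional legacy COM1 address) is an assumption about this machine, not something the log states.

```python
# Values copied from the PowerNV oops above.
INSN = 0x8bdf0000                 # instruction marked <...> in "Code:"
GPR3 = 0x00000000000003f9
GPR31 = 0xc00a0000000003f9
DAR = 0xc00a0000000003f9          # faulting data address

opcode = (INSN >> 26) & 0x3f      # primary opcode; 34 is lbz (Load Byte and Zero)
rt = (INSN >> 21) & 0x1f          # target register
ra = (INSN >> 16) & 0x1f          # base register
disp = INSN & 0xffff              # 16-bit displacement, sign-extended below
if disp & 0x8000:
    disp -= 0x10000

# For lbz with RA != 0 the effective address is (RA) + displacement.
effective_addr = GPR31 + disp

# Split the faulting address into a base and the port number seen in GPR03.
io_base = DAR - GPR3

COM1_BASE = 0x3f8                 # assumption: conventional legacy COM1 base
reg_offset = GPR3 - COM1_BASE     # UART register index relative to that base

print(opcode, rt, ra, disp, hex(effective_addr), hex(io_base), reg_offset)
```

Under those assumptions the access is a one-byte read of UART register 1 at I/O base 0xc00a000000000000. That would point at the legacy port mapping rather than at the 8250 driver logic.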
* Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR
From: Michael Ellerman @ 2023-06-21 3:52 UTC
To: Sachin Sant, linux-mm; +Cc: linuxppc-dev
Sachin Sant <sachinp@linux.ibm.com> writes:
> 6.4.0-rc7-next-20230620 fails to boot on an IBM Power LPAR with the following oops:
>
> [ 5.548368] BUG: Unable to handle kernel data access at 0x95bdcf954bc34e73
> [ 5.548380] Faulting instruction address: 0xc000000000548090
> [ 5.548384] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 5.548387] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [ 5.548391] Modules linked in: nf_tables(E) nfnetlink(E) sunrpc(E) binfmt_misc(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
> [ 5.548413] CPU: 1 PID: 789 Comm: systemd-udevd Tainted: G E 6.4.0-rc7-next-20230620 #1
> [ 5.548417] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [ 5.548421] NIP: c000000000548090 LR: c000000000547fbc CTR: c0000000004206f0
> [ 5.548424] REGS: c0000000afb536f0 TRAP: 0380 Tainted: G E (6.4.0-rc7-next-20230620)
> [ 5.548427] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88028202 XER: 20040000
> [ 5.548436] CFAR: c000000000547fc4 IRQMASK: 0
> [ 5.548436] GPR00: c000000000547fbc c0000000afb53990 c0000000014b1600 0000000000000000
> [ 5.548436] GPR04: 0000000000000cc0 00000000000034d8 0000000000000e6f ed5e02cab43c21e0
> [ 5.548436] GPR08: 0000000000000e6e 0000000000000058 0000001356ea0000 0000000000002000
> [ 5.548436] GPR12: c0000000004206f0 c0000013fffff300 0000000000000000 0000000000000000
> [ 5.548436] GPR16: 0000000000000000 0000000000000000 0000000000000000 c000000092f43708
> [ 5.548436] GPR20: c000000092f436b0 0000000000000000 fffffffffff7dfff c0000000afa80000
> [ 5.548436] GPR24: c000000002b87aa0 00000000000000b8 c000000000159914 0000000000000cc0
> [ 5.548436] GPR28: 95bdcf954bc34e1b c00000000a1fafc0 0000000000000000 c000000003019800
> [ 5.548473] NIP [c000000000548090] kmem_cache_alloc+0x1a0/0x420
> [ 5.548480] LR [c000000000547fbc] kmem_cache_alloc+0xcc/0x420
> [ 5.548485] Call Trace:
> [ 5.548487] [c0000000afb53990] [c000000000547fbc] kmem_cache_alloc+0xcc/0x420 (unreliable)
> [ 5.548493] [c0000000afb53a00] [c000000000159914] vm_area_dup+0x44/0xf0
> [ 5.548499] [c0000000afb53a40] [c00000000015a638] dup_mmap+0x298/0x8b0
> [ 5.548504] [c0000000afb53bb0] [c00000000015acd0] dup_mm.constprop.0+0x80/0x180
> [ 5.548509] [c0000000afb53bf0] [c00000000015bdc0] copy_process+0xc00/0x1510
> [ 5.548514] [c0000000afb53cb0] [c00000000015c848] kernel_clone+0xb8/0x5a0
> [ 5.548519] [c0000000afb53d30] [c00000000015ceb8] __do_sys_clone+0x88/0xd0
> [ 5.548524] [c0000000afb53e10] [c000000000033bcc] system_call_exception+0x13c/0x340
> [ 5.548529] [c0000000afb53e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
> [ 5.548534] --- interrupt: 3000 at 0x7fff87f0c178
> [ 5.548538] NIP: 00007fff87f0c178 LR: 0000000000000000 CTR: 0000000000000000
> [ 5.548540] REGS: c0000000afb53e80 TRAP: 3000 Tainted: G E (6.4.0-rc7-next-20230620)
> [ 5.548544] MSR: 800000000000f033 <SF,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44004204 XER: 00000000
> [ 5.548552] IRQMASK: 0
> [ 5.548552] GPR00: 0000000000000078 00007ffffde8cb80 00007fff88637500 0000000001200011
> [ 5.548552] GPR04: 0000000000000000 0000000000000000 0000000000000000 00007fff888bd490
> [ 5.548552] GPR08: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
> [ 5.548552] GPR12: 0000000000000000 00007fff888c4c00 0000000000000002 00007ffffde95698
> [ 5.548552] GPR16: 00007ffffde95690 00007ffffde95688 00007ffffde956a0 0000000000000028
> [ 5.548552] GPR20: 0000000132bca308 0000000000000001 0000000000000001 0000000000000315
> [ 5.548552] GPR24: 0000000000000003 0000000000000040 0000000000000000 0000000000000003
> [ 5.548552] GPR28: 0000000000000000 0000000000000000 00007ffffde8cf24 0000000000000045
> [ 5.548586] NIP [00007fff87f0c178] 0x7fff87f0c178
> [ 5.548589] LR [0000000000000000] 0x0
> [ 5.548591] --- interrupt: 3000
> [ 5.548593] Code: e93f0000 7ce95214 e9070008 7f89502a e9270010 2e3c0000 41920258 2c290000 41820250 813f0028 e8ff00b8 38c80001 <7fdc482a> 7d3c4a14 79250022 552ac03e
> [ 5.548605] ---[ end trace 0000000000000000 ]---
> [ 5.550849] pstore: backend (nvram) writing error (-1)
> [ 5.550852]
> Starting Network Manager...
> [ 5.566384] BUG: Bad rss-counter state mm:00000000dc60f1c1 type:MM_ANONPAGES val:36
> [ 5.568784] BUG: Bad rss-counter state mm:000000008eb9341b type:MM_ANONPAGES val:36
> [ 5.689774] BUG: Bad rss-counter state mm:00000000edbda345 type:MM_ANONPAGES val:36
> [ 5.692187] BUG: Bad rss-counter state mm:000000003f7ec21f type:MM_ANONPAGES val:36
> [ 5.705947] BUG: Bad rss-counter state mm:00000000cdbb7cfd type:MM_ANONPAGES val:36
> [ 6.550855] Kernel panic - not syncing: Fatal exception
> [ 6.568226] Rebooting in 10 seconds..
>
> The problem was introduced in 6.4.0-rc7-next-20230619. I tried git bisect, but I am unsure of the
> result it reported. Bisect points to the following merge commit:
>
> # git bisect bad
> 70c94cc2eefd4f98d222834cbe7512804977c2d4 is the first bad commit
> commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
> Merge: 48f5ee5c48c3 3fe08f7d5e80
> Author: Stephen Rothwell <sfr@canb.auug.org.au>
> Date: Tue Jun 20 09:43:25 2023 +1000
>
> Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> # Conflicts:
> # mm/mmap.c
Usually bisect pointing to a merge means something has gone wrong with
the bisect. It's not impossible for a merge to be the cause of a bug,
but IME it's rare.
In this case though the merge itself has a reasonably large diff, so
it's more likely that the merge itself has introduced a bug.
commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
Merge: 48f5ee5c48c3 3fe08f7d5e80
Author: Stephen Rothwell <sfr@canb.auug.org.au>
AuthorDate: Tue Jun 20 09:43:25 2023 +1000
Commit: Stephen Rothwell <sfr@canb.auug.org.au>
CommitDate: Tue Jun 20 09:43:25 2023 +1000
Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
# Conflicts:
# mm/mmap.c
diff --cc mm/mmap.c
index 98cda6f72605,474a0d856622..9a93b054148a
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@@ -2398,15 -2409,27 +2396,29 @@@ do_vmi_align_munmap(struct vma_iterato
if (error)
goto end_split_failed;
}
- mas_set(&mas_detach, count);
- error = munmap_sidetree(next, &mas_detach);
- if (error)
- goto munmap_sidetree_failed;
+ vma_start_write(next);
- mas_set_range(&mas_detach, next->vm_start, next->vm_end - 1);
+ if (mas_store_gfp(&mas_detach, next, GFP_KERNEL))
+ goto munmap_gather_failed;
+ vma_mark_detached(next, true);
+ if (next->vm_flags & VM_LOCKED)
+ locked_vm += vma_pages(next);
count++;
+ if (unlikely(uf)) {
+ /*
+ * If userfaultfd_unmap_prep returns an error the vmas
+ * will remain split, but userland will get a
+ * highly unexpected error anyway. This is no
+ * different than the case where the first of the two
+ * __split_vma fails, but we don't undo the first
+ * split, despite we could. This is unlikely enough
+ * failure that it's not worth optimizing it for.
+ */
+ error = userfaultfd_unmap_prep(next, start, end, uf);
+
+ if (error)
+ goto userfaultfd_error;
+ }
#ifdef CONFIG_DEBUG_VM_MAPLE_TREE
BUG_ON(next->vm_start < start);
BUG_ON(next->vm_start > end);
@@@ -2454,14 -2455,18 +2444,20 @@@
BUG_ON(count != test_count);
}
#endif
- /* Point of no return */
+ error = -ENOMEM;
- vma_iter_set(vmi, start);
+ while (vma_iter_addr(vmi) > start)
+ vma_iter_prev_range(vmi);
+
if (vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL))
- return -ENOMEM;
+ goto clear_tree_failed;
+ mm->locked_vm -= locked_vm;
mm->map_count -= count;
+ prev = vma_iter_prev_range(vmi);
+ next = vma_next(vmi);
+ if (next)
+ vma_iter_prev_range(vmi);
+
/*
* Do not downgrade mmap_lock if we are next to VM_GROWSDOWN or
* VM_GROWSUP VMA. Such VMAs can change their size under
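One way to sanity-check a bisect that lands on a merge is to confirm from the bisect log that both parents of the merge were tested good while the merge itself tested bad. The sketch below does that with verdict lines copied from the bisect log in the original report and the abbreviated parents from the `Merge:` line. The function names are hypothetical, and only the relevant log lines are included.

```python
import re

# Verdict lines copied from the bisect log in the original report.
BISECT_LOG = """\
# bad: [9dbf40840551df336c95ce2a3adbdd25ed53c0ef] Add linux-next specific files for 20230620
# good: [45a3e24f65e90a047bef86f927ebdc4c710edaa1] Linux 6.4-rc7
# good: [48f5ee5c48c342bd82fa04eefc8a41048a6165fc] Merge branch 'mm-nonmm-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
# good: [3fe08f7d5e80b3f822673b70fcc6be8dbee58f76] Merge branch 'mm-nonmm-unstable' into mm-everything
# bad: [70c94cc2eefd4f98d222834cbe7512804977c2d4] Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
"""

MERGE = "70c94cc2eefd"                      # first bad commit (abbreviated)
PARENTS = ["48f5ee5c48c3", "3fe08f7d5e80"]  # from the "Merge:" line


def parse_verdicts(log):
    """Map each full commit hash in the log to its 'good' or 'bad' verdict."""
    verdicts = {}
    for m in re.finditer(r"^# (good|bad): \[([0-9a-f]{40})\]", log, re.M):
        verdicts[m.group(2)] = m.group(1)
    return verdicts


def verdict_for(verdicts, abbrev):
    """Look up a verdict by abbreviated hash; None if missing or ambiguous."""
    matches = [v for sha, v in verdicts.items() if sha.startswith(abbrev)]
    return matches[0] if len(matches) == 1 else None


verdicts = parse_verdicts(BISECT_LOG)
merge_is_culprit = (
    verdict_for(verdicts, MERGE) == "bad"
    and all(verdict_for(verdicts, p) == "good" for p in PARENTS)
)
print(merge_is_culprit)
```

With both parents good and the merge bad, the log is internally consistent. The remaining suspects are the conflict resolution in mm/mmap.c or an interaction between the two branches.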
cheers
* Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR
From: Sachin Sant @ 2023-06-22 8:01 UTC
To: Michael Ellerman; +Cc: linux-mm, linuxppc-dev
>> The problem was introduced in 6.4.0-rc7-next-20230619. I tried git bisect, but I am unsure of the
>> result it reported. Bisect points to the following merge commit:
>>
>> # git bisect bad
>> 70c94cc2eefd4f98d222834cbe7512804977c2d4 is the first bad commit
>> commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
>> Merge: 48f5ee5c48c3 3fe08f7d5e80
>> Author: Stephen Rothwell <sfr@canb.auug.org.au>
>> Date: Tue Jun 20 09:43:25 2023 +1000
>>
>> Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>> # Conflicts:
>> # mm/mmap.c
>
> Usually bisect pointing to a merge means something has gone wrong with
> the bisect. It's not impossible for a merge to be the cause of a bug,
> but IME it's rare.
>
I have tried the bisect 3 times and the result was the same each time. It always
points to this merge commit.
Is there anything else I can try to help debug this issue?
-Sachin
* Bad linux-next merge? (was Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR)
From: Michael Ellerman @ 2023-06-22 12:54 UTC
To: Sachin Sant; +Cc: linux-mm, linuxppc-dev, Stephen Rothwell
Sachin Sant <sachinp@linux.ibm.com> writes:
>>> The problem was introduced in 6.4.0-rc7-next-20230619. I tried git bisect, but I am unsure of the
>>> result it reported. Bisect points to the following merge commit:
>>>
>>> # git bisect bad
>>> 70c94cc2eefd4f98d222834cbe7512804977c2d4 is the first bad commit
>>> commit 70c94cc2eefd4f98d222834cbe7512804977c2d4
>>> Merge: 48f5ee5c48c3 3fe08f7d5e80
>>> Author: Stephen Rothwell <sfr@canb.auug.org.au>
>>> Date: Tue Jun 20 09:43:25 2023 +1000
>>>
>>> Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>> # Conflicts:
>>> # mm/mmap.c
>>
>> Usually bisect pointing to a merge means something has gone wrong with
>> the bisect. It's not impossible for a merge to be the cause of a bug,
>> but IME it's rare.
>
> I have tried the bisect 3 times and the result was the same each time. It always
> points to this merge commit.
>
> Is there anything else I can try to help debug this issue?
Looks like it's been reported, debugged and fixed over here:
https://lore.kernel.org/linux-next/20230619204309.GA13937@willie-the-truck/
So hopefully it will be resolved in today's or tomorrow's linux-next.
cheers