From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Venkat Rao Bagalkote <venkat88@linux.ibm.com>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>,
Ritesh Harjani <ritesh.list@gmail.com>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
LKML <linux-kernel@vger.kernel.org>,
Srikar Dronamraju <srikar@linux.ibm.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [linux-next20260529] kernel BUG at kernel/sched/core.c:7512!
Date: Mon, 1 Jun 2026 14:46:24 +0530
Message-ID: <2f8c3d75-de2c-48bf-bd05-46b816d55c69@linux.ibm.com>
In-Reply-To: <7904105b-9dfa-4efd-a5ef-bc0276ed255d@linux.ibm.com>
Hi Venkat. Thanks for the report.
+ mukesh, ritesh
On 6/1/26 12:11 PM, Venkat Rao Bagalkote wrote:
> Greetings!!!
>
>
> I hit a kernel BUG on a linux-next kernel running on ppc64le (Power11
> LPAR). The issue was observed once in CI (Avocado tests) and I haven’t
> been able to reproduce it reliably yet.
>
Can you run with lockdep and see if you can hit it?
> Architecture: ppc64le (Power11, pSeries)
> Kernel: 7.1.0-rc5-next-20260529
> Config: PREEMPT(lazy)
> CPUs: large system (NR_CPUS=8192)
>
This is with GENERIC_ENTRY.
>
> So far, I have not reproduced the crash, but I am trying to stress
> similar conditions using:
>
> parallel read workloads (fio / dd)
> memory pressure
>
>
> Traces:
>
> (5/8) /home/upstreamci/avocado-fvt-wrapper/tests/avocado-misc-tests/
> cpu/ppc64_cpu_test.py:PPC64Test.test_smt_loop;run-run_type-
> upstream-9cfe: STARTED
> [ 1885.176400] crash hp: kexec_trylock() failed, kdump image may be
> inaccurate
> [ 1885.296164] crash hp: kexec_trylock() failed, kdump image may be
> inaccurate
> [ 1885.386120] crash hp: kexec_trylock() failed, kdump image may be
> inaccurate
> [ 1885.556134] crash hp: kexec_trylock() failed, kdump image may be
> inaccurate
> [ 1886.576119] crash hp: kexec_trylock() failed, kdump image may be
> inaccurate
> [ 1886.806060] crash hp: kexec_trylock() failed, kdump image may be
> inaccurate
> [ 1887.026051] crash hp: kexec_trylock() failed, kdump image may be
> inaccurate
> [ 1887.456075] ------------[ cut here ]------------
> [ 1887.456101] kernel BUG at kernel/sched/core.c:7512!
> [ 1887.456107] Oops: Exception in kernel mode, sig: 5 [#1]
> [ 1887.456111] LE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=8192 NUMA pSeries
> [ 1887.456116] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6
> nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct
> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding
> tls ip_set rfkill nf_tables fsdev_dax kmem device_dax pseries_rng
> vmx_crypto dax_pmem fuse ext4 crc16 mbcache jbd2 sd_mod nd_pmem papr_scm
> sg libnvdimm ibmvscsi ibmveth scsi_transport_srp pseries_wdt
> [ 1887.456173] CPU: 28 UID: 0 PID: 85305 Comm: kexec Not tainted 7.1.0-
> rc5-next-20260529 #1 PREEMPT(lazy)
> [ 1887.456180] Hardware name: IBM,9080-HEX Power11 (architected)
> 0x820200 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
> [ 1887.456185] NIP: c0000000013a8e8c LR: c0000000003483bc CTR:
> 0000000000000000
> [ 1887.456190] REGS: c000000069f03070 TRAP: 0700 Not tainted (7.1.0-
> rc5-next-20260529)
> [ 1887.456195] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR:
> 24428222 XER: 0000005a
> [ 1887.456208] CFAR: c0000000003483b8 IRQMASK: 0
> [ 1887.456208] GPR00: c0000000003483bc c000000069f03330 c000000001a82100
> c000000069f033e0
> [ 1887.456208] GPR04: 0000000000000000 0000000000000001 0000000000000001
> c000000006dd3b00
> [ 1887.456208] GPR08: ffffffffffffff00 0000000000000001 0000000000000000
> 0000000024428220
> [ 1887.456208] GPR12: 0000000000000300 c000000effdbef00 0000000000000000
> 0000000000000000
> [ 1887.456208] GPR16: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 1887.456208] GPR20: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 1887.456208] GPR24: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 1887.456208] GPR28: 0000000000000000 0000000000000000 0000000000000000
> c000000069f033e0
> [ 1887.456265] NIP [c0000000013a8e8c] preempt_schedule_irq+0x44/0x118
> [ 1887.456274] LR [c0000000003483bc]
> dynamic_irqentry_exit_cond_resched+0x40/0x1a4
> [ 1887.456282] Call Trace:
> [ 1887.456284] [c000000069f03360] [c0000000003483bc]
> dynamic_irqentry_exit_cond_resched+0x40/0x1a4
> [ 1887.456291] [c000000069f03380] [c00000000014f3bc]
> do_page_fault+0xc0/0x104
> [ 1887.456298] [c000000069f033b0] [c000000000008be0]
> data_access_common_virt+0x210/0x220
> [ 1887.456306] ---- interrupt: 300 at __copy_tofrom_user_base+0xac/0x5a4
> [ 1887.456313] NIP: c00000000017fc38 LR: c000000000aaa684 CTR:
> 0000000000000000
> [ 1887.456317] REGS: c000000069f033e0 TRAP: 0300 Not tainted (7.1.0-
> rc5-next-20260529)
> [ 1887.456322] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR:
> 24428220 XER: 2004005a
> [ 1887.456334] CFAR: c00000000017fc34 DAR: 00003fff879a8000 DSISR:
> 42000000 IRQMASK: 0
> [ 1887.456334] GPR00: 0000000000000000 c000000069f036a0 c000000001a82100
> 00003fff879a8000
> [ 1887.456334] GPR04: c0000000bb314ff0 0000000000001000 69f0000606480600
> 0200c4080368f028
> [ 1887.456334] GPR08: 09036af00005d9c4 0600000200e80803 0000000000000000
> 0000000000000030
> [ 1887.456334] GPR12: 0000000000000040 c000000effdbef00 0000000000000000
> 000000000000000e
> [ 1887.456334] GPR16: 0000000004a00000 000000000000001f c000000069f038a0
> c00000006e73e500
> [ 1887.456334] GPR20: c00000006f0ff6a8 0000000000000000 c00000006f0ff540
> 0000000000000001
> [ 1887.456334] GPR24: 000000001816ce60 c0000000bb314000 c000000002e48730
> c000000069f03a30
> [ 1887.456334] GPR28: c0000000bb314000 00003fff879a7010 0000000000000010
> 0000000000001000
> [ 1887.456393] NIP [c00000000017fc38] __copy_tofrom_user_base+0xac/0x5a4
> [ 1887.456399] LR [c000000000aaa684] raw_copy_to_user+0x12c/0x314
> [ 1887.456405] ---- interrupt: 300
> [ 1887.456408] [c000000069f036a0] [c000000000aaa5f4]
> raw_copy_to_user+0x9c/0x314 (unreliable)
> [ 1887.456416] [c000000069f036e0] [c000000000aacd08]
> _copy_to_iter+0xe4/0x79c
> [ 1887.456423] [c000000069f037a0] [c000000000ab01ec]
> copy_page_to_iter+0xd4/0x1a4
> [ 1887.456429] [c000000069f037f0] [c0000000005ddc34]
> filemap_read+0x420/0x4f0
> [ 1887.456436] [c000000069f039c0] [c0080000043443e0]
> ext4_file_read_iter+0x78/0x31c [ext4]
> [ 1887.456517] [c000000069f03a10] [c000000000796498] vfs_read+0x2a8/0x3c8
> [ 1887.456524] [c000000069f03ac0] [c00000000079726c] ksys_read+0x88/0x140
> [ 1887.456530] [c000000069f03b10] [c000000000032f98]
> system_call_exception+0x198/0x4e0
> [ 1887.456537] [c000000069f03e30] [c00000000000d05c]
> system_call_vectored_common+0x15c/0x2ec
> [ 1887.456544] ---- interrupt: 3000 at 0x3fff9b133cf4
> [ 1887.456549] NIP: 00003fff9b133cf4 LR: 00003fff9b133cf4 CTR:
> 0000000000000000
> [ 1887.456554] REGS: c000000069f03e60 TRAP: 3000 Not tainted (7.1.0-
> rc5-next-20260529)
> [ 1887.456558] MSR: 800000000000f033 <SF,EE,PR,FP,ME,IR,DR,RI,LE> CR:
> 44424402 XER: 00000000
> [ 1887.456572] IRQMASK: 0
> [ 1887.456572] GPR00: 0000000000000003 00003fffe5fb4190 0000000105087f00
> 0000000000000003
> [ 1887.456572] GPR04: 00003fff82e93010 000000001816ce60 0000000000000022
> 0000000000000000
> [ 1887.456572] GPR08: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 1887.456572] GPR12: 0000000000000000 00003fff9b4cd860 000000010507f588
> 0000000000000000
> [ 1887.456572] GPR16: ffffffffffffffff 0000000000000000 0000000000000006
> 0000000000000000
> [ 1887.456572] GPR20: 0000000000000001 00003fff9b23039c 00003fff9b2303a0
> 00003fffe5fb5ee7
> [ 1887.456572] GPR24: 0000000000000000 0000000000000000 00003fffe5fb5ee7
> 00003fffe5fb42d0
> [ 1887.456572] GPR28: 0000000000000003 00003fff82e93010 000000001816ce60
> 0000000000000000
> [ 1887.456626] NIP [00003fff9b133cf4] 0x3fff9b133cf4
> [ 1887.456630] LR [00003fff9b133cf4] 0x3fff9b133cf4
> [ 1887.456634] ---- interrupt: 3000
> [ 1887.456637] Code: fbe1fff8 e92d0128 f8010010 f821ffd1 81490000
> 39200001 2c0a0000 40820014 892d0152 552907fe 7d290034 5529d97e
> <0b090000> 60000000 3bc00000 ebed0128
> [ 1887.456657] ---[ end trace 0000000000000000 ]---
>
>
> If you happen to fix this, please add below tag.
>
> Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>
Ritesh, Mukesh, is the scenario below possible?
do_page_fault seems to enable IRQs inside the interrupt handler.
Is that expected? If so, one might see:
-- do_page_fault (entered from kernel mode)
-- enables interrupts
-- takes an interrupt, which sets need_resched
-- irqentry_exit sees it is kernel mode, checks only preempt_count,
and calls preempt_schedule_irq, which BUGs on either a non-zero
preempt_count or !irqs_disabled. Hence the BUG?
Should do_page_fault do preempt_disable when it enables interrupts?
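To make the suspected sequence concrete, here is a minimal user-space sketch of the two checks involved. It is only a model: struct cpu_state, its fields and the three functions are hypothetical names made up for illustration, not kernel code. The one thing taken from the kernel is the condition itself, BUG_ON(preempt_count() || !irqs_disabled()) at the top of preempt_schedule_irq().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the state involved.
 * The struct and field names are illustrative, not kernel types. */
struct cpu_state {
	int  preempt_count;  /* non-zero: preemption is disabled */
	bool irqs_disabled;  /* true: interrupts are masked */
	bool need_resched;   /* true: a reschedule has been requested */
};

/* Condition guarded by BUG_ON() at the top of preempt_schedule_irq():
 * the caller must arrive with preempt_count == 0 and IRQs disabled. */
bool preempt_schedule_irq_would_bug(const struct cpu_state *s)
{
	return s->preempt_count != 0 || !s->irqs_disabled;
}

/* Kernel-mode exit path as described in the scenario: only
 * preempt_count and need_resched are consulted before calling
 * preempt_schedule_irq(); the IRQ state is not checked here. */
bool irq_exit_would_bug(const struct cpu_state *s)
{
	if (s->preempt_count != 0 || !s->need_resched)
		return false;	/* preempt_schedule_irq() is never reached */
	return preempt_schedule_irq_would_bug(s);
}

/* Walks the suspected sequence, then two variants that avoid the BUG. */
void check_scenario(void)
{
	struct cpu_state s = { .preempt_count = 0, .irqs_disabled = true,
			       .need_resched = false };

	s.irqs_disabled = false;	/* fault handler enables interrupts */
	s.need_resched = true;		/* an interrupt requests a reschedule */
	assert(irq_exit_would_bug(&s));	/* exit path trips the BUG_ON */

	s.preempt_count = 1;		/* variant: preemption disabled first */
	assert(!irq_exit_would_bug(&s));

	s.preempt_count = 0;
	s.irqs_disabled = true;		/* variant: IRQs masked again before exit */
	assert(!irq_exit_would_bug(&s));
}
```

In this model, the exit path reaches the BUG condition only when preempt_count is zero, need_resched is set and interrupts are still enabled. Whether the real powerpc path can get into that state is exactly the open question above; the sketch does not answer it.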