Linux-Next discussions
* linux-next crashes in scheduler on s390
@ 2026-06-10 12:47 Alexander Gordeev
  2026-06-10 13:03 ` Sven Schnelle
From: Alexander Gordeev @ 2026-06-10 12:47 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	K Prateek Nayak, Mark Brown
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, John Stultz, Vineeth Pillai, Joel Fernandes,
	Heiko Carstens, Vasily Gorbik, linux-s390, linux-kernel,
	linux-next

Hi All,

Since about June 1st, running the strace test suite (make -j$(nproc) check)
has been triggering crashes on s390 in linux-next. They are easy to
reproduce, but I have not been able to narrow them down to a particular
commit or merge.

I am going to bisect it, but since we are approaching the v7.1 release, any
hint would be greatly appreciated!

[ 2425.124912] ------------[ cut here ]------------
[ 2425.124926] WARNING: kernel/sched/sched.h:1792 at set_next_task_idle+0xd2/0x150, CPU#7: strace/893382
[ 2425.124937] Modules linked in: mptcp_diag xfrm_user xfrm_algo crypto_user tcp_diag inet_diag netlink_diag tls quota_v2 quota_tree tun overlay nls_iso8859_1 nls_cp437 exfat vfat fat loop sctp udp_tunnel ip6_udp_tunnel nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables s390_trng eadm_sch vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel drm i2c_core drm_panel_orientation_quirks dm_service_time diag288_wdt prng aes_s390 dm_mirror dm_region_hash dm_log zfcp scsi_transport_fc pkey_cca pkey_ep11 zcrypt paes_s390 phmac_s390 rng_core pkey_pckmo pkey scsi_dh_rdac crypto_engine scsi_dh_alua scsi_dh_emc dm_multipath autofs4 ecdsa_generic ecc sha512
[ 2425.125058] CPU: 7 UID: 1001 PID: 893382 Comm: strace Tainted: G        W           7.1.0-20260607.rc6.git0.6e845bcb78c9.300.fc44.s390x+next #1 PREEMPTLAZY
[ 2425.125065] Tainted: [W]=WARN
[ 2425.125068] Hardware name: IBM 3906 M03 703 (LPAR)
[ 2425.125071] Krnl PSW : 0404d00180000000 000003bf4c78da26 (set_next_task_idle+0xd6/0x150)
[ 2425.125080]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[ 2425.125085] Krnl GPRS: 0000000000000000 0000000000000154 0000000000000154 000003bf4e7baba8
[ 2425.125090]            0000000000000000 0000000000000154 000003bf00000000 000003bf4e2d9038
[ 2425.125093]            0000000081778000 000003bf4e7a5a00 0000000081778000 00000003ac6dba00
[ 2425.125097]            0000000000000000 0000000000000000 000003bf4c78d980 0000033f5137bac0
[ 2425.125108] Krnl Code: 000003bf4c78da18: e55dbf200001        clfhsi  3872(%r11),1
[ 2425.125108]            000003bf4c78da1e: a724ffb9            brc     2,000003bf4c78d990
[ 2425.125108]           *000003bf4c78da22: af000000            mc      0,0
[ 2425.125108]           >000003bf4c78da26: a7f4ffb5            brc     15,000003bf4c78d990
[ 2425.126081]  [<000003bf4c764f16>] prepare_task_switch+0x216/0x230
[ 2425.126087]  [<000003bf4d61b9b4>] __schedule+0x254/0x8d0
[ 2425.126091]  [<000003bf4d61c06c>] schedule+0x3c/0xc0
[ 2425.126095]  [<000003bf4c72b40c>] do_wait+0x6c/0x190
[ 2425.126100]  [<000003bf4c72b9da>] kernel_wait4+0xaa/0x150
[ 2425.126105]  [<000003bf4c72bafe>] __do_sys_wait4+0x7e/0x90
[ 2425.126110]  [<000003bf4d6173d0>] __do_syscall+0x150/0x590
[ 2425.126114]  [<000003bf4d6264b2>] system_call+0x72/0x90
[ 2425.126119] Last Breaking-Event-Address:
[ 2425.126121]  [<000003bf4c764d36>] prepare_task_switch+0x36/0x230
[ 2425.126127] ---[ end trace 0000000000000000 ]---

[ 1569.523962] ------------[ cut here ]------------
[ 1569.523974] WARNING: kernel/sched/core.c:6388 at pick_next_task+0xb82/0xbf0, CPU#7: swapper/7/0
[ 1569.523986] Modules linked in: mptcp_diag xfrm_user xfrm_algo tcp_diag crypto_user inet_diag netlink_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ip
f_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables s390_trng vfio_ccw mdev eadm_sch vfio_iommu_type1 vfio sch_fq_codel drm i2c_core drm_panel_orientation_quirks dm_service_time uvdevice diag288_wdt prng aes_s390 dm_mirror dm_region_hash dm_log zfcp scsi_transpo
c pkey_cca pkey_ep11 zcrypt paes_s390 phmac_s390 rng_core pkey_pckmo scsi_dh_alua pkey scsi_dh_emc crypto_engine scsi_dh_rdac dm_multipath autofs4 ecdsa_generic ecc sha512
[ 1569.524037] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Tainted: G        W           7.1.0-20260607.rc6.git0.6e845bcb78c9.300.fc44.s390x+next #1 PREEMPTLAZY
[ 1569.524042] Tainted: [W]=WARN
[ 1569.524043] Hardware name: IBM 3931 A01 701 (LPAR)
[ 1569.524045] Krnl PSW : 0404c00180000000 0000027673df72a6 (pick_next_task+0xb86/0xbf0)
[ 1569.524050]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 1569.524053] Krnl GPRS: 00000000000002ed 0000027600000190 0000000000000190 0000027675e4abb0
[ 1569.524056]            0000000000000000 00000004e04c1a00 0000027600000002 00000004e04c1a00
[ 1569.524058]            0000000000000000 00000004e04e3a00 00000004e04e3a00 0000000000000190
[ 1569.524060]            0000000000000000 00000004000002ed 0000027673df6cd2 000001f673cc3c98
[ 1569.524069] Krnl Code: 0000027673df7298: ec56fcb6ff7e        cij     %r5,-1,6,0000027673df6c04
[ 1569.524069]            0000027673df729e: a7f4ff84            brc     15,0000027673df71a6
[ 1569.524069]           *0000027673df72a2: af000000            mc      0,0
[ 1569.524069]           >0000027673df72a6: a7f4fdef            brc     15,0000027673df6e84
[ 1569.524069]            0000027673df72aa: b9040087            lgr     %r8,%r7
[ 1569.524069]            0000027673df72ae: a7f4fb3c            brc     15,0000027673df6926
[ 1569.524069]            0000027673df72b2: af000000            mc      0,0
[ 1569.524069]            0000027673df72b6: a7f4ff2c            brc     15,0000027673df710e
[ 1569.524200] Call Trace:
[ 1569.524202]  [<0000027673df72a6>] pick_next_task+0xb86/0xbf0
[ 1569.524206] ([<0000027673df6b5e>] pick_next_task+0x43e/0xbf0)
[ 1569.524210]  [<0000027674cab926>] __schedule+0x1c6/0x8d0
[ 1569.524214]  [<0000027674cac1a6>] schedule_idle+0x36/0x60
[ 1569.524216]  [<0000027673e22f04>] do_idle+0x124/0x1a0
[ 1569.524221]  [<0000027673e23170>] cpu_startup_entry+0x40/0x50
[ 1569.524225]  [<0000027673d76994>] smp_start_secondary+0x144/0x160
[ 1569.524230]  [<0000027674cb688a>] restart_int_handler+0x72/0x88
[ 1569.524235] Last Breaking-Event-Address:
[ 1569.524236]  [<0000027673df6e80>] pick_next_task+0x760/0xbf0
[ 1569.524240] ---[ end trace 0000000000000000 ]---

[ 3533.502832] ------------[ cut here ]------------
[ 3533.502837] WARNING: kernel/sched/fair.c:7656 at hrtick_start_fair+0x6e/0x80, CPU#20: strace/2964820
[ 3533.502847] Modules linked in: mptcp_diag xfrm_user xfrm_algo tcp_diag crypto_user inet_diag netlink_diag tls quota_v2 quota_tree tun overlay nls_iso8859_1 nls_cp437 exfat vfat fat
 sctp udp_tunnel ip6_udp_tunnel algif_hash af_alg smc_diag smc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_ib ib_uverbs_support ib_core mlx5_vdpa vdpa vringh vhost_iotlb nf_tables mlx5_core s390_trng vfio_ccw mdev vfio_iommu_type1 eadm_
vfio sch_fq_codel drm i2c_core drm_panel_orientation_quirks dm_service_time uvdevice diag288_wdt hmac_s390 prng aes_s390 dm_mirror dm_region_hash dm_log zfcp scsi_transport_fc pkey_cca
y_ep11 zcrypt paes_s390 phmac_s390 rng_core pkey_pckmo pkey scsi_dh_alua scsi_dh_rdac scsi_dh_emc crypto_engine dm_multipath autofs4 ecdsa_generic ecc sha512 [last unloaded: trace_prin

[ 3533.502908] CPU: 20 UID: 1001 PID: 2964820 Comm: strace Tainted: G        W           7.1.0-20260607.rc6.git0.6e845bcb78c9.300.fc44.s390x+next #1 PREEMPTLAZY
[ 3533.502912] Tainted: [W]=WARN
[ 3533.502914] Hardware name: IBM 9175 ME1 701 (LPAR)
[ 3533.502915] Krnl PSW : 0404e00180000000 0000033a5a572be2 (hrtick_start_fair+0x72/0x80)
[ 3533.502921]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 3533.502924] Krnl GPRS: 00000000000000ff 00000003e4c9fa00 00000003e4c7da00 000000012cb72400
[ 3533.502926]            00000000000000a8 0000033a5c0d5038 00000003e4c7e9b0 00000003e4c7da00
[ 3533.502928]            000000012cb72400 0000033a5c0d6090 0000000300000014 000000012cb72548
[ 3533.502930]            0000000000000000 00000c0700000001 0000033a5a57f640 000002ba61a33a98
[ 3533.502938] Krnl Code: 0000033a5a572bd4: 47000700                bc      0,1792
                          0000033a5a572bd8: c0f4ffff7c64    brcl    15,0000033a5a5624a0
                         *0000033a5a572bde: af000000                mc      0,0
                         >0000033a5a572be2: a7f4ffdc                brc     15,0000033a5a572b9a
                          0000033a5a572be6: 0707            bcr     0,%r7
                          0000033a5a572be8: 0707            bcr     0,%r7
                          0000033a5a572bea: 0707            bcr     0,%r7
                          0000033a5a572bec: 0707            bcr     0,%r7
[ 3533.502985] Call Trace:
[ 3533.502988]  [<0000033a5a572be2>] hrtick_start_fair+0x72/0x80
[ 3533.502991] ([<0000033a5a57f4ae>] set_next_task_fair+0x12e/0x5d0)
[ 3533.502995]  [<0000033a5a56284c>] pick_next_task+0x12c/0xbf0
[ 3533.503001]  [<0000033a5b417926>] __schedule+0x1c6/0x8d0
[ 3533.503004]  [<0000033a5b41806c>] schedule+0x3c/0xc0
[ 3533.503006]  [<0000033a5a52740c>] do_wait+0x6c/0x190
[ 3533.503012]  [<0000033a5a5279da>] kernel_wait4+0xaa/0x150
[ 3533.503014]  [<0000033a5a527afe>] __do_sys_wait4+0x7e/0x90
[ 3533.503017]  [<0000033a5b4133cc>] __do_syscall+0x14c/0x590
[ 3533.503022]  [<0000033a5b4224b2>] system_call+0x72/0x90
[ 3533.503025] Last Breaking-Event-Address:
[ 3533.503026]  [<0000033a5a572b94>] hrtick_start_fair+0x24/0x80
[ 3533.503030] ---[ end trace 0000000000000000 ]---
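
The three warnings above fire at different scheduler sites (sched.h, core.c, and fair.c). When comparing many runs, it can help to reduce each log to its WARNING sites. The following is a minimal sketch, assuming the log has been saved as plain dmesg text; the helper name `warn_sites` and the regular expression are illustrative and not part of any kernel tooling.

```python
import re
from collections import Counter

# Matches header lines such as:
# [ 2425.124926] WARNING: kernel/sched/sched.h:1792 at set_next_task_idle+0xd2/0x150, CPU#7: strace/893382
# The pattern is an assumption based on the splats quoted in this thread.
WARN_RE = re.compile(
    r"WARNING: (?P<file>\S+):(?P<line>\d+) at (?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+, "
    r"CPU#(?P<cpu>\d+): (?P<comm>.+)/(?P<pid>\d+)\s*$"
)


def warn_sites(log_text):
    """Count WARNING sites, keyed by (source file, line, function)."""
    sites = Counter()
    for line in log_text.splitlines():
        m = WARN_RE.search(line)
        if m:
            sites[(m["file"], int(m["line"]), m["func"])] += 1
    return sites
```

Calling `warn_sites()` on the text of a saved log returns a `Counter`, so `most_common()` shows which site fires most often across a run.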

Thanks!


* Re: linux-next crashes in scheduler on s390
  2026-06-10 12:47 linux-next crashes in scheduler on s390 Alexander Gordeev
@ 2026-06-10 13:03 ` Sven Schnelle
From: Sven Schnelle @ 2026-06-10 13:03 UTC
  To: Peter Zijlstra
  Cc: Alexander Gordeev, Ingo Molnar, Juri Lelli, Vincent Guittot,
	K Prateek Nayak, Mark Brown, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Valentin Schneider, John Stultz,
	Vineeth Pillai, Joel Fernandes, Heiko Carstens, Vasily Gorbik,
	linux-s390, linux-kernel, linux-next, Aaron Lu

Alexander Gordeev <agordeev@linux.ibm.com> writes:

> Hi All,
>
> Since about June 1st, running the strace test suite (make -j$(nproc) check)
> has been triggering crashes on s390 in linux-next. They are easy to
> reproduce, but I have not been able to narrow them down to a particular
> commit or merge.
>
> I am going to bisect it, but since we are approaching the v7.1 release, any
> hint would be greatly appreciated!
> [..]

I bisected it to
https://lore.kernel.org/all/20260511120627.944705718@infradead.org/
("[PATCH v2 08/10] sched/fair: Add newidle balance to pick_task_fair()")

Adding the patch proposed in
https://lore.kernel.org/all/20260603095108.GA1684319@bytedance.com/
fixes the issue for me.

Running the strace test suite seems to be enough to reproduce it. If
required, I can try to figure out the exact test that crashes the kernel.
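
For anyone wanting to automate the reproduction, the following is a rough sketch, assuming a built strace source tree and a readable `dmesg`. The function names, the `max_runs` limit, and the use of `os.cpu_count()` in place of `nproc` are all illustrative choices, not something taken from this thread.

```python
import os
import subprocess


def count_warnings(dmesg_text):
    """Count kernel WARNING lines in dmesg output."""
    return sum(1 for line in dmesg_text.splitlines() if "WARNING:" in line)


def read_dmesg():
    """Return the current kernel log; may need root if dmesg is restricted."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


def reproduce(strace_dir, max_runs=10):
    """Run the strace test suite until a new WARNING appears.

    Returns the number of the run after which the warning count grew,
    or None if nothing new showed up within max_runs.
    """
    jobs = os.cpu_count() or 1
    baseline = count_warnings(read_dmesg())
    for run in range(1, max_runs + 1):
        # Test failures are expected to be possible, so do not raise on them.
        subprocess.run(
            ["make", f"-j{jobs}", "check"], cwd=strace_dir, check=False
        )
        if count_warnings(read_dmesg()) > baseline:
            return run
    return None
```

Comparing counts is only approximate: if the kernel ring buffer wraps between runs, older warnings drop out and the count may not grow. Clearing or timestamp-filtering the log would be more robust.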

