From: Vasily Gorbik <gor@linux.ibm.com>
To: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
John Stultz <jstultz@google.com>,
"Connor O'Brien" <connoro@google.com>,
Vineeth Pillai <vineethrp@google.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 0/1] sched/core: Don't steal a proxy-exec donor
Date: Mon, 4 May 2026 14:31:52 +0200
Message-ID: <c00-01.ttedd70@ub.hpns>
Since the sched-core-2026-04-13 pull, s390 hits the following splats with
defconfig. Running the strace test suite ("make -j$(nproc) check") a couple
of times on an LPAR with 64 SMT-2 cores is usually enough to trigger this.
The first WARN is in put_prev_entity() on a strace task. The next pick on
the same CPU typically WARNs again, and 60s later the system is in an
rcu_sched stall:
[ 535.525203] WARNING: kernel/sched/fair.c:5788 at put_prev_entity+0xfe/0x170, CPU#26: grep/242219
[ 535.525212] Modules linked in: mptcp_diag xfrm_user xfrm_algo tcp_diag crypto_user inet_diag netlink_diag algif_hash af_alg dm_service_time nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables zfcp scsi_transport_fc s390_trng eadm_sch vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel drm i2c_core loop dm_multipath drm_panel_orientation_quirks nfnetlink uvdevice diag288_wdt prng aes_s390 scsi_dh_rdac scsi_dh_emc scsi_dh_alua phmac_s390 paes_s390 crypto_engine pkey_cca pkey_ep11 zcrypt rng_core pkey_pckmo pkey autofs4 ecdsa_generic ecc sha512 [last unloaded: trace_printk]
[ 535.525268] CPU: 26 UID: 1001 PID: 242219 Comm: grep Not tainted 7.1.0-20260426.rc0.git0.897d54018cc9.300.fc43.s390x+git #1 PREEMPTLAZY
[ 535.525272] Hardware name: IBM 8561 T01 703 (LPAR)
[ 535.525273] Krnl PSW : 0404e00180000000 0000033840f08482 (put_prev_entity+0x102/0x170)
[ 535.525279] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 535.525282] Krnl GPRS: 000000000000b8ee 0000000000000000 0000000000000000 0000000000000800
[ 535.525284] 0000000000000000 0000007cafc9c400 0000033840f084f0 00000003e5512b68
[ 535.525287] 00000003e5512b70 0000007cafc9c516 00000000dcd88400 00000001d3b8c900
[ 535.525289] 0000000000000000 00000001d3b8c800 0000033840f08474 000002b84839fbe0
[ 535.525297] Krnl Code: 0000033840f08474: e3b0a0580020 cg %r11,88(%r10)
0000033840f0847a: a784ffa5 brc 8,0000033840f083c4
*0000033840f0847e: af000000 mc 0,0
>0000033840f08482: e548a0580000 mvghi 88(%r10),0
0000033840f08488: eb9ff0a00004 lmg %r9,%r15,160(%r15)
0000033840f0848e: 07fe bcr 15,%r14
0000033840f08490: 47000700 bc 0,1792
0000033840f08494: c0e5ffffeb86 brasl %r14,0000033840f05ba0
[ 535.525367] Call Trace:
[ 535.525372] [<0000033840f08482>] put_prev_entity+0x102/0x170
[ 535.525377] ([<0000033840f08474>] put_prev_entity+0xf4/0x170)
[ 535.525381] [<0000033840f0852a>] put_prev_task_fair+0x3a/0x60
[ 535.525385] [<0000033840ef6e58>] pick_next_task+0x138/0xbd0
[ 535.525388] [<0000033841d364d0>] __schedule+0x180/0x850
[ 535.525391] [<0000033841d36bdc>] schedule+0x3c/0xc0
[ 535.525394] [<0000033841d32d10>] irqentry_exit+0x1c0/0x610
[ 535.525397] [<0000033841d32846>] do_ext_irq+0xe6/0x290
[ 535.525399] [<0000033841d41576>] ext_int_handler+0xc6/0xf0
[ 535.525402] Last Breaking-Event-Address:
[ 535.525403] [<0000033840eff36a>] propagate_entity_load_avg+0x3a/0x490
[ 535.525407] ---[ end trace 0000000000000000 ]---
[ 535.525422] ------------[ cut here ]------------
[ 535.525424] WARNING: kernel/sched/fair.c:7022 at hrtick_start_fair+0x6c/0x80, CPU#26: swapper/26/0
[ 535.525428] Modules linked in: mptcp_diag xfrm_user xfrm_algo tcp_diag crypto_user inet_diag netlink_diag algif_hash af_alg dm_service_time nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables zfcp scsi_transport_fc s390_trng eadm_sch vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel drm i2c_core loop dm_multipath drm_panel_orientation_quirks nfnetlink uvdevice diag288_wdt prng aes_s390 scsi_dh_rdac scsi_dh_emc scsi_dh_alua phmac_s390 paes_s390 crypto_engine pkey_cca pkey_ep11 zcrypt rng_core pkey_pckmo pkey autofs4 ecdsa_generic ecc sha512 [last unloaded: trace_printk]
[ 535.525465] CPU: 26 UID: 0 PID: 0 Comm: swapper/26 Tainted: G W 7.1.0-20260426.rc0.git0.897d54018cc9.300.fc43.s390x+git #1 PREEMPTLAZY
[ 535.525469] Tainted: [W]=WARN
[ 535.525471] Hardware name: IBM 8561 T01 703 (LPAR)
[ 535.525472] Krnl PSW : 0404e00180000000 0000033840ef9ac0 (hrtick_start_fair+0x70/0x80)
[ 535.525476] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 535.525479] Krnl GPRS: 0000007c00000001 00000003e5533800 00000003e5511800 00000001d3b8c800
[ 535.525482] 00000000000000d8 00000338429cd038 0000033840f19860 00000003e5511800
[ 535.525484] 00000001d3b8c800 000000000000003f 00000003e55127b0 00000001d3b8c948
[ 535.525486] 0000000000000000 0000000081992400 0000033840f0876a 000002b840e5bc48
[ 535.525492] Krnl Code: 0000033840ef9ab2: 47000700 bc 0,1792
0000033840ef9ab6: c0f4ffffd4dd brcl 15,0000033840ef4470
*0000033840ef9abc: af000000 mc 0,0
>0000033840ef9ac0: a7f4ffdc brc 15,0000033840ef9a78
0000033840ef9ac4: 0707 bcr 0,%r7
0000033840ef9ac6: 0707 bcr 0,%r7
0000033840ef9ac8: 0707 bcr 0,%r7
0000033840ef9aca: 0707 bcr 0,%r7
[ 535.525505] Call Trace:
[ 535.525507] [<0000033840ef9ac0>] hrtick_start_fair+0x70/0x80
[ 535.525509] ([<0000033840f086fe>] set_next_task_fair+0x4e/0x230)
[ 535.525513] [<0000033840ef6e76>] pick_next_task+0x156/0xbd0
[ 535.525515] [<0000033841d364d0>] __schedule+0x180/0x850
[ 535.525518] [<0000033841d36d16>] schedule_idle+0x36/0x60
[ 535.525520] [<0000033840f141ce>] do_idle+0x11e/0x160
[ 535.525523] [<0000033840f143e0>] cpu_startup_entry+0x40/0x50
[ 535.525526] [<0000033840e6f7c8>] smp_start_secondary+0x138/0x150
[ 535.525529] [<0000033841d40f72>] restart_int_handler+0x72/0x88
[ 535.525532] Last Breaking-Event-Address:
[ 535.525533] [<0000033840ef9a72>] hrtick_start_fair+0x22/0x80
[ 535.525536] ---[ end trace 0000000000000000 ]---
[ 535.525549] ------------[ cut here ]------------
[ 535.525550] WARNING: kernel/sched/sched.h:1769 at set_next_entity+0x35a/0x370, CPU#26: swapper/26/0
[ 535.525555] Modules linked in: mptcp_diag xfrm_user xfrm_algo tcp_diag crypto_user inet_diag netlink_diag algif_hash af_alg dm_service_time nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables zfcp scsi_transport_fc s390_trng eadm_sch vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel drm i2c_core loop dm_multipath drm_panel_orientation_quirks nfnetlink uvdevice diag288_wdt prng aes_s390 scsi_dh_rdac scsi_dh_emc scsi_dh_alua phmac_s390 paes_s390 crypto_engine pkey_cca pkey_ep11 zcrypt rng_core pkey_pckmo pkey autofs4 ecdsa_generic ecc sha512 [last unloaded: trace_printk]
[ 535.525588] CPU: 26 UID: 0 PID: 0 Comm: swapper/26 Tainted: G W 7.1.0-20260426.rc0.git0.897d54018cc9.300.fc43.s390x+git #1 PREEMPTLAZY
[ 535.525591] Tainted: [W]=WARN
[ 535.525593] Hardware name: IBM 8561 T01 703 (LPAR)
[ 535.525594] Krnl PSW : 0404d00180000000 0000033840f0395e (set_next_entity+0x35e/0x370)
[ 535.525598] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[ 535.525601] Krnl GPRS: 0000007caeee0479 07cafcf56a100000 0000000000000000 0000000000000000
[ 535.525604] 00000003e5445800 0000000000e15d28 0000033840f19860 00000003e5511800
[ 535.525606] 0000000000000001 0000000000000001 000000026ad0f400 00000001d3b8c900
[ 535.525608] 0000000000000000 0000000081992400 0000033840f036f6 000002b840e5bc00
[ 535.525613] Krnl Code: 0000033840f03952: b9e93090 sgrk %r9,%r0,%r3
0000033840f03956: a7f4fee7 brc 15,0000033840f03724
*0000033840f0395a: af000000 mc 0,0
>0000033840f0395e: a7f4fedb brc 15,0000033840f03714
0000033840f03962: 0707 bcr 0,%r7
0000033840f03964: 0707 bcr 0,%r7
0000033840f03966: 0707 bcr 0,%r7
0000033840f03968: 0707 bcr 0,%r7
[ 535.525626] Call Trace:
[ 535.525627] [<0000033840f0395e>] set_next_entity+0x35e/0x370
[ 535.525630] ([<0000033840f036f6>] set_next_entity+0xf6/0x370)
[ 535.525633] [<0000033840f086fe>] set_next_task_fair+0x4e/0x230
[ 535.525636] [<0000033840ef6e76>] pick_next_task+0x156/0xbd0
[ 535.525639] [<0000033841d364d0>] __schedule+0x180/0x850
[ 535.525642] [<0000033841d36d16>] schedule_idle+0x36/0x60
[ 535.525645] [<0000033840f141ce>] do_idle+0x11e/0x160
[ 535.525647] [<0000033840f143e0>] cpu_startup_entry+0x40/0x50
[ 535.525650] [<0000033840e6f7c8>] smp_start_secondary+0x138/0x150
[ 535.525652] [<0000033841d40f72>] restart_int_handler+0x72/0x88
[ 535.525655] Last Breaking-Event-Address:
[ 535.525656] [<0000033840f03710>] set_next_entity+0x110/0x370
[ 535.525659] ---[ end trace 0000000000000000 ]---
[ 535.528235] ------------[ cut here ]------------
[ 535.528241] WARNING: kernel/sched/fair.c:5721 at set_next_entity+0x2c8/0x370, CPU#20: swapper/20/0
[ 535.528248] Modules linked in: mptcp_diag xfrm_user xfrm_algo tcp_diag crypto_user inet_diag netlink_diag algif_hash af_alg dm_service_time nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables zfcp scsi_transport_fc s390_trng eadm_sch vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel drm i2c_core loop dm_multipath drm_panel_orientation_quirks nfnetlink uvdevice diag288_wdt prng aes_s390 scsi_dh_rdac scsi_dh_emc scsi_dh_alua phmac_s390 paes_s390 crypto_engine pkey_cca pkey_ep11 zcrypt rng_core pkey_pckmo pkey autofs4 ecdsa_generic ecc sha512 [last unloaded: trace_printk]
[ 535.528291] CPU: 20 UID: 0 PID: 0 Comm: swapper/20 Tainted: G W 7.1.0-20260426.rc0.git0.897d54018cc9.300.fc43.s390x+git #1 PREEMPTLAZY
[ 535.528295] Tainted: [W]=WARN
[ 535.528296] Hardware name: IBM 8561 T01 703 (LPAR)
[ 535.528297] Krnl PSW : 0404e00180000000 0000033840f038cc (set_next_entity+0x2cc/0x370)
[ 535.528302] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[ 535.528305] Krnl GPRS: 0000007caff854b2 00000001d3b8c900 0000000082cec900 00000003e5445800
[ 535.528308] 000000000000002e 00000000000f423f 0000033840f19860 00000003e5445800
[ 535.528310] 00000001d3b8c900 000000804f18b648 000000026ad0f400 00000001d3b8c900
[ 535.528312] 0000000000000000 000000008198c800 0000033840f03820 000002b840e2bc00
[ 535.528319] Krnl Code: 0000033840f038be: e310a0580002 ltg %r1,88(%r10)
0000033840f038c4: a784fece brc 8,0000033840f03660
*0000033840f038c8: af000000 mc 0,0
>0000033840f038cc: a7f4feca brc 15,0000033840f03660
0000033840f038d0: e310b4000002 ltg %r1,1024(%r11)
0000033840f038d6: a784ff09 brc 8,0000033840f036e8
0000033840f038da: 4140b400 la %r4,1024(%r11)
0000033840f038de: e330bf00ff71 lay %r3,-256(%r11)
[ 535.528334] Call Trace:
[ 535.528336] [<0000033840f038cc>] set_next_entity+0x2cc/0x370
[ 535.528339] ([<0000033840f037ac>] set_next_entity+0x1ac/0x370)
[ 535.528342] [<0000033840f086fe>] set_next_task_fair+0x4e/0x230
[ 535.528345] [<0000033840ef6e76>] pick_next_task+0x156/0xbd0
[ 535.528348] [<0000033841d364d0>] __schedule+0x180/0x850
[ 535.528351] [<0000033841d36d16>] schedule_idle+0x36/0x60
[ 535.528354] [<0000033840f141ce>] do_idle+0x11e/0x160
[ 535.528356] [<0000033840f143e0>] cpu_startup_entry+0x40/0x50
[ 535.528359] [<0000033840e6f7c8>] smp_start_secondary+0x138/0x150
[ 535.528362] [<0000033841d40f72>] restart_int_handler+0x72/0x88
[ 535.528365] Last Breaking-Event-Address:
[ 535.528366] [<0000033840f0365c>] set_next_entity+0x5c/0x370
[ 535.528369] ---[ end trace 0000000000000000 ]---
[ 595.527130] rcu: INFO: rcu_sched self-detected stall on CPU
[ 595.527142] rcu: 7-...!: (47526 ticks this GP) idle=8f04/1/0x4000000000000000 softirq=56080/56080 fqs=0
[ 595.527175] rcu: (t=60000 jiffies g=232693 q=6273 ncpus=32)
[ 595.527177] rcu: rcu_sched kthread timer wakeup didn't happen for 59996 jiffies! g232693 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 595.527180] rcu: Possible timer handling issue on cpu=28 timer-softirq=1957
[ 595.527182] rcu: rcu_sched kthread starved for 60000 jiffies! g232693 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=28
[ 595.527184] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 595.527186] rcu: RCU grace-period kthread stack dump:
Bisect pointed at commit e0ca8991b2de6 ("sched: Make class_schedulers
avoid pushing current, and get rid of proxy_tag_curr()") from the
sched-core-2026-04-13 pull. But that commit only seems to make the
corruption easier to trigger; reverting it doesn't help. Applying "Proxy
Execution fixes for v7.1-rc" [1] on top didn't help, nor did "Optimized
Donor Migration for Proxy Execution" [2].
The problem goes away when sched_proxy_exec=0 or nosmt is used.
Adding some debug code in deactivate_task() showed try_steal_cookie()
calling it on src->donor right before the warning. try_steal_cookie()
skips src->core_pick and src->curr but not src->donor.
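To make the gap concrete, here is a small user-space model of the
candidate filter. It is only a sketch, not the patch itself: struct task,
struct rq and both may_steal_*() helpers are hypothetical reductions, and
the "after" variant merely assumes the fix takes the form of one extra
comparison against src->donor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct task { int pid; };

struct rq {
	struct task *curr;      /* task whose execution context is on the CPU */
	struct task *donor;     /* task whose scheduling context is in use */
	struct task *core_pick; /* task chosen by the core-wide pick, if any */
};

/* Filter as described above: core_pick and curr are skipped, donor is not. */
static bool may_steal_before(const struct rq *src, const struct task *p)
{
	return p != src->core_pick && p != src->curr;
}

/* Assumed shape of the fix: additionally refuse to steal the donor. */
static bool may_steal_after(const struct rq *src, const struct task *p)
{
	return p != src->core_pick && p != src->curr && p != src->donor;
}

/* Proxy-exec case: curr (the lock owner) runs on behalf of a blocked donor. */
static void run_example(void)
{
	struct task owner = { 1 }, donor = { 2 }, other = { 3 };
	struct rq src = { .curr = &owner, .donor = &donor, .core_pick = NULL };

	assert(may_steal_before(&src, &donor));  /* donor slips through */
	assert(!may_steal_after(&src, &donor));  /* donor is now skipped */
	assert(!may_steal_after(&src, &owner));  /* curr stays protected */
	assert(may_steal_after(&src, &other));   /* other tasks remain movable */
}
```

Without proxy execution curr and donor are the same task, so both
variants of the filter agree; they only differ once the two diverge.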
[1] https://lore.kernel.org/all/20260427183848.698551-1-jstultz@google.com/
[2] https://lore.kernel.org/all/20260422230659.903191-1-jstultz@google.com/
The following patch resolves the issue in my tests. Please consider it
if it makes sense.
Vasily Gorbik (1):
sched/core: Don't steal a proxy-exec donor
kernel/sched/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--
2.53.0
Thread overview: 4+ messages
2026-05-04 12:31 Vasily Gorbik [this message]
2026-05-04 12:31 ` [PATCH 1/1] sched/core: Don't steal a proxy-exec donor Vasily Gorbik
2026-05-04 13:19 ` K Prateek Nayak
2026-05-05 10:03 ` Vasily Gorbik