From: Juri Lelli <juri.lelli@redhat.com>
To: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Clark Williams <williams@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>
Subject: [RT BUG] isolcpus causes sleeping function called from invalid context (4.19.59-rt24)
Date: Mon, 5 Aug 2019 12:06:46 +0200
Message-ID: <20190805100646.GH14724@localhost.localdomain>
Hi,
Booting 4.19.59-rt24 with debug options enabled (DEBUG_ATOMIC_SLEEP), I
noticed the following splat (edited for clarity):
--->8---
Linux version 4.19.59-rt24 (...) (...) #2 SMP PREEMPT RT Mon Aug 5 05:23:26 EDT 2019
Command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.19.59-rt24 ... skew_tick=1 isolcpus=9-35 intel_pstate=disable nosoftlockup nohz=on nohz_full=9-35 rcu_nocbs=9-35
...
smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (family: 0x6, model: 0x3f, stepping: 0x2)
...
smp: Bringing up secondary CPUs ...
x86: Booting SMP configuration:
.... node #0, CPUs: #1
#2
#3
#4
#5
#6
#7
#8
.... node #1, CPUs: #9
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:967
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/9
1 lock held by swapper/9/0:
#0: 00000000eb141bd9 ((pendingb_lock).lock){+.+.}, at: queue_delayed_work_on+0x103/0x580
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] (null)
hardirqs last disabled at (0): [<ffffffffadda7bf8>] copy_process.part.18+0x1928/0x7250
softirqs last enabled at (0): [<ffffffffadda7c94>] copy_process.part.18+0x19c4/0x7250
softirqs last disabled at (0): [<0000000000000000>] (null)
Preemption disabled at:
[<ffffffffadcbdaef>] start_secondary+0x11f/0x530
CPU: 9 PID: 0 Comm: swapper/9 Not tainted 4.19.59-rt24 #2
Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/17/2018
Call Trace:
dump_stack+0x9a/0xf0
___might_sleep.cold.57+0x6a/0x7c
rt_spin_lock+0x75/0x90
? queue_delayed_work_on+0x103/0x580
queue_delayed_work_on+0x103/0x580
sched_cpu_starting+0x28c/0x370
? sched_cpu_deactivate+0x2b0/0x2b0
cpuhp_invoke_callback+0x1d0/0x1dc0
? _raw_spin_unlock_irqrestore+0x4a/0xf0
notify_cpu_starting+0x117/0x190
start_secondary+0x2be/0x530
? set_cpu_sibling_map+0x27c0/0x27c0
secondary_startup_64+0xb6/0xc0
#10
#11
#12
#13
#14
#15
#16
#17
.... node #0, CPUs: #18
#19
#20
#21
#22
#23
#24
#25
#26
.... node #1, CPUs: #27
#28
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:967
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/28
1 lock held by swapper/28/0:
#0: 0000000004707c3e ((pendingb_lock).lock){+.+.}, at: queue_delayed_work_on+0x103/0x580
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] (null)
hardirqs last disabled at (0): [<ffffffffadda7bf8>] copy_process.part.18+0x1928/0x7250
softirqs last enabled at (0): [<ffffffffadda7c94>] copy_process.part.18+0x19c4/0x7250
softirqs last disabled at (0): [<0000000000000000>] (null)
Preemption disabled at:
[<ffffffffadcbdaef>] start_secondary+0x11f/0x530
CPU: 28 PID: 0 Comm: swapper/28 Tainted: G W 4.19.59-rt24 #2
Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/17/2018
Call Trace:
dump_stack+0x9a/0xf0
___might_sleep.cold.57+0x6a/0x7c
rt_spin_lock+0x75/0x90
? queue_delayed_work_on+0x103/0x580
queue_delayed_work_on+0x103/0x580
sched_cpu_starting+0x28c/0x370
? sched_cpu_deactivate+0x2b0/0x2b0
cpuhp_invoke_callback+0x1d0/0x1dc0
? _raw_spin_unlock_irqrestore+0x4a/0xf0
notify_cpu_starting+0x117/0x190
start_secondary+0x2be/0x530
? set_cpu_sibling_map+0x27c0/0x27c0
secondary_startup_64+0xb6/0xc0
#29
#30
#31
#32
#33
#35
smp: Brought up 2 nodes, 36 CPUs
smpboot: Max logical packages: 2
smpboot: Total of 36 processors activated (166451.61 BogoMIPS)
--->8---
This only happens if isolcpus= is set on the kernel command line at boot.
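For anyone wanting to check their own boot log against this, here is a
minimal Python sketch (not part of the original report) that pulls the
isolcpus= range from the command line and the CPU numbers from the
"in_atomic()" lines, then checks whether the affected CPUs are isolated.
The helper names and SAMPLE_LOG are made up for illustration, and the
parser assumes a plain numeric CPU list; any non-numeric flag words in
the isolcpus= value are simply skipped.

```python
import re

# Excerpt of the log above, trimmed to the lines the helpers look at.
SAMPLE_LOG = """\
Command line: ... skew_tick=1 isolcpus=9-35 intel_pstate=disable nosoftlockup nohz=on nohz_full=9-35 rcu_nocbs=9-35
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:967
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/9
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:967
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/28
"""


def parse_cpu_list(spec):
    """Expand a CPU list such as "9-35" or "1,3-5" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        # Assumption: anything that is not N or N-M (e.g. a flag word) is skipped.
        if not re.fullmatch(r"\d+(-\d+)?", part):
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def isolated_cpus(log):
    """Return the CPUs listed in isolcpus= on the command line, if any."""
    m = re.search(r"\bisolcpus=(\S+)", log)
    return parse_cpu_list(m.group(1)) if m else set()


def splat_cpus(log):
    """Return the CPUs whose idle task (swapper/N) hit the might_sleep check."""
    pattern = r"in_atomic\(\): \d+, irqs_disabled\(\): \d+, pid: \d+, name: swapper/(\d+)"
    return [int(n) for n in re.findall(pattern, log)]


if __name__ == "__main__":
    isolated = isolated_cpus(SAMPLE_LOG)
    for cpu in splat_cpus(SAMPLE_LOG):
        state = "isolated" if cpu in isolated else "not isolated"
        print(f"CPU {cpu}: {state}")
```

In the log above, both affected CPUs (9 and 28) fall inside the 9-35
range. The sketch only confirms that correlation; it says nothing about
why the remaining isolated CPUs did not print a splat.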
AFAIU, RT is reworking workqueues, so 5.x-rt shouldn't suffer from this.
In fact, I verified that backporting the workqueue rework all-in change
from 5.0-rt [1] fixes this problem.
I'm thus wondering whether there is any plan to backport the rework to
4.19-rt stable, whether that patch has dependencies, or whether an
alternative fix might be found for this problem.
Thanks,
Juri
[1] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?h=v5.0.19-rt11&id=d15a862f24df983458533aebd6fa207ecdd1095a