From: Juri Lelli <juri.lelli@redhat.com>
To: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Clark Williams <williams@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: [RT BUG] isolcpus causes sleeping function called from invalid context (4.19.59-rt24)
Date: Mon, 5 Aug 2019 12:06:46 +0200
Message-ID: <20190805100646.GH14724@localhost.localdomain>

Hi,

Booting 4.19.59-rt24 with debug options enabled (DEBUG_ATOMIC_SLEEP) I
noticed the following splat (edited for clarity):

--->8---
 Linux version 4.19.59-rt24 (...) (...) #2 SMP PREEMPT RT Mon Aug 5 05:23:26 EDT 2019
 Command line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.19.59-rt24 ... skew_tick=1 isolcpus=9-35 intel_pstate=disable nosoftlockup nohz=on nohz_full=9-35 rcu_nocbs=9-35
 ...
 smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (family: 0x6, model: 0x3f, stepping: 0x2)
 ...
 smp: Bringing up secondary CPUs ...
 x86: Booting SMP configuration:
 .... node  #0, CPUs:        #1
   #2
   #3
   #4
   #5
   #6
   #7
   #8
 .... node  #1, CPUs:    #9
 BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:967
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/9
 1 lock held by swapper/9/0:
  #0: 00000000eb141bd9 ((pendingb_lock).lock){+.+.}, at: queue_delayed_work_on+0x103/0x580
 irq event stamp: 0
 hardirqs last  enabled at (0): [<0000000000000000>]           (null)
 hardirqs last disabled at (0): [<ffffffffadda7bf8>] copy_process.part.18+0x1928/0x7250
 softirqs last  enabled at (0): [<ffffffffadda7c94>] copy_process.part.18+0x19c4/0x7250
 softirqs last disabled at (0): [<0000000000000000>]           (null)
 Preemption disabled at:
 [<ffffffffadcbdaef>] start_secondary+0x11f/0x530
 CPU: 9 PID: 0 Comm: swapper/9 Not tainted 4.19.59-rt24 #2
 Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/17/2018
 Call Trace:
  dump_stack+0x9a/0xf0
  ___might_sleep.cold.57+0x6a/0x7c
  rt_spin_lock+0x75/0x90
  ? queue_delayed_work_on+0x103/0x580
  queue_delayed_work_on+0x103/0x580
  sched_cpu_starting+0x28c/0x370
  ? sched_cpu_deactivate+0x2b0/0x2b0
  cpuhp_invoke_callback+0x1d0/0x1dc0
  ? _raw_spin_unlock_irqrestore+0x4a/0xf0
  notify_cpu_starting+0x117/0x190
  start_secondary+0x2be/0x530
  ? set_cpu_sibling_map+0x27c0/0x27c0
  secondary_startup_64+0xb6/0xc0
  #10
  #11
  #12
  #13
  #14
  #15
  #16
  #17
 .... node  #0, CPUs:   #18
  #19
  #20
  #21
  #22
  #23
  #24
  #25
  #26
 .... node  #1, CPUs:   #27
  #28
 BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:967
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/28
 1 lock held by swapper/28/0:
  #0: 0000000004707c3e ((pendingb_lock).lock){+.+.}, at: queue_delayed_work_on+0x103/0x580
 irq event stamp: 0
 hardirqs last  enabled at (0): [<0000000000000000>]           (null)
 hardirqs last disabled at (0): [<ffffffffadda7bf8>] copy_process.part.18+0x1928/0x7250
 softirqs last  enabled at (0): [<ffffffffadda7c94>] copy_process.part.18+0x19c4/0x7250
 softirqs last disabled at (0): [<0000000000000000>]           (null)
 Preemption disabled at:
 [<ffffffffadcbdaef>] start_secondary+0x11f/0x530
 CPU: 28 PID: 0 Comm: swapper/28 Tainted: G        W         4.19.59-rt24 #2
 Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/17/2018
 Call Trace:
  dump_stack+0x9a/0xf0
  ___might_sleep.cold.57+0x6a/0x7c
  rt_spin_lock+0x75/0x90
  ? queue_delayed_work_on+0x103/0x580
  queue_delayed_work_on+0x103/0x580
  sched_cpu_starting+0x28c/0x370
  ? sched_cpu_deactivate+0x2b0/0x2b0
  cpuhp_invoke_callback+0x1d0/0x1dc0
  ? _raw_spin_unlock_irqrestore+0x4a/0xf0
  notify_cpu_starting+0x117/0x190
  start_secondary+0x2be/0x530
  ? set_cpu_sibling_map+0x27c0/0x27c0
  secondary_startup_64+0xb6/0xc0
  #29
  #30
  #31
  #32
  #33
  #34
  #35
 smp: Brought up 2 nodes, 36 CPUs
 smpboot: Max logical packages: 2
 smpboot: Total of 36 processors activated (166451.61 BogoMIPS)
--->8---

This only happens if isolcpus is configured at boot.
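
For anyone who wants to check their own boot log for the same pattern, the
following is a minimal Python sketch, not a definitive tool. It pulls the
isolcpus= list from the logged command line, collects the swapper/N tasks
that hit the splat, and reports whether each of those CPUs is in the isolated
set. The function names and the SAMPLE text are made up for illustration, and
the parser assumes a plain CPU list with ranges; any other token (a flag
name, for instance) is simply ignored.

```python
import re


def parse_cpu_list(spec):
    """Expand a plain CPU list such as "9-35" or "1,3-5" into a set of ints."""
    cpus = set()
    for token in spec.split(","):
        m = re.fullmatch(r"(\d+)(?:-(\d+))?", token.strip())
        if not m:
            continue  # not a plain number or range; ignored in this sketch
        first = int(m.group(1))
        last = int(m.group(2)) if m.group(2) else first
        cpus.update(range(first, last + 1))
    return cpus


def isolated_cpus(log_text):
    """Return the CPUs named by isolcpus= in the logged kernel command line."""
    m = re.search(r"\bisolcpus=(\S+)", log_text)
    return parse_cpu_list(m.group(1)) if m else set()


def splat_cpus(log_text):
    """Return the N of every swapper/N task reported right after the BUG line."""
    pattern = (
        r"BUG: sleeping function called from invalid context[^\n]*\n"
        r"[^\n]*name: swapper/(\d+)"
    )
    return [int(n) for n in re.findall(pattern, log_text)]


# Hypothetical, shortened excerpt in the same shape as the log above.
SAMPLE = """\
 Command line: ... isolcpus=9-35 nohz_full=9-35
 BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:967
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/9
 BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:967
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/28
"""

if __name__ == "__main__":
    iso = isolated_cpus(SAMPLE)
    for cpu in splat_cpus(SAMPLE):
        print(cpu, "isolated" if cpu in iso else "not isolated")
```

To run it against a real log, pass the contents of a saved dmesg file to
isolated_cpus() and splat_cpus() in place of SAMPLE.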

AFAIU, RT is reworking workqueues and 5.x-rt shouldn't suffer from this.
As a matter of fact, I could verify that backporting the workqueue
rework all-in change from 5.0-rt [1] fixes this problem.

I'm thus wondering if there is any plan on backporting the rework to
4.19-rt stable, and if that patch has dependencies, or if any alternative
fix might be found for this problem.

Thanks,

Juri

1 - https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/commit/?h=v5.0.19-rt11&id=d15a862f24df983458533aebd6fa207ecdd1095a
