All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Qian Cai <cai@lca.pw>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: Endless soft-lockups for compiling workload since next-20200519
Date: Wed, 20 May 2020 14:50:56 +0200	[thread overview]
Message-ID: <20200520125056.GC325280@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <CAG=TAF6jUsQrW-fjbS3vpjkMfn8=MUDsuQxjk3NMfvQa250RHA@mail.gmail.com>

On Tue, May 19, 2020 at 11:58:17PM -0400, Qian Cai wrote:
> Just a head up. Repeatedly compiling kernels for a while would trigger
> endless soft-lockups since next-20200519 on both x86_64 and powerpc.
> .config are in,

Could be 90b5363acd47 ("sched: Clean up scheduler_ipi()"), although I've
not seen anything like that myself. Let me go have a look.


In as far as the logs are readable (they're a wrapped mess, please don't
do that!), they contain very little useful, as is typical with IPIs :/

> [ 1167.993773][    C1] WARNING: CPU: 1 PID: 0 at kernel/smp.c:127
> flush_smp_call_function_queue+0x1fa/0x2e0
> [ 1168.003333][    C1] Modules linked in: nls_iso8859_1 nls_cp437 vfat
> fat kvm_amd ses kvm enclosure dax_pmem irqbypass dax_pmem_core efivars
> acpi_cpufreq efivarfs ip_tables x_tables xfs sd_mod smartpqi
> scsi_transport_sas tg3 mlx5_core libphy firmware_class dm_mirror
> dm_region_hash dm_log dm_mod
> [ 1168.029492][    C1] CPU: 1 PID: 0 Comm: swapper/1 Not tainted
> 5.7.0-rc6-next-20200519 #1
> [ 1168.037665][    C1] Hardware name: HPE ProLiant DL385
> Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> [ 1168.046978][    C1] RIP: 0010:flush_smp_call_function_queue+0x1fa/0x2e0
> [ 1168.053658][    C1] Code: 01 0f 87 c9 12 00 00 83 e3 01 0f 85 cc fe
> ff ff 48 c7 c7 c0 55 a9 8f c6 05 f6 86 cd 01 01 e8 de 09 ea ff 0f 0b
> e9 b2 fe ff ff <0f> 0b e9 52 ff ff ff 0f 0b e9 f2 fe ff ff 65 44 8b 25
> 10 52 3f 71
> [ 1168.073262][    C1] RSP: 0018:ffffc90000178918 EFLAGS: 00010046
> [ 1168.079253][    C1] RAX: 0000000000000000 RBX: ffff8888430c58f8
> RCX: ffffffff8ec26083
> [ 1168.087156][    C1] RDX: 0000000000000003 RSI: dffffc0000000000
> RDI: ffff8888430c58f8
> [ 1168.095054][    C1] RBP: ffffc900001789a8 R08: ffffed1108618cec
> R09: ffffed1108618cec
> [ 1168.102964][    C1] R10: ffff8888430c675b R11: 0000000000000000
> R12: ffff8888430c58e0
> [ 1168.110866][    C1] R13: ffffffff8eb30c40 R14: ffff8888430c5880
> R15: ffff8888430c58e0
> [ 1168.118767][    C1] FS:  0000000000000000(0000)
> GS:ffff888843080000(0000) knlGS:0000000000000000
> [ 1168.127628][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1168.134129][    C1] CR2: 000055b169604560 CR3: 0000000d08a14000
> CR4: 00000000003406e0
> [ 1168.142026][    C1] Call Trace:
> [ 1168.145206][    C1]  <IRQ>
> [ 1168.147957][    C1]  ? smp_call_on_cpu_callback+0xd0/0xd0
> [ 1168.153421][    C1]  ? rcu_read_lock_sched_held+0xac/0xe0
> [ 1168.158880][    C1]  ? rcu_read_lock_bh_held+0xc0/0xc0
> [ 1168.164076][    C1]  generic_smp_call_function_single_interrupt+0x13/0x2b
> [ 1168.170938][    C1]  smp_call_function_single_interrupt+0x157/0x4e0
> [ 1168.177278][    C1]  ? smp_call_function_interrupt+0x4e0/0x4e0
> [ 1168.183172][    C1]  ? interrupt_entry+0xe4/0xf0
> [ 1168.187846][    C1]  ? trace_hardirqs_off_caller+0x8d/0x1f0
> [ 1168.193478][    C1]  ? trace_hardirqs_on_caller+0x1f0/0x1f0
> [ 1168.199116][    C1]  ? _nohz_idle_balance+0x221/0x360
> [ 1168.204228][    C1]  ? trace_hardirqs_off_thunk+0x1a/0x1c
> [ 1168.209690][    C1]  call_function_single_interrupt+0xf/0x20

WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Qian Cai <cai@lca.pw>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Frederic Weisbecker <frederic@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: Endless soft-lockups for compiling workload since next-20200519
Date: Wed, 20 May 2020 14:50:56 +0200	[thread overview]
Message-ID: <20200520125056.GC325280@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <CAG=TAF6jUsQrW-fjbS3vpjkMfn8=MUDsuQxjk3NMfvQa250RHA@mail.gmail.com>

On Tue, May 19, 2020 at 11:58:17PM -0400, Qian Cai wrote:
> Just a head up. Repeatedly compiling kernels for a while would trigger
> endless soft-lockups since next-20200519 on both x86_64 and powerpc.
> .config are in,

Could be 90b5363acd47 ("sched: Clean up scheduler_ipi()"), although I've
not seen anything like that myself. Let me go have a look.


In as far as the logs are readable (they're a wrapped mess, please don't
do that!), they contain very little useful, as is typical with IPIs :/

> [ 1167.993773][    C1] WARNING: CPU: 1 PID: 0 at kernel/smp.c:127
> flush_smp_call_function_queue+0x1fa/0x2e0
> [ 1168.003333][    C1] Modules linked in: nls_iso8859_1 nls_cp437 vfat
> fat kvm_amd ses kvm enclosure dax_pmem irqbypass dax_pmem_core efivars
> acpi_cpufreq efivarfs ip_tables x_tables xfs sd_mod smartpqi
> scsi_transport_sas tg3 mlx5_core libphy firmware_class dm_mirror
> dm_region_hash dm_log dm_mod
> [ 1168.029492][    C1] CPU: 1 PID: 0 Comm: swapper/1 Not tainted
> 5.7.0-rc6-next-20200519 #1
> [ 1168.037665][    C1] Hardware name: HPE ProLiant DL385
> Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
> [ 1168.046978][    C1] RIP: 0010:flush_smp_call_function_queue+0x1fa/0x2e0
> [ 1168.053658][    C1] Code: 01 0f 87 c9 12 00 00 83 e3 01 0f 85 cc fe
> ff ff 48 c7 c7 c0 55 a9 8f c6 05 f6 86 cd 01 01 e8 de 09 ea ff 0f 0b
> e9 b2 fe ff ff <0f> 0b e9 52 ff ff ff 0f 0b e9 f2 fe ff ff 65 44 8b 25
> 10 52 3f 71
> [ 1168.073262][    C1] RSP: 0018:ffffc90000178918 EFLAGS: 00010046
> [ 1168.079253][    C1] RAX: 0000000000000000 RBX: ffff8888430c58f8
> RCX: ffffffff8ec26083
> [ 1168.087156][    C1] RDX: 0000000000000003 RSI: dffffc0000000000
> RDI: ffff8888430c58f8
> [ 1168.095054][    C1] RBP: ffffc900001789a8 R08: ffffed1108618cec
> R09: ffffed1108618cec
> [ 1168.102964][    C1] R10: ffff8888430c675b R11: 0000000000000000
> R12: ffff8888430c58e0
> [ 1168.110866][    C1] R13: ffffffff8eb30c40 R14: ffff8888430c5880
> R15: ffff8888430c58e0
> [ 1168.118767][    C1] FS:  0000000000000000(0000)
> GS:ffff888843080000(0000) knlGS:0000000000000000
> [ 1168.127628][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1168.134129][    C1] CR2: 000055b169604560 CR3: 0000000d08a14000
> CR4: 00000000003406e0
> [ 1168.142026][    C1] Call Trace:
> [ 1168.145206][    C1]  <IRQ>
> [ 1168.147957][    C1]  ? smp_call_on_cpu_callback+0xd0/0xd0
> [ 1168.153421][    C1]  ? rcu_read_lock_sched_held+0xac/0xe0
> [ 1168.158880][    C1]  ? rcu_read_lock_bh_held+0xc0/0xc0
> [ 1168.164076][    C1]  generic_smp_call_function_single_interrupt+0x13/0x2b
> [ 1168.170938][    C1]  smp_call_function_single_interrupt+0x157/0x4e0
> [ 1168.177278][    C1]  ? smp_call_function_interrupt+0x4e0/0x4e0
> [ 1168.183172][    C1]  ? interrupt_entry+0xe4/0xf0
> [ 1168.187846][    C1]  ? trace_hardirqs_off_caller+0x8d/0x1f0
> [ 1168.193478][    C1]  ? trace_hardirqs_on_caller+0x1f0/0x1f0
> [ 1168.199116][    C1]  ? _nohz_idle_balance+0x221/0x360
> [ 1168.204228][    C1]  ? trace_hardirqs_off_thunk+0x1a/0x1c
> [ 1168.209690][    C1]  call_function_single_interrupt+0xf/0x20

  reply	other threads:[~2020-05-20 12:58 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-20  3:58 Endless soft-lockups for compiling workload since next-20200519 Qian Cai
2020-05-20  3:58 ` Qian Cai
2020-05-20 12:50 ` Peter Zijlstra [this message]
2020-05-20 12:50   ` Peter Zijlstra
2020-05-20 14:06   ` Qian Cai
2020-05-20 14:06     ` Qian Cai
2020-05-21  0:40   ` Frederic Weisbecker
2020-05-21  0:40     ` Frederic Weisbecker
2020-05-21  9:39     ` Peter Zijlstra
2020-05-21  9:39       ` Peter Zijlstra
2020-05-21 10:49       ` Peter Zijlstra
2020-05-21 10:49         ` Peter Zijlstra
2020-05-21 11:00         ` Peter Zijlstra
2020-05-21 11:00           ` Peter Zijlstra
2020-05-21 12:41           ` Frederic Weisbecker
2020-05-21 12:41             ` Frederic Weisbecker
2020-05-25 13:21             ` Peter Zijlstra
2020-05-25 13:21               ` Peter Zijlstra
2020-05-25 14:05               ` Frederic Weisbecker
2020-05-25 14:05                 ` Frederic Weisbecker
2020-05-25 14:38                 ` Peter Zijlstra
2020-05-25 14:38                   ` Peter Zijlstra
2020-05-25 14:17               ` Peter Zijlstra
2020-05-25 14:17                 ` Peter Zijlstra
2020-05-22  2:00       ` Qian Cai
2020-05-22  2:00         ` Qian Cai
2020-05-21 10:10     ` Peter Zijlstra
2020-05-21 10:10       ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200520125056.GC325280@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=bp@alien8.de \
    --cc=cai@lca.pw \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulmck@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.