From: "Paul E. McKenney" <paulmck@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, tglx@linutronix.de,
	linux-kernel@vger.kernel.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	frederic@kernel.org
Subject: Re: [PATCH 0/6] sched: TTWU, IPI, and assorted stuff
Date: Tue, 16 Jun 2020 10:51:21 -0700
Message-ID: <20200616175121.GD2723@paulmck-ThinkPad-P72>
In-Reply-To: <20200616170410.GL2554@hirez.programming.kicks-ass.net>

On Tue, Jun 16, 2020 at 07:04:10PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 09:11:58PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 15, 2020 at 10:21:49AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jun 15, 2020 at 06:40:48PM +0200, Peter Zijlstra wrote:
> > 
> > > > Thanks! I've got 16*TREE03 running since this morning, so far so nothing :/
> > > > (FWIW that's 16/9 times overcommit, idle time fluctuates around 10%).
> > > 
> > > > My large system has large remote memory latencies, as in the better part
> > > of a microsecond.  My small system is old (Haswell).  So, just to grasp
> > > at the obvious straw, do you have access to a multisocket Haswell system?
> > 
> > I've been running this on a 4 socket haswell ex.
> 
> Today, with patch 1 commented out, after ~5h20 I finally managed to trigger a splat.
> 
> Let me go stare at it, see if it wants to yield it sekrets

This splat looks very very familiar, so good show!  ;-)

							Thanx, Paul

> [19324.742887] BUG: kernel NULL pointer dereference, address: 0000000000000150
> [19324.744075] #PF: supervisor read access in kernel mode
> [19324.744919] #PF: error_code(0x0000) - not-present page
> [19324.745786] PGD 0 P4D 0
> [19324.746215] Oops: 0000 [#1] PREEMPT SMP PTI
> [19324.746948] CPU: 10 PID: 76 Comm: ksoftirqd/10 Tainted: G        W         5.8.0-rc1+ #8
> [19324.748080] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
> [19324.749372] RIP: 0010:check_preempt_wakeup+0xad/0x1a0
> [19324.750218] Code: d0 39 d0 7d 2c 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 74 1e 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00 48 39 bb 50 01 00 00 75 e2 48 85 ff 74 dd e8
> [19324.753364] RSP: 0000:ffffb3cb40320f50 EFLAGS: 00010087
> [19324.754255] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffb400bce0
> [19324.755465] RDX: 0000000000000000 RSI: ffff93c1dbed5b00 RDI: ffff93c1df4a8380
> [19324.756682] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff93c1df2e83b0
> [19324.757848] R10: 0000000000000001 R11: 0000000000000335 R12: 0000000000000001
> [19324.758453] smpboot: CPU 11 is now offline
> [19324.759099] R13: ffff93c1dcf48000 R14: ffff93c1df4a8340 R15: ffff93c1df4a8340
> [19324.761167] FS:  0000000000000000(0000) GS:ffff93c1df480000(0000) knlGS:0000000000000000
> [19324.762559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [19324.763527] CR2: 0000000000000150 CR3: 000000001e40a000 CR4: 00000000000006e0
> [19324.764726] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [19324.765929] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [19324.767100] Call Trace:
> [19324.767516]  <IRQ>
> [19324.767875]  check_preempt_curr+0x62/0x90
> [19324.768586]  ttwu_do_wakeup.constprop.0+0xf/0x100
> [19324.769407]  sched_ttwu_pending+0xa9/0xe0
> [19324.770077]  __sysvec_call_function_single+0x28/0xe0
> [19324.770926]  asm_call_on_stack+0x12/0x20
> [19324.771594]  </IRQ>
> [19324.771951]  sysvec_call_function_single+0x94/0xd0
> [19324.772596]  asm_sysvec_call_function_single+0x12/0x20
> [19324.773254] RIP: 0010:_raw_spin_unlock_irqrestore+0x5/0x30
> [19324.774169] Code: e4 49 ff c3 90 c6 07 00 bf 01 00 00 00 e8 23 2d 53 ff 65 8b 05 cc 32 4b 4c 85 c0 74 01 c3 e8 b9 e4 49 ff c3 90 c6 07 00 56 9d <bf> 01 00 00 00 e8 01 2d 53 ff 65 8b 05 aa 32 4b 4c 85 c0 74 01 c3
> [19324.777267] RSP: 0000:ffffb3cb4030bd58 EFLAGS: 00000287
> [19324.777956] RAX: 0000000000000001 RBX: ffff93c1dbed5b00 RCX: ffff93c1dcd63400
> [19324.779015] RDX: 0000000000000000 RSI: 0000000000000287 RDI: ffff93c1dbed6284
> [19324.780067] RBP: 000000000000000a R08: 00001193646cd91c R09: ffff93c1df49c008
> [19324.781192] R10: ffffb3cb4030bdf8 R11: 000000000000032e R12: 0000000000000000
> [19324.782386] R13: 0000000000000287 R14: ffff93c1dbed6284 R15: ffff93c1df2e8340
> [19324.783565]  try_to_wake_up+0x232/0x530
> [19324.784057]  ? trace_raw_output_hrtimer_start+0x70/0x70
> [19324.784977]  call_timer_fn+0x28/0x150
> [19324.785606]  ? trace_raw_output_hrtimer_start+0x70/0x70
> [19324.786486]  run_timer_softirq+0x182/0x250
> [19324.787191]  ? set_next_entity+0x8b/0x1a0
> [19324.787867]  ? _raw_spin_unlock_irq+0xe/0x20
> [19324.788597]  ? finish_task_switch+0x7b/0x230
> [19324.789338]  __do_softirq+0xfc/0x32b
> [19324.789961]  ? smpboot_register_percpu_thread+0xd0/0xd0
> [19324.790904]  run_ksoftirqd+0x21/0x30
> [19324.791510]  smpboot_thread_fn+0x195/0x230
> [19324.792203]  kthread+0x13d/0x160
> [19324.792731]  ? kthread_create_worker_on_cpu+0x60/0x60
> [19324.793576]  ret_from_fork+0x22/0x30
> [19324.794186] Modules linked in:
> [19324.794729] CR2: 0000000000000150
> [19324.795303] ------------[ cut here ]------------
> [19324.795304] WARNING: CPU: 10 PID: 76 at kernel/smp.c:138 __smp_call_single_queue+0x40/0x50
> [19324.795305] Modules linked in:
> [19324.795306] CPU: 10 PID: 76 Comm: ksoftirqd/10 Not tainted 5.8.0-rc1+ #8
> [19324.795307] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
> [19324.795307] RIP: 0010:__smp_call_single_queue+0x40/0x50
> [19324.795308] Code: c2 40 91 02 00 4c 89 e6 4c 89 e7 48 03 14 c5 e0 56 2d b4 e8 b2 3a 2f 00 84 c0 75 04 5d 41 5c c3 89 ef 5d 41 5c e9 40 af f9 ff <0f> 0b eb cd 66 66 2e 0f 1f 84 00 00 00 00 00 90 41 54 49 89 f4 55
> [19324.795309] RSP: 0000:ffffb3cb4030bd18 EFLAGS: 00010046
> [19324.795310] RAX: 000000000000000a RBX: 0000000000000000 RCX: 00000000ffffffff
> [19324.795310] RDX: 00000000000090aa RSI: ffffffffb420bc3f RDI: ffffffffb4232e3e
> [19324.795311] RBP: 000000000000000a R08: 00001193646cd91c R09: ffff93c1df49c008
> [19324.795312] R10: ffffb3cb4030bdf8 R11: 000000000000032e R12: ffff93c1dbed5b30
> [19324.795312] R13: ffff93c1df4a8340 R14: 000000000000000a R15: ffff93c1df2e8340
> [19324.795313] FS:  0000000000000000(0000) GS:ffff93c1df480000(0000) knlGS:0000000000000000
> [19324.795313] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [19324.795314] CR2: 00000000ffffffff CR3: 000000001e40a000 CR4: 00000000000006e0
> [19324.795315] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [19324.795315] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [19324.795316] Call Trace:
> [19324.795316]  ttwu_queue_wakelist+0xa4/0xc0
> [19324.795316]  try_to_wake_up+0x432/0x530
> [19324.795317]  ? trace_raw_output_hrtimer_start+0x70/0x70
> [19324.795317]  call_timer_fn+0x28/0x150
> [19324.795318]  ? trace_raw_output_hrtimer_start+0x70/0x70
> [19324.795318]  run_timer_softirq+0x182/0x250
> [19324.795319]  ? set_next_entity+0x8b/0x1a0
> [19324.795319]  ? _raw_spin_unlock_irq+0xe/0x20
> [19324.795319]  ? finish_task_switch+0x7b/0x230
> [19324.795320]  __do_softirq+0xfc/0x32b
> [19324.795320]  ? smpboot_register_percpu_thread+0xd0/0xd0
> [19324.795321]  run_ksoftirqd+0x21/0x30
> [19324.795321]  smpboot_thread_fn+0x195/0x230
> [19324.795321]  kthread+0x13d/0x160
> [19324.795322]  ? kthread_create_worker_on_cpu+0x60/0x60
> [19324.795322]  ret_from_fork+0x22/0x30
> [19324.795323] ---[ end trace 851fe1f1f7a85d8b ]---
> [19324.828475] ---[ end trace 851fe1f1f7a85d8c ]---
> [19324.829250] RIP: 0010:check_preempt_wakeup+0xad/0x1a0
> [19324.830107] Code: d0 39 d0 7d 2c 83 ea 01 48 8b 9b 48 01 00 00 39 d0 75 f2 48 39 bb 50 01 00 00 74 1e 48 8b ad 48 01 00 00 48 8b 9b 48 01 00 00 <48> 8b bd 50 01 00 00 48 39 bb 50 01 00 00 75 e2 48 85 ff 74 dd e8
> [19324.833208] RSP: 0000:ffffb3cb40320f50 EFLAGS: 00010087
> [19324.834098] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffb400bce0
> [19324.835272] RDX: 0000000000000000 RSI: ffff93c1dbed5b00 RDI: ffff93c1df4a8380
> [19324.836466] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff93c1df2e83b0
> [19324.837669] R10: 0000000000000001 R11: 0000000000000335 R12: 0000000000000001
> [19324.838867] R13: ffff93c1dcf48000 R14: ffff93c1df4a8340 R15: ffff93c1df4a8340
> [19324.840019] FS:  0000000000000000(0000) GS:ffff93c1df480000(0000) knlGS:0000000000000000
> [19324.841316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [19324.842242] CR2: 0000000000000150 CR3: 000000001e40a000 CR4: 00000000000006e0
> [19324.843406] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [19324.844568] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [19324.845710] Kernel panic - not syncing: Fatal exception in interrupt
> [19324.846998] Kernel Offset: 0x32000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [19324.848713] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
