From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule
Date: Fri, 25 Mar 2016 17:24:33 +0100
Message-ID: <1458923073.3849.26.camel@gmail.com>
In-Reply-To: <1458897196.3870.8.camel@gmail.com>

On Fri, 2016-03-25 at 10:13 +0100, Mike Galbraith wrote:
> On Fri, 2016-03-25 at 09:52 +0100, Thomas Gleixner wrote:
> > On Fri, 25 Mar 2016, Mike Galbraith wrote:
> > > On Thu, 2016-03-24 at 12:06 +0100, Mike Galbraith wrote:
> > > > On Thu, 2016-03-24 at 11:44 +0100, Thomas Gleixner wrote:
> > > > >  
> > > > > > On the bright side, with the busted migrate enable business reverted,
> > > > > > plus one dinky change from me [1], master-rt.today has completed 100
> > > > > > iterations of Steven's hotplug stress script along side endless
> > > > > > futexstress, and is happily doing another 900 as I write this, so the
> > > > > > next -rt should finally be hotplug deadlock free.
> > > > > > 
> > > > > > Thomas's state machinery seems to work wonders.  'course this being
> > > > > > hotplug, the other shoe will likely apply itself to my backside soon.
> > > > > 
> > > > > That's a given :)
> > > > 
> > > > blk-mq applied it shortly after I was satisfied enough to poke xmit.
> > > 
> > > The other shoe is that notifiers can depend upon RCU grace periods, so
> > > when pin_current_cpu() snags rcu_sched, the hotplug game is over.
> > > 
> > > blk_mq_queue_reinit_notify:
> > >         /*
> > >          * We need to freeze and reinit all existing queues.  Freezing
> > >          * involves synchronous wait for an RCU grace period and doing it
> > >          * one by one may take a long time.  Start freezing all queues in
> > >          * one swoop and then wait for the completions so that freezing can
> > >          * take place in parallel.
> > >          */
> > >         list_for_each_entry(q, &all_q_list, all_q_node)
> > >                 blk_mq_freeze_queue_start(q);
> > >         list_for_each_entry(q, &all_q_list, all_q_node) {
> > >                 blk_mq_freeze_queue_wait(q);
> > 
> > Yeah, I stumbled over that already when analysing all the hotplug notifier
> > sites. That's definitely a horrible one.
> >  
> > > Hohum (sharpens rock), next.
> > 
> > /me recommends frozen sharks
> 
> With the sharp rock below and the one I'll follow up with, master-rt on
> my DL980 just passed 3 hours of endless hotplug stress concurrent with
> endless tbench 8, stockfish and futextest.  It has never survived this
> long with this load by a long shot.

I knew it was unlikely to surrender that quickly.  Oh well, on the
bright side it seems to be running low on deadlocks.

	Happy Easter,

	-Mike

(bite me beast, 666 indeed)

[26666.886077] ------------[ cut here ]------------
[26666.886078] kernel BUG at kernel/sched/core.c:1717!
[26666.886081] invalid opcode: 0000 [#1] PREEMPT SMP
[26666.886094] Dumping ftrace buffer:
[26666.886112]    (ftrace buffer empty)
[26666.886137] Modules linked in: autofs4 edd af_packet cpufreq_conservative cpufreq_ondemand cpufreq_userspace cpufreq_powersave fuse loop md_mod dm_mod vhost_net macvtap macvlan vhost tun ipmi_ssif kvm_intel kvm joydev hid_generic sr_mod cdrom sg shpchp netxen_nic hpwdt hpilo ipmi_si ipmi_msghandler irqbypass bnx2 iTCO_wdt iTCO_vendor_support gpio_ich pcc_cpufreq fjes i7core_edac edac_core lpc_ich pcspkr 8250_fintek ehci_pci acpi_cpufreq acpi_power_meter button ext4 mbcache jbd2 crc16 usbhid uhci_hcd ehci_hcd sd_mod usbcore usb_common thermal processor scsi_dh_hp_sw scsi_dh_emc scsi_dh_rdac scsi_dh_alua ata_generic ata_piix libata hpsa scsi_transport_sas cciss scsi_mod
[26666.886140] CPU: 2 PID: 41 Comm: migration/2 Not tainted 4.6.0-rt11 #69
[26666.886140] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 07/07/2010
[26666.886142] task: ffff88017e34e580 ti: ffff88017e394000 task.ti: ffff88017e394000
[26666.886149] RIP: 0010:[<ffffffff810a6f5c>]  [<ffffffff810a6f5c>] select_fallback_rq+0x19c/0x1d0
[26666.886149] RSP: 0018:ffff88017e397d28  EFLAGS: 00010046
[26666.886150] RAX: 0000000000000100 RBX: ffff88017e668348 RCX: 0000000000000003
[26666.886151] RDX: 0000000000000100 RSI: 0000000000000100 RDI: ffffffff81811420
[26666.886152] RBP: ffff88017e668000 R08: 0000000000000003 R09: 0000000000000000
[26666.886153] R10: ffff8802772b3ec0 R11: 0000000000000001 R12: 0000000000000002
[26666.886153] R13: 0000000000000002 R14: ffff88017e398000 R15: ffff88017e668000
[26666.886155] FS:  0000000000000000(0000) GS:ffff880276680000(0000) knlGS:0000000000000000
[26666.886156] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[26666.886156] CR2: 0000000000695c5c CR3: 0000000271419000 CR4: 00000000000006e0
[26666.886157] Stack:
[26666.886159]  ffff880276696900 ffff880276696900 ffff88017e668808 0000000000016900
[26666.886160]  ffffffff810a88f9 ffff88017e398000 ffff88017e398000 0000000000000046
[26666.886161]  ffff88017e34e580 00000000fffffff7 ffffffff81c5be90 0000000000000000
[26666.886162] Call Trace:
[26666.886166]  [<ffffffff810a88f9>] ? migration_call+0x1b9/0x3b0
[26666.886168]  [<ffffffff8109d724>] ? notifier_call_chain+0x44/0x70
[26666.886171]  [<ffffffff8107d430>] ? notify_online+0x20/0x20
[26666.886172]  [<ffffffff8107d381>] ? __cpu_notify+0x31/0x50
[26666.886173]  [<ffffffff8107d448>] ? notify_dying+0x18/0x20
[26666.886175]  [<ffffffff8107dfbf>] ? cpuhp_invoke_callback+0x3f/0x150
[26666.886178]  [<ffffffff8111ed01>] ? cpu_stop_should_run+0x11/0x50
[26666.886180]  [<ffffffff8107e6a2>] ? take_cpu_down+0x52/0x80
[26666.886181]  [<ffffffff8111ee7a>] ? multi_cpu_stop+0x9a/0xc0
[26666.886182]  [<ffffffff8111ede0>] ? cpu_stop_queue_work+0x80/0x80
[26666.886184]  [<ffffffff8111f078>] ? cpu_stopper_thread+0x88/0x120
[26666.886186]  [<ffffffff8109fede>] ? smpboot_thread_fn+0x14e/0x270
[26666.886188]  [<ffffffff8109fd90>] ? smpboot_update_cpumask_percpu_thread+0x130/0x130
[26666.886192]  [<ffffffff8109c68d>] ? kthread+0xbd/0xe0
[26666.886196]  [<ffffffff816097c2>] ? ret_from_fork+0x22/0x40
[26666.886198]  [<ffffffff8109c5d0>] ? kthread_worker_fn+0x160/0x160
[26666.886211] Code: 06 00 00 44 89 e9 48 c7 c7 f8 72 9f 81 31 c0 e8 fd 0a 0e 00 e9 32 ff ff ff 41 83 fc 01 74 21 72 0c 41 83 fc 02 0f 85 33 ff ff ff <0f> 0b 48 89 ef 41 bc 01 00 00 00 e8 54 5c 07 00 e9 1e ff ff ff
[26666.886212] RIP  [<ffffffff810a6f5c>] select_fallback_rq+0x19c/0x1d0
[26666.886213]  RSP <ffff88017e397d28>
