From: Mike Galbraith <umgwanakikbuti@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH RT 4/6] rt/locking: Reenable migration accross schedule
Date: Fri, 25 Mar 2016 10:13:16 +0100
Message-ID: <1458897196.3870.8.camel@gmail.com>
In-Reply-To: <alpine.DEB.2.11.1603250951100.3978@nanos>

On Fri, 2016-03-25 at 09:52 +0100, Thomas Gleixner wrote:
> On Fri, 25 Mar 2016, Mike Galbraith wrote:
> > On Thu, 2016-03-24 at 12:06 +0100, Mike Galbraith wrote:
> > > On Thu, 2016-03-24 at 11:44 +0100, Thomas Gleixner wrote:
> > > >  
> > > > > On the bright side, with the busted migrate enable business reverted,
> > > > > plus one dinky change from me [1], master-rt.today has completed 100
> > > > > iterations of Steven's hotplug stress script along side endless
> > > > > futexstress, and is happily doing another 900 as I write this, so the
> > > > > next -rt should finally be hotplug deadlock free.
> > > > > 
> > > > > Thomas's state machinery seems to work wonders.  'course this being
> > > > > hotplug, the other shoe will likely apply itself to my backside soon.
> > > > 
> > > > That's a given :)
> > > 
> > > blk-mq applied it shortly after I was satisfied enough to poke xmit.
> > 
> > The other shoe is that notifiers can depend upon RCU grace periods, so
> > when pin_current_cpu() snags rcu_sched, the hotplug game is over.
> > 
> > blk_mq_queue_reinit_notify:
> >         /*
> >          * We need to freeze and reinit all existing queues.  Freezing
> >          * involves synchronous wait for an RCU grace period and doing it
> >          * one by one may take a long time.  Start freezing all queues in
> >          * one swoop and then wait for the completions so that freezing can
> >          * take place in parallel.
> >          */
> >         list_for_each_entry(q, &all_q_list, all_q_node)
> >                 blk_mq_freeze_queue_start(q);
> >         list_for_each_entry(q, &all_q_list, all_q_node) {
> >                 blk_mq_freeze_queue_wait(q);
> 
> Yeah, I stumbled over that already when analysing all the hotplug notifier
> sites. That's definitely a horrible one.
>  
> > Hohum (sharpens rock), next.
> 
> /me recommends frozen sharks

With the sharp rock below and the one I'll follow up with, master-rt on
my DL980 just passed 3 hours of endless hotplug stress concurrent with
endless tbench 8, stockfish and futextest.  It has never survived this
long with this load by a long shot.

hotplug/rt: Do not let pin_current_cpu() block RCU grace periods

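Notifiers may depend upon grace periods continuing to advance, as
blk_mq_queue_reinit_notify() below does.  The hotplug thread waits in
blk_mq_freeze_queue_wait() for a grace period, while rcu_sched is
blocked in pin_current_cpu() waiting for the unplug to finish.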

crash> bt ffff8803aee76400
PID: 1113   TASK: ffff8803aee76400  CPU: 0   COMMAND: "stress-cpu-hotp"
 #0 [ffff880396fe7ad8] __schedule at ffffffff816b7142
 #1 [ffff880396fe7b28] schedule at ffffffff816b797b
 #2 [ffff880396fe7b48] blk_mq_freeze_queue_wait at ffffffff8135c5ac
 #3 [ffff880396fe7b80] blk_mq_queue_reinit_notify at ffffffff8135f819
 #4 [ffff880396fe7b98] notifier_call_chain at ffffffff8109b8ed
 #5 [ffff880396fe7bd8] __raw_notifier_call_chain at ffffffff8109b91e
 #6 [ffff880396fe7be8] __cpu_notify at ffffffff81072825
 #7 [ffff880396fe7bf8] cpu_notify_nofail at ffffffff81072b15
 #8 [ffff880396fe7c08] notify_dead at ffffffff81072d06
 #9 [ffff880396fe7c38] cpuhp_invoke_callback at ffffffff81073718
#10 [ffff880396fe7c78] cpuhp_down_callbacks at ffffffff81073a70
#11 [ffff880396fe7cb8] _cpu_down at ffffffff816afc71
#12 [ffff880396fe7d38] do_cpu_down at ffffffff8107435c
#13 [ffff880396fe7d60] cpu_down at ffffffff81074390
#14 [ffff880396fe7d70] cpu_subsys_offline at ffffffff814cd854
#15 [ffff880396fe7d80] device_offline at ffffffff814c7cda
#16 [ffff880396fe7da8] online_store at ffffffff814c7dd0
#17 [ffff880396fe7dd0] dev_attr_store at ffffffff814c4fc8
#18 [ffff880396fe7de0] sysfs_kf_write at ffffffff812cfbe4
#19 [ffff880396fe7e08] kernfs_fop_write at ffffffff812cf172
#20 [ffff880396fe7e50] __vfs_write at ffffffff81241428
#21 [ffff880396fe7ed0] vfs_write at ffffffff81242535
#22 [ffff880396fe7f10] sys_write at ffffffff812438f9
#23 [ffff880396fe7f50] entry_SYSCALL_64_fastpath at ffffffff816bb4bc
    RIP: 00007fafd918acd0  RSP: 00007ffd2ca956e8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000000000226a770  RCX: 00007fafd918acd0
    RDX: 0000000000000002  RSI: 00007fafd9cb9000  RDI: 0000000000000001
    RBP: 00007ffd2ca95700   R8: 000000000000000a   R9: 00007fafd9cb3700
    R10: 00000000ffffffff  R11: 0000000000000246  R12: 0000000000000007
    R13: 0000000000000001  R14: 0000000000000009  R15: 000000000000000a
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

blk_mq_queue_reinit_notify:
        /*
         * We need to freeze and reinit all existing queues.  Freezing
         * involves synchronous wait for an RCU grace period and doing it
         * one by one may take a long time.  Start freezing all queues in
         * one swoop and then wait for the completions so that freezing can
         * take place in parallel.
         */
        list_for_each_entry(q, &all_q_list, all_q_node)
                blk_mq_freeze_queue_start(q);
        list_for_each_entry(q, &all_q_list, all_q_node) {
                blk_mq_freeze_queue_wait(q);

crash> bt ffff880176cc9900
PID: 17     TASK: ffff880176cc9900  CPU: 0   COMMAND: "rcu_sched"
 #0 [ffff880176cd7ab8] __schedule at ffffffff816b7142
 #1 [ffff880176cd7b08] schedule at ffffffff816b797b
 #2 [ffff880176cd7b28] rt_spin_lock_slowlock at ffffffff816b974d
 #3 [ffff880176cd7bc8] rt_spin_lock_fastlock at ffffffff811b0f3c
 #4 [ffff880176cd7be8] rt_spin_lock__no_mg at ffffffff816bac1b
 #5 [ffff880176cd7c08] pin_current_cpu at ffffffff8107406a
 #6 [ffff880176cd7c50] migrate_disable at ffffffff810a0e9e
 #7 [ffff880176cd7c70] rt_spin_lock at ffffffff816bad69
 #8 [ffff880176cd7c90] lock_timer_base at ffffffff810fc5e8
 #9 [ffff880176cd7cc8] try_to_del_timer_sync at ffffffff810fe290
#10 [ffff880176cd7cf0] del_timer_sync at ffffffff810fe381
#11 [ffff880176cd7d58] schedule_timeout at ffffffff816b9e4b
#12 [ffff880176cd7df0] rcu_gp_kthread at ffffffff810f52b4
#13 [ffff880176cd7e70] kthread at ffffffff8109a02f
#14 [ffff880176cd7f50] ret_from_fork at ffffffff816bb6f2

Game Over.

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
---
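For readers following the diff below, the bypass condition in
pin_current_cpu() can be modeled in user space.  This is only a sketch:
struct task_model, struct hotplug_pcp_model and pin_fast_path() are
hypothetical stand-ins, not kernel code, and force and preempt_count are
passed in as plain parameters rather than read from kernel state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: only what the check reads. */
struct task_model {
	const char *comm;
	unsigned sched_is_rcu:1;	/* models the flag added by the patch */
};

/* Hypothetical stand-in for the per-CPU hotplug_pcp state. */
struct hotplug_pcp_model {
	struct task_model *unplug;	/* task taking the CPU down, or NULL */
	int refcount;			/* tasks currently pinned to the CPU */
};

/*
 * Models the condition in pin_current_cpu(): returns true when the
 * caller takes the fast path (refcount bumped, no blocking), false
 * when it would go on to block until the unplug completes.
 */
static bool pin_fast_path(struct hotplug_pcp_model *hp,
			  struct task_model *cur,
			  bool force, int preempt_count)
{
	assert(hp != NULL && cur != NULL);

	if (!hp->unplug || hp->refcount || force || preempt_count > 1 ||
	    hp->unplug == cur || cur->sched_is_rcu) {
		hp->refcount++;
		return true;
	}
	return false;
}
```

In this model, with hp.unplug pointing at some other task and refcount
at 0, an unflagged caller gets false (it would block); once its
sched_is_rcu bit is set, the same call returns true and refcount
becomes 1.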
 include/linux/sched.h |    1 +
 kernel/cpu.c          |    2 +-
 kernel/rcu/tree.c     |    3 +++
 3 files changed, 5 insertions(+), 1 deletion(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1492,6 +1492,7 @@ struct task_struct {
 #ifdef CONFIG_COMPAT_BRK
 	unsigned brk_randomized:1;
 #endif
+	unsigned sched_is_rcu:1; /* RT: is a critical RCU thread */
 
 	unsigned long atomic_flags; /* Flags needing atomic access. */
 
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -156,7 +156,7 @@ void pin_current_cpu(void)
 	hp = this_cpu_ptr(&hotplug_pcp);
 
 	if (!hp->unplug || hp->refcount || force || preempt_count() > 1 ||
-	    hp->unplug == current) {
+	    hp->unplug == current || current->sched_is_rcu) {
 		hp->refcount++;
 		return;
 	}
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2100,6 +2100,9 @@ static int __noreturn rcu_gp_kthread(voi
 	struct rcu_state *rsp = arg;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	/* RT: pin_current_cpu() MUST NOT block RCU grace periods. */
+	current->sched_is_rcu = 1;
+
 	rcu_bind_gp_kthread();
 	for (;;) {
 
