From: Fengguang Wu <fengguang.wu@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Michael wang <wangyun@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [sched/get_online_cpus] INFO: task swapper/0:1 blocked for more than 120 seconds.
Date: Tue, 12 Nov 2013 17:55:16 +0800
Message-ID: <20131112095516.GB32441@localhost>
In-Reply-To: <20131111162022.GB21461@twins.programming.kicks-ass.net>

On Mon, Nov 11, 2013 at 05:20:22PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 11, 2013 at 03:47:11PM +0800, Michael wang wrote:
> > Hi, Fengguang
> > 
> > On 11/10/2013 06:16 PM, Fengguang Wu wrote:
> > > Greetings,
> > > 
> > > I got the below dmesg and the first bad commit is
> > 
> > I guess this will disappear when '!CONFIG_RCU_BOOST'...
> > 
> > AFAIK, if the rsp is in boost mode, we count on the smpboot thread
> > 'rcu_cpu_thread_spec' to finish the callbacks, but that thread is
> > parked before the synchronize_rcu() inside _cpu_down(). If that is
> > the case, the sync will never finish...
> > 
> > Maybe some brainless fix like this?
> > 
> > 
> > 
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 63aa50d..aa24338 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -306,7 +306,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
> >                                 __func__, cpu);
> >                 goto out_release;
> >         }
> > -       smpboot_park_threads(cpu);
> >  
> >         /*
> >          * By now we've cleared cpu_active_mask, wait for all preempt-disabled
> > @@ -321,6 +320,8 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
> >  #endif
> >         synchronize_rcu();
> >  
> > +       smpboot_park_threads(cpu);
> > +
> >         /*
> >          * So now all preempt/rcu users must observe !cpu_active().
> >          */
> 
> Good thinking.. Wu, did this cure stuff?

Yes, it fixed the problem.

Tested-by: Fengguang Wu <fengguang.wu@intel.com>


/kernel/i386-randconfig-j3-11101308/484f4e66a6a1102edf02407479f6f7632aade0f3

+--------------------------------------------------+--------------+--------------+
|                                                  | e5137b50a064 | 484f4e66a6a1 |
+--------------------------------------------------+--------------+--------------+
| boot_successes                                   | 42           | 100          |
| boot_failures                                    | 58           |              |
| INFO:task_blocked_for_more_than_seconds          | 58           |              |
| Kernel_panic-not_syncing:hung_task:blocked_tasks | 58           |              |
+--------------------------------------------------+--------------+--------------+

/kernel/x86_64-randconfig-x4-1108/484f4e66a6a1102edf02407479f6f7632aade0f3

+------------------------------------------------------------------------------------+-----------+--------------+--------------+
|                                                                                    | v3.12-rc7 | e5137b50a064 | 484f4e66a6a1 |
+------------------------------------------------------------------------------------+-----------+--------------+--------------+
| boot_successes                                                                     | 59        | 34           | 100          |
| has_kernel_error_warning                                                           | 4         |              |              |
| BUG:kernel_early_hang_without_any_printk_output                                    | 4         |              |              |
| boot_failures                                                                      | 0         | 66           |              |
| INFO:task_blocked_for_more_than_seconds                                            | 0         | 66           |              |
| INFO:NMI_handler(arch_trigger_all_cpu_backtrace_handler)took_too_long_to_run:msecs | 0         | 55           |              |
| Kernel_panic-not_syncing:hung_task:blocked_tasks                                   | 0         | 66           |              |
+------------------------------------------------------------------------------------+-----------+--------------+--------------+

