From: Jason Baron <jbaron@redhat.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	tj@kernel.org
Subject: Re: [PATCH][GIT PULL] tracing: Fix compile issue for trace_sched_wakeup.c
Date: Thu, 21 Oct 2010 21:44:41 -0400
Message-ID: <20101022014441.GA1948@redhat.com>
In-Reply-To: <20101021112614.GB26984@elte.hu>

On Thu, Oct 21, 2010 at 01:26:14PM +0200, Ingo Molnar wrote:
> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Wed, 2010-10-20 at 17:40 +0200, Ingo Molnar wrote:
> > > FYI, there's a new mystery hang (sometimes crash) that triggers in -tip - and which 
> > > seems to be tracing related. See the crashlog below - config attached.
> > > 
> > > It's not bisectable - small changes in the kernel make the bug come/go. (might be a 
> > > race of some sorts)
> > > 
> > 
> > 
> > > [   42.324027] Testing all events: 
> > > [  245.668090] INFO: task swapper:1 blocked for more than 120 seconds.
> > > [  245.672051] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > [  245.676026] swapper       D f6420b40  6544     1      0 0x00000000
> > > [  245.684051]  f6437dac 00000046 f694aac0 f6420b40 f6438000 f6437d74 f6438294 f6438290
> > > [  245.692237]  c2192ac0 c204e6c0 c2192ac0 c2192ac0 f6438290 00000000 f6438000 ff2ffa7d
> > > [  245.701068]  00000009 f6420b40 f6437e5c 7fffffff f6438000 f6437dfc f6437e5c 7fffffff
> > > [  245.709071] Call Trace:
> > > [  245.711551]  [<c1a7f561>] schedule_timeout+0x1c/0x1e7
> > > [  245.712036]  [<c1a818b6>] ? _raw_spin_unlock_irq+0x2d/0x43
> > > [  245.716037]  [<c1027f2d>] ? sub_preempt_count+0x4/0x98
> > > [  245.720061]  [<c1a818b6>] ? _raw_spin_unlock_irq+0x2d/0x43
> > > [  245.724036]  [<c1027fb4>] ? sub_preempt_count+0x8b/0x98
> > > [  245.728036]  [<c1a7e76b>] wait_for_common+0xc1/0x11a
> > > [  245.732062]  [<c102de32>] ? default_wake_function+0x0/0x12
> > > [  245.736041]  [<c1a7e863>] wait_for_completion+0x17/0x19
> > > [  245.740069]  [<c10667a2>] __stop_cpus+0xdd/0x103
> > > [  245.744072]  [<c1a7e6db>] ? wait_for_common+0x31/0x11a
> > > [  245.748040]  [<c10665a4>] ? stop_machine_cpu_stop+0x0/0x9a
> > > [  245.752040]  [<c106683d>] stop_cpus+0x2c/0x3f
> > > [  245.756069]  [<c10668af>] __stop_machine+0x5f/0x67
> > > [  245.760186]  [<c1006240>] ? stop_machine_text_poke+0x0/0x43
> > > [  245.764040]  [<c1006240>] ? stop_machine_text_poke+0x0/0x43
> > > [  245.768071]  [<c19f0a73>] ? cfdgml_create+0x2b/0xde
> > > [  245.772040]  [<c10060fd>] text_poke_smp+0x3a/0x42
> > > [  245.776039]  [<c19f0a73>] ? cfdgml_create+0x2b/0xde
> > 
> > 
> > > [  245.780098]  [<c1005b9c>] arch_jump_label_transform+0x53/0x67
> > > [  245.784042]  [<c104ef0d>] jump_label_update+0x49/0x98
> > 
> > Looks like this code had jump labels enabled. Do you have a dump where
> > they are not enabled?
> 
> No. Good find - and the timeline agrees too, these crashes started triggering when i 
> pulled jump labels from you.
> 
> Thanks,
> 
> 	Ingo

Hi,

(adding Tejun to the Cc list)

I finally found that the system actually continues to run after the
apparent 'hang' above. That is, we keep making progress updating the
jump labels. A dump of all the system tasks at the time of the hang
showed the processes in various places other than the stop machine
threads. So I suspected that, for some reason, the stop machine threads
weren't being scheduled.

To test that, I commented out the special scheduling that is set up for
the stop machine threads (the sched_set_stop_task() calls in the patch
below), and that fixed the hang. I haven't yet looked into what might be
going wrong with that scheduling, but maybe somebody else knows.
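
As a rough illustration of the wait pattern visible in the trace above
(wait_for_completion called from __stop_cpus), here is a userspace
pthreads analogy. It is not kernel code, and every name in it (the local
struct completion, complete_one, wait_for_all, run_all_workers,
NR_WORKERS) is made up for the sketch. The caller hands work to a set of
worker threads and blocks until each one has reported completion. If the
workers never get to run, the caller stays blocked, which is what a
hung-task report on the caller would show.

```c
/* Userspace analogy only (pthreads), NOT the kernel implementation.
 * All identifiers below are hypothetical and exist only in this sketch. */
#include <assert.h>
#include <pthread.h>

#define NR_WORKERS 4

struct completion {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	int             pending;	/* workers that have not finished yet */
};

static struct completion done = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};

/* Called by each worker once its share of the work is finished. */
static void complete_one(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	if (--c->pending == 0)
		pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Blocks until every worker has called complete_one(). If a worker is
 * never scheduled, this never returns. */
static void wait_for_all(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->pending > 0)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

static void *worker(void *arg)
{
	int *ran = arg;

	*ran = 1;		/* stand-in for the real per-CPU work */
	complete_one(&done);
	return NULL;
}

/* Starts NR_WORKERS workers, waits for all of them, and returns how
 * many actually ran. */
int run_all_workers(void)
{
	pthread_t tid[NR_WORKERS];
	int ran[NR_WORKERS] = { 0 };
	int i, rc, count = 0;

	done.pending = NR_WORKERS;
	for (i = 0; i < NR_WORKERS; i++) {
		rc = pthread_create(&tid[i], NULL, worker, &ran[i]);
		assert(rc == 0);
		(void)rc;
	}

	wait_for_all(&done);

	for (i = 0; i < NR_WORKERS; i++) {
		pthread_join(tid[i], NULL);
		count += ran[i];
	}
	return count;
}
```

Calling run_all_workers() returns once all workers have run. The sketch
has no timeout, so a worker that is starved of CPU time would leave the
caller waiting indefinitely, loosely mirroring the blocked swapper task
in the report.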

thanks,

-Jason 


diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 090c288..3013b85 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -307,7 +307,7 @@ static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
 			return NOTIFY_BAD;
 		get_task_struct(p);
 		kthread_bind(p, cpu);
-		sched_set_stop_task(cpu, p);
+		//sched_set_stop_task(cpu, p);
 		stopper->thread = p;
 		break;
 
@@ -326,7 +326,7 @@ static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb,
 	{
 		struct cpu_stop_work *work;
 
-		sched_set_stop_task(cpu, NULL);
+		//sched_set_stop_task(cpu, NULL);
 		/* kill the stopper */
 		kthread_stop(stopper->thread);
 		/* drain remaining works */
