public inbox for linux-kernel@vger.kernel.org
From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>, LKML <linux-kernel@vger.kernel.org>,
	pm list <linux-pm@lists.linux-foundation.org>,
	Greg KH <gregkh@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jesse Barnes <jbarnes@virtuousgeek.org>
Subject: Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd
Date: Mon, 9 Nov 2009 15:26:02 +0100
Message-ID: <200911091526.02147.rjw@sisk.pl>
In-Reply-To: <alpine.LFD.2.00.0911091452300.2725@localhost.localdomain>

On Monday 09 November 2009, Thomas Gleixner wrote:
> On Mon, 9 Nov 2009, Ingo Molnar wrote:
> > 
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > 
> > > On Monday 09 November 2009, Ingo Molnar wrote:
> > > > 
> > > > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > > 
> > > > > [ 2016.865041] BUG: using smp_processor_id() in preemptible [00000000] code: events/1/29920
> > > > > [ 2016.865344] caller is vmstat_update+0x13/0x48
> > > > > [ 2016.865522] Pid: 29920, comm: events/1 Not tainted 2.6.31-tst #158
> > > > > [ 2016.865700] Call Trace:
> > > > > [ 2016.865877]  [<ffffffff811608e8>] debug_smp_processor_id+0xc4/0xd4
> > > > > [ 2016.866052]  [<ffffffff810a9ae1>] vmstat_update+0x13/0x48
> > > > > [ 2016.866232]  [<ffffffff81051ee6>] worker_thread+0x18b/0x22a
> > > > > [ 2016.866409]  [<ffffffff810a9ace>] ? vmstat_update+0x0/0x48
> > > > > [ 2016.866578]  [<ffffffff810556a5>] ? autoremove_wake_function+0x0/0x38
> > > > > [ 2016.866749]  [<ffffffff81288803>] ? _spin_unlock_irqrestore+0x35/0x37
> > > > > [ 2016.866935]  [<ffffffff81051d5b>] ? worker_thread+0x0/0x22a
> > > > > [ 2016.867113]  [<ffffffff8105547d>] kthread+0x69/0x71
> > > > > [ 2016.867278]  [<ffffffff8100c1aa>] child_rip+0xa/0x20
> > > > > [ 2016.867450]  [<ffffffff81055414>] ? kthread+0x0/0x71
> > > > > [ 2016.867618]  [<ffffffff8100c1a0>] ? child_rip+0x0/0x20
> > > > 
> > > > a bug producing similar looking messages was fixed by:
> > > > 
> > > >   fd21073: sched: Fix affinity logic in select_task_rq_fair()
> > > > 
> > > > but that bug was introduced by:
> > > > 
> > > >   a1f84a3: sched: Check for an idle shared cache in select_task_rq_fair()
> > > 
> > > I guess these are tip commits?
> > 
> > yep, tip:sched/core ones.
> > 
> > > > Which is for v2.6.33, not v2.6.32.
> > > 
> > > The one I saw was in Linus' tree, quite obviously.
> > 
> > ok, then my observation should not apply.
> 
> I think it _IS_ related because the worker_thread is CPU affine and
> the debug_smp_processor_id() check does:
> 
>     if (cpumask_equal(&current->cpus_allowed, cpumask_of(this_cpu)))
> 
> which suppresses the warning when smp_processor_id() is used in
> preempt-enabled regions of ksoftirqd and keventd.
> 
> We saw exactly the same backtrace with fd21073 (sched: Fix affinity
> logic in select_task_rq_fair()).
> 
> Rafael, can you please add a printk to debug_smp_processor_id() so we
> can see which CPU we are running on? I suspect we are on the wrong
> one.

Well, I can add the printk(), but I can't guarantee that I will get the call
trace once again.  So far I've seen it only once after 20-25 consecutive
suspend-resume cycles, so ...  you get the idea.

However, running on the wrong CPU would very nicely explain all of the observed
symptoms, so I guess we can try a House M.D.-style approach and assume that the
answer is "yes, we're running on the wrong CPU".  What would we do next if that
were the case?

Rafael
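
For reference, the affinity argument quoted above can be sketched as a small
userspace model.  This is only an illustration, not the kernel's code:
struct task_model and would_warn() are hypothetical names, the CPU mask is
reduced to a single unsigned long, and only the three conditions discussed
in this thread (preempt count, disabled interrupts, single-CPU affinity)
are modelled.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the task state the check consults. */
struct task_model {
	const char *comm;		/* thread name, e.g. "events/1" */
	unsigned long cpus_allowed;	/* bit n set => may run on CPU n */
	unsigned int preempt_count;	/* nonzero => preemption disabled */
	bool irqs_disabled;		/* true => local interrupts are off */
};

/*
 * Returns true if the modelled check would complain about
 * smp_processor_id() being used by task t while it runs on this_cpu.
 */
static bool would_warn(const struct task_model *t, int this_cpu)
{
	/* Not preemptible, so the CPU number cannot change under us. */
	if (t->preempt_count)
		return false;
	if (t->irqs_disabled)
		return false;

	/* Bound to exactly the CPU it is running on: treated as safe. */
	if (t->cpus_allowed == (1UL << this_cpu))
		return false;

	return true;
}
```

In this model, a thread bound only to CPU 1 (cpus_allowed == 1UL << 1) with
preemption enabled never triggers the warning while it runs on CPU 1, but
does trigger it on CPU 0.  Under these assumptions, that is why the trace
from events/1 would point at the thread running on a CPU other than the one
it is bound to.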
