[Linux 2.6.29-rc2] BUG: using smp_processor

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
@ 2009-01-17  8:58 Maciej Rutecki
  2009-01-25  2:37 ` Andrew Morton
  2009-01-26 16:07 ` Linus Torvalds
  0 siblings, 2 replies; 19+ messages in thread
From: Maciej Rutecki @ 2009-01-17  8:58 UTC (permalink / raw)
  To: Linus Torvalds, Rafael J. Wysocki; +Cc: Linux Kernel Mailing List

During suspend to ram:
[  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
code: suspend_to_ram./2958
[  131.287012] caller is retrigger_next_event+0x13/0xb0
[  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
[  131.287012] Call Trace:
[  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
[  131.287012]  [<c01473b3>] retrigger_next_event+0x13/0xb0
[  131.287012]  [<c01489b7>] raw_notifier_call_chain+0x17/0x20
[  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
[  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
[  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
[  131.287012]  [<c02d2478>] device_power_up+0x8/0x10
[  131.287012]  [<c0159c7b>] suspend_devices_and_enter+0xeb/0x160
[  131.287012]  [<c015a47f>] freeze_processes+0x4f/0xa0
[  131.287012]  [<c0159e3d>] enter_state+0xfd/0x150
[  131.287012]  [<c0159f1c>] state_store+0x8c/0xd0
[  131.287012]  [<c0159e90>] state_store+0x0/0xd0
[  131.287012]  [<c0250d54>] kobj_attr_store+0x24/0x30
[  131.287012]  [<c01e25d1>] sysfs_write_file+0xa1/0x110
[  131.287012]  [<c01e2530>] sysfs_write_file+0x0/0x110
[  131.287012]  [<c019bf48>] vfs_write+0xa8/0x140
[  131.287012]  [<c019c0b1>] sys_write+0x41/0x80
[  131.287012]  [<c010348d>] sysenter_do_call+0x12/0x21
[  131.287012]  [<c03b0000>] init_cyrix+0x72/0x444
[  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
code: suspend_to_ram./2958
[  131.287012] caller is retrigger_next_event+0x65/0xb0
[  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
[  131.287012] Call Trace:
[  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
[  131.287012]  [<c0147405>] retrigger_next_event+0x65/0xb0
[  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
[  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
[  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
[  131.287012]  [<c02d2478>] device_power_up+0x8/0x10
[  131.287012]  [<c0159c7b>] suspend_devices_and_enter+0xeb/0x160
[  131.287012]  [<c015a47f>] freeze_processes+0x4f/0xa0
[  131.287012]  [<c0159e3d>] enter_state+0xfd/0x150
[  131.287012]  [<c0159f1c>] state_store+0x8c/0xd0
[  131.287012]  [<c0159e90>] state_store+0x0/0xd0
[  131.287012]  [<c0250d54>] kobj_attr_store+0x24/0x30
[  131.287012]  [<c01e25d1>] sysfs_write_file+0xa1/0x110
[  131.287012]  [<c01e2530>] sysfs_write_file+0x0/0x110
[  131.287012]  [<c019bf48>] vfs_write+0xa8/0x140
[  131.287012]  [<c019c0b1>] sys_write+0x41/0x80
[  131.287012]  [<c010348d>] sysenter_do_call+0x12/0x21
[  131.287012]  [<c03b0000>] init_cyrix+0x72/0x444

After this system works OK.
config, dmesg:
http://www.unixy.pl/maciek/download/kernel/2.6.29-rc2/pc/
-- 
Maciej Rutecki
http://www.maciek.unixy.pl

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-17  8:58 [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible Maciej Rutecki
@ 2009-01-25  2:37 ` Andrew Morton
  2009-01-25 11:57   ` Thomas Gleixner
                     ` (2 more replies)
  2009-01-26 16:07 ` Linus Torvalds
  1 sibling, 3 replies; 19+ messages in thread
From: Andrew Morton @ 2009-01-25  2:37 UTC (permalink / raw)
  To: Maciej Rutecki
  Cc: Linus Torvalds, Rafael J. Wysocki, Linux Kernel Mailing List,
	Ingo Molnar, Thomas Gleixner, Rusty Russell

On Sat, 17 Jan 2009 09:58:47 +0100 "Maciej Rutecki" <maciej.rutecki@gmail.com> wrote:

> During suspend to ram:
> [  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
> code: suspend_to_ram./2958
> [  131.287012] caller is retrigger_next_event+0x13/0xb0
> [  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
> [  131.287012] Call Trace:
> [  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
> [  131.287012]  [<c01473b3>] retrigger_next_event+0x13/0xb0
> [  131.287012]  [<c01489b7>] raw_notifier_call_chain+0x17/0x20
> [  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
> [  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
> [  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
> [  131.287012]  [<c02d2478>] device_power_up+0x8/0x10
> [  131.287012]  [<c0159c7b>] suspend_devices_and_enter+0xeb/0x160
> [  131.287012]  [<c015a47f>] freeze_processes+0x4f/0xa0
> [  131.287012]  [<c0159e3d>] enter_state+0xfd/0x150
> [  131.287012]  [<c0159f1c>] state_store+0x8c/0xd0
> [  131.287012]  [<c0159e90>] state_store+0x0/0xd0
> [  131.287012]  [<c0250d54>] kobj_attr_store+0x24/0x30
> [  131.287012]  [<c01e25d1>] sysfs_write_file+0xa1/0x110
> [  131.287012]  [<c01e2530>] sysfs_write_file+0x0/0x110
> [  131.287012]  [<c019bf48>] vfs_write+0xa8/0x140
> [  131.287012]  [<c019c0b1>] sys_write+0x41/0x80
> [  131.287012]  [<c010348d>] sysenter_do_call+0x12/0x21
> [  131.287012]  [<c03b0000>] init_cyrix+0x72/0x444
> [  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
> code: suspend_to_ram./2958

Coming out of the hrtimer_hres_active() call, I assume.

Is this new behaviour, or were 2.6.28 and earlier doing it?

If it's new behaviour then I'm not sure why, but it triggers my Rusty
nerve.

afaict we should

a) remove the wrong "Called with interrupts disabled via
   on_each_cpu()" comment from retrigger_next_event(), and replace it
   with something useful and truthful.

b) change hres_timers_resume() to do local_irq_save/restore


Does this fix it?

--- a/kernel/hrtimer.c~a
+++ a/kernel/hrtimer.c
@@ -562,7 +562,7 @@ static int hrtimer_reprogram(struct hrti
 /*
  * Retrigger next event is called after clock was set
  *
- * Called with interrupts disabled via on_each_cpu()
+ * Called with interrupts disabled.
  */
 static void retrigger_next_event(void *arg)
 {
@@ -614,8 +614,12 @@ void clock_was_set(void)
  */
 void hres_timers_resume(void)
 {
+	unsigned long flags;
+
 	/* Retrigger the CPU local events: */
+	local_irq_save(flags);
 	retrigger_next_event(NULL);
+	local_irq_restore(flags);
 }
 
 /*
_


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-25  2:37 ` Andrew Morton
@ 2009-01-25 11:57   ` Thomas Gleixner
  2009-01-25 14:19   ` Maciej Rutecki
  2009-01-25 16:30   ` Maciej Rutecki
  2 siblings, 0 replies; 19+ messages in thread
From: Thomas Gleixner @ 2009-01-25 11:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Maciej Rutecki, Linus Torvalds, Rafael J. Wysocki,
	Linux Kernel Mailing List, Ingo Molnar, Rusty Russell

On Sat, 24 Jan 2009, Andrew Morton wrote:
> 
> Coming out of the hrtimer_hres_active() call, I assume.

Yes.
 
> Is this new behaviour, or were 2.6.28 and earlier doing it?

This code is there since highres was merged, but something seems to
have changed in the suspend/resume code as I'm pretty sure, that this
code was always called with interrupts disabled.
 
> a) remove the wrong "Called with interrupts disabled via
>    on_each_cpu()" comment from retrigger_next_event(), and replace it
>    with something useful and truthful.

Good point.
 
> b) change hres_timers_resume() to do local_irq_save/restore

Right, but I'd like to know what changed below.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-25  2:37 ` Andrew Morton
  2009-01-25 11:57   ` Thomas Gleixner
@ 2009-01-25 14:19   ` Maciej Rutecki
  2009-01-25 16:30   ` Maciej Rutecki
  2 siblings, 0 replies; 19+ messages in thread
From: Maciej Rutecki @ 2009-01-25 14:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Rafael J. Wysocki, Linux Kernel Mailing List,
	Ingo Molnar, Thomas Gleixner, Rusty Russell

2009/1/25 Andrew Morton <akpm@linux-foundation.org>:

>
> Coming out of the hrtimer_hres_active() call, I assume.
>
> Is this new behaviour, or were 2.6.28 and earlier doing it?

2.6.28 and 2.6.28-rcx works OK. 2.6.29-rc1 hangs on resume, so I
cannot check it.



>
> Does this fix it?

I will check it soon.


-- 
Maciej Rutecki
http://www.maciek.unixy.pl

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-25  2:37 ` Andrew Morton
  2009-01-25 11:57   ` Thomas Gleixner
  2009-01-25 14:19   ` Maciej Rutecki
@ 2009-01-25 16:30   ` Maciej Rutecki
  2 siblings, 0 replies; 19+ messages in thread
From: Maciej Rutecki @ 2009-01-25 16:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Rafael J. Wysocki, Linux Kernel Mailing List,
	Ingo Molnar, Thomas Gleixner, Rusty Russell

2009/1/25 Andrew Morton <akpm@linux-foundation.org>:
[...]
>
>
> Does this fix it?

Yes, message has gone. Thanks for help.

Tested-by Maciej Rutecki <maciej.rutecki@gmail.com>
>
> --- a/kernel/hrtimer.c~a
> +++ a/kernel/hrtimer.c
> @@ -562,7 +562,7 @@ static int hrtimer_reprogram(struct hrti
>  /*
>  * Retrigger next event is called after clock was set
>  *
> - * Called with interrupts disabled via on_each_cpu()
> + * Called with interrupts disabled.
>  */
>  static void retrigger_next_event(void *arg)
>  {
> @@ -614,8 +614,12 @@ void clock_was_set(void)
>  */
>  void hres_timers_resume(void)
>  {
> +       unsigned long flags;
> +
>        /* Retrigger the CPU local events: */
> +       local_irq_save(flags);
>        retrigger_next_event(NULL);
> +       local_irq_restore(flags);
>  }
>
>  /*
> _
>
>



-- 
Maciej Rutecki
http://www.maciek.unixy.pl

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-17  8:58 [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible Maciej Rutecki
  2009-01-25  2:37 ` Andrew Morton
@ 2009-01-26 16:07 ` Linus Torvalds
  2009-01-26 16:53   ` Rafael J. Wysocki
  1 sibling, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2009-01-26 16:07 UTC (permalink / raw)
  To: Maciej Rutecki
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner



On Sat, 17 Jan 2009, Maciej Rutecki wrote:
>
> During suspend to ram:
> [  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
> code: suspend_to_ram./2958
> [  131.287012] caller is retrigger_next_event+0x13/0xb0
> [  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
> [  131.287012] Call Trace:
> [  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
> [  131.287012]  [<c01473b3>] retrigger_next_event+0x13/0xb0
> [  131.287012]  [<c01489b7>] raw_notifier_call_chain+0x17/0x20
> [  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
> [  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
> [  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
> [  131.287012]  [<c02d2478>] device_power_up+0x8/0x10

Very scary.

device_power_up() calls sysdev_resume _before_ it enables interrupts so it 
sounds like something else has - very incorrectly - enabled interrupts too 
early in your resume sequence. 

The patch that Andrew sent out and that apparently fixed things for you 
should absolutely not have made any difference. This is suspend_enter():

	        arch_suspend_disable_irqs();
	        BUG_ON(!irqs_disabled());

	        if ((error = device_power_down(PMSG_SUSPEND))) {
	                printk(KERN_ERR "PM: Some devices failed to power down\n");
	                goto Done;
	        }
	
	        if (!suspend_test(TEST_CORE))
	                error = suspend_ops->enter(state);

	        device_power_up(PMSG_RESUME);
	 Done:
	        arch_suspend_enable_irqs();

and notice how the whole thing is surrounded by that 
arch_suspend_disable/enable_irqs().

So it looks like some sysdev driver (device_power_up does the sysdev 
drivers first, so it can't be the regular low-level PCI drivers) is 
enabling interrupts in its resume function. Scary and very wrong.

It could easily be ACPI, of course. There was some other case where ACPI 
did that, iirc.

		Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-26 16:07 ` Linus Torvalds
@ 2009-01-26 16:53   ` Rafael J. Wysocki
  2009-01-26 17:41     ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2009-01-26 16:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner

On Monday 26 January 2009, Linus Torvalds wrote:
> 
> On Sat, 17 Jan 2009, Maciej Rutecki wrote:
> >
> > During suspend to ram:
> > [  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
> > code: suspend_to_ram./2958
> > [  131.287012] caller is retrigger_next_event+0x13/0xb0
> > [  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
> > [  131.287012] Call Trace:
> > [  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
> > [  131.287012]  [<c01473b3>] retrigger_next_event+0x13/0xb0
> > [  131.287012]  [<c01489b7>] raw_notifier_call_chain+0x17/0x20
> > [  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
> > [  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
> > [  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
> > [  131.287012]  [<c02d2478>] device_power_up+0x8/0x10
> 
> Very scary.
> 
> device_power_up() calls sysdev_resume _before_ it enables interrupts so it 
> sounds like something else has - very incorrectly - enabled interrupts too 
> early in your resume sequence. 
> 
> The patch that Andrew sent out and that apparently fixed things for you 
> should absolutely not have made any difference. This is suspend_enter():
> 
> 	        arch_suspend_disable_irqs();
> 	        BUG_ON(!irqs_disabled());
> 
> 	        if ((error = device_power_down(PMSG_SUSPEND))) {
> 	                printk(KERN_ERR "PM: Some devices failed to power down\n");
> 	                goto Done;
> 	        }
> 	
> 	        if (!suspend_test(TEST_CORE))
> 	                error = suspend_ops->enter(state);
> 
> 	        device_power_up(PMSG_RESUME);
> 	 Done:
> 	        arch_suspend_enable_irqs();
> 
> and notice how the whole thing is surrounded by that 
> arch_suspend_disable/enable_irqs().
> 
> So it looks like some sysdev driver (device_power_up does the sysdev 
> drivers first, so it can't be the regular low-level PCI drivers) is 
> enabling interrupts in its resume function. Scary and very wrong.
> 
> It could easily be ACPI, of course. There was some other case where ACPI 
> did that, iirc.

There is a known bug in the USB controllers' suspend that enables interrupts
from within ->suspend_late().  It should be fixed by the next USB merge
AFAICS.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-26 16:53   ` Rafael J. Wysocki
@ 2009-01-26 17:41     ` Ingo Molnar
  2009-01-26 19:21       ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2009-01-26 17:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, Maciej Rutecki, Linux Kernel Mailing List,
	Andrew Morton, Thomas Gleixner


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Monday 26 January 2009, Linus Torvalds wrote:
> > 
> > On Sat, 17 Jan 2009, Maciej Rutecki wrote:
> > >
> > > During suspend to ram:
> > > [  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
> > > code: suspend_to_ram./2958
> > > [  131.287012] caller is retrigger_next_event+0x13/0xb0
> > > [  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
> > > [  131.287012] Call Trace:
> > > [  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
> > > [  131.287012]  [<c01473b3>] retrigger_next_event+0x13/0xb0
> > > [  131.287012]  [<c01489b7>] raw_notifier_call_chain+0x17/0x20
> > > [  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
> > > [  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
> > > [  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
> > > [  131.287012]  [<c02d2478>] device_power_up+0x8/0x10
> > 
> > Very scary.
> > 
> > device_power_up() calls sysdev_resume _before_ it enables interrupts so it 
> > sounds like something else has - very incorrectly - enabled interrupts too 
> > early in your resume sequence. 
> > 
> > The patch that Andrew sent out and that apparently fixed things for you 
> > should absolutely not have made any difference. This is suspend_enter():
> > 
> > 	        arch_suspend_disable_irqs();
> > 	        BUG_ON(!irqs_disabled());
> > 
> > 	        if ((error = device_power_down(PMSG_SUSPEND))) {
> > 	                printk(KERN_ERR "PM: Some devices failed to power down\n");
> > 	                goto Done;
> > 	        }
> > 	
> > 	        if (!suspend_test(TEST_CORE))
> > 	                error = suspend_ops->enter(state);
> > 
> > 	        device_power_up(PMSG_RESUME);
> > 	 Done:
> > 	        arch_suspend_enable_irqs();
> > 
> > and notice how the whole thing is surrounded by that 
> > arch_suspend_disable/enable_irqs().
> > 
> > So it looks like some sysdev driver (device_power_up does the sysdev 
> > drivers first, so it can't be the regular low-level PCI drivers) is 
> > enabling interrupts in its resume function. Scary and very wrong.
> > 
> > It could easily be ACPI, of course. There was some other case where ACPI 
> > did that, iirc.
> 
> There is a known bug in the USB controllers' suspend that enables interrupts
> from within ->suspend_late().  It should be fixed by the next USB merge
> AFAICS.

the patch from Andrew looks wrong, as it hides the only place in the 
kernel that was able to report the resume bug. Nevertheless related to 
that bug we've got a new debug check queued up in timers/urgent:

 void hres_timers_resume(void)
 {
-       /* Retrigger the CPU local events: */
+       WARN_ONCE(!irqs_disabled(),
+                 KERN_INFO "hres_timers_resume() called with IRQs enabled!");
+
        retrigger_next_event(NULL);
 }

as the buggy 'irqs are enabled' condition was not detected reliably. (it 
was only detected with certain lockdep options turned on - and even then 
it did not seem to be 100% triggerable)

i sent it to Linus earlier today.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-26 17:41     ` Ingo Molnar
@ 2009-01-26 19:21       ` Rafael J. Wysocki
  2009-01-26 20:35         ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2009-01-26 19:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Maciej Rutecki, Linux Kernel Mailing List,
	Andrew Morton, Thomas Gleixner

On Monday 26 January 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > On Monday 26 January 2009, Linus Torvalds wrote:
> > > 
> > > On Sat, 17 Jan 2009, Maciej Rutecki wrote:
> > > >
> > > > During suspend to ram:
> > > > [  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
> > > > code: suspend_to_ram./2958
> > > > [  131.287012] caller is retrigger_next_event+0x13/0xb0
> > > > [  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
> > > > [  131.287012] Call Trace:
> > > > [  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
> > > > [  131.287012]  [<c01473b3>] retrigger_next_event+0x13/0xb0
> > > > [  131.287012]  [<c01489b7>] raw_notifier_call_chain+0x17/0x20
> > > > [  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
> > > > [  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
> > > > [  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
> > > > [  131.287012]  [<c02d2478>] device_power_up+0x8/0x10
> > > 
> > > Very scary.
> > > 
> > > device_power_up() calls sysdev_resume _before_ it enables interrupts so it 
> > > sounds like something else has - very incorrectly - enabled interrupts too 
> > > early in your resume sequence. 
> > > 
> > > The patch that Andrew sent out and that apparently fixed things for you 
> > > should absolutely not have made any difference. This is suspend_enter():
> > > 
> > > 	        arch_suspend_disable_irqs();
> > > 	        BUG_ON(!irqs_disabled());
> > > 
> > > 	        if ((error = device_power_down(PMSG_SUSPEND))) {
> > > 	                printk(KERN_ERR "PM: Some devices failed to power down\n");
> > > 	                goto Done;
> > > 	        }
> > > 	
> > > 	        if (!suspend_test(TEST_CORE))
> > > 	                error = suspend_ops->enter(state);
> > > 
> > > 	        device_power_up(PMSG_RESUME);
> > > 	 Done:
> > > 	        arch_suspend_enable_irqs();
> > > 
> > > and notice how the whole thing is surrounded by that 
> > > arch_suspend_disable/enable_irqs().
> > > 
> > > So it looks like some sysdev driver (device_power_up does the sysdev 
> > > drivers first, so it can't be the regular low-level PCI drivers) is 
> > > enabling interrupts in its resume function. Scary and very wrong.
> > > 
> > > It could easily be ACPI, of course. There was some other case where ACPI 
> > > did that, iirc.
> > 
> > There is a known bug in the USB controllers' suspend that enables interrupts
> > from within ->suspend_late().  It should be fixed by the next USB merge
> > AFAICS.
> 
> the patch from Andrew looks wrong, as it hides the only place in the 
> kernel that was able to report the resume bug. Nevertheless related to 
> that bug we've got a new debug check queued up in timers/urgent:
> 
>  void hres_timers_resume(void)
>  {
> -       /* Retrigger the CPU local events: */
> +       WARN_ONCE(!irqs_disabled(),
> +                 KERN_INFO "hres_timers_resume() called with IRQs enabled!");
> +
>         retrigger_next_event(NULL);
>  }
> 
> as the buggy 'irqs are enabled' condition was not detected reliably. (it 
> was only detected with certain lockdep options turned on - and even then 
> it did not seem to be 100% triggerable)

Yeah.

> i sent it to Linus earlier today.

OK, thanks.

I'll write a debug patch covering that more generally when I recover from the
flu a bit.

Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-26 19:21       ` Rafael J. Wysocki
@ 2009-01-26 20:35         ` Ingo Molnar
  2009-01-26 20:48           ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2009-01-26 20:35 UTC (permalink / raw)
  To: Rafael J. Wysocki, Frédéric Weisbecker, Steven Rostedt
  Cc: Linus Torvalds, Maciej Rutecki, Linux Kernel Mailing List,
	Andrew Morton, Thomas Gleixner


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Monday 26 January 2009, Ingo Molnar wrote:
> > 
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > 
> > > On Monday 26 January 2009, Linus Torvalds wrote:
> > > > 
> > > > On Sat, 17 Jan 2009, Maciej Rutecki wrote:
> > > > >
> > > > > During suspend to ram:
> > > > > [  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
> > > > > code: suspend_to_ram./2958
> > > > > [  131.287012] caller is retrigger_next_event+0x13/0xb0
> > > > > [  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
> > > > > [  131.287012] Call Trace:
> > > > > [  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
> > > > > [  131.287012]  [<c01473b3>] retrigger_next_event+0x13/0xb0
> > > > > [  131.287012]  [<c01489b7>] raw_notifier_call_chain+0x17/0x20
> > > > > [  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
> > > > > [  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
> > > > > [  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
> > > > > [  131.287012]  [<c02d2478>] device_power_up+0x8/0x10
> > > > 
> > > > Very scary.
> > > > 
> > > > device_power_up() calls sysdev_resume _before_ it enables interrupts so it 
> > > > sounds like something else has - very incorrectly - enabled interrupts too 
> > > > early in your resume sequence. 
> > > > 
> > > > The patch that Andrew sent out and that apparently fixed things for you 
> > > > should absolutely not have made any difference. This is suspend_enter():
> > > > 
> > > > 	        arch_suspend_disable_irqs();
> > > > 	        BUG_ON(!irqs_disabled());
> > > > 
> > > > 	        if ((error = device_power_down(PMSG_SUSPEND))) {
> > > > 	                printk(KERN_ERR "PM: Some devices failed to power down\n");
> > > > 	                goto Done;
> > > > 	        }
> > > > 	
> > > > 	        if (!suspend_test(TEST_CORE))
> > > > 	                error = suspend_ops->enter(state);
> > > > 
> > > > 	        device_power_up(PMSG_RESUME);
> > > > 	 Done:
> > > > 	        arch_suspend_enable_irqs();
> > > > 
> > > > and notice how the whole thing is surrounded by that 
> > > > arch_suspend_disable/enable_irqs().
> > > > 
> > > > So it looks like some sysdev driver (device_power_up does the sysdev 
> > > > drivers first, so it can't be the regular low-level PCI drivers) is 
> > > > enabling interrupts in its resume function. Scary and very wrong.
> > > > 
> > > > It could easily be ACPI, of course. There was some other case where ACPI 
> > > > did that, iirc.
> > > 
> > > There is a known bug in the USB controllers' suspend that enables interrupts
> > > from within ->suspend_late().  It should be fixed by the next USB merge
> > > AFAICS.
> > 
> > the patch from Andrew looks wrong, as it hides the only place in the 
> > kernel that was able to report the resume bug. Nevertheless related to 
> > that bug we've got a new debug check queued up in timers/urgent:
> > 
> >  void hres_timers_resume(void)
> >  {
> > -       /* Retrigger the CPU local events: */
> > +       WARN_ONCE(!irqs_disabled(),
> > +                 KERN_INFO "hres_timers_resume() called with IRQs enabled!");
> > +
> >         retrigger_next_event(NULL);
> >  }
> > 
> > as the buggy 'irqs are enabled' condition was not detected reliably. (it 
> > was only detected with certain lockdep options turned on - and even then 
> > it did not seem to be 100% triggerable)
> 
> Yeah.
> 
> > i sent it to Linus earlier today.
> 
> OK, thanks.
> 
> I'll write a debug patch covering that more generally when I recover from the
> flu a bit.
> 
> Rafael

note, we could reuse the ftrace/irqtrace callbacks as well to create an: 
"enforce IRQs off and debug violations of that" facility.

It would work like this, you could mark IRQs as disabled 'permanently':

   force_irqs_off_start();
   ...
   force_irqs_off_end();

you could mark an arbitrarily complex code sequence that way, and ftrace 
would emit a WARN_ONCE() if irqs are enable anytime during that sequence - 
by using the irq-tracking facilities we have for the irqsoff tracer (and 
which we also have for lockdep).

Would that be useful?

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-26 20:35         ` Ingo Molnar
@ 2009-01-26 20:48           ` Rafael J. Wysocki
  2009-01-26 21:35             ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2009-01-26 20:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner

On Monday 26 January 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > On Monday 26 January 2009, Ingo Molnar wrote:
> > > 
> > > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > 
> > > > On Monday 26 January 2009, Linus Torvalds wrote:
> > > > > 
> > > > > On Sat, 17 Jan 2009, Maciej Rutecki wrote:
> > > > > >
> > > > > > During suspend to ram:
> > > > > > [  131.287012] BUG: using smp_processor_id() in preemptible [00000000]
> > > > > > code: suspend_to_ram./2958
> > > > > > [  131.287012] caller is retrigger_next_event+0x13/0xb0
> > > > > > [  131.287012] Pid: 2958, comm: suspend_to_ram. Not tainted 2.6.29-rc2 #1
> > > > > > [  131.287012] Call Trace:
> > > > > > [  131.287012]  [<c025b41f>] debug_smp_processor_id+0xbf/0xd0
> > > > > > [  131.287012]  [<c01473b3>] retrigger_next_event+0x13/0xb0
> > > > > > [  131.287012]  [<c01489b7>] raw_notifier_call_chain+0x17/0x20
> > > > > > [  131.287012]  [<c014b938>] timekeeping_resume+0xe8/0x110
> > > > > > [  131.287012]  [<c02cc651>] __sysdev_resume+0x11/0x50
> > > > > > [  131.287012]  [<c02cc6d7>] sysdev_resume+0x47/0x80
> > > > > > [  131.287012]  [<c02d2478>] device_power_up+0x8/0x10
> > > > > 
> > > > > Very scary.
> > > > > 
> > > > > device_power_up() calls sysdev_resume _before_ it enables interrupts so it 
> > > > > sounds like something else has - very incorrectly - enabled interrupts too 
> > > > > early in your resume sequence. 
> > > > > 
> > > > > The patch that Andrew sent out and that apparently fixed things for you 
> > > > > should absolutely not have made any difference. This is suspend_enter():
> > > > > 
> > > > > 	        arch_suspend_disable_irqs();
> > > > > 	        BUG_ON(!irqs_disabled());
> > > > > 
> > > > > 	        if ((error = device_power_down(PMSG_SUSPEND))) {
> > > > > 	                printk(KERN_ERR "PM: Some devices failed to power down\n");
> > > > > 	                goto Done;
> > > > > 	        }
> > > > > 	
> > > > > 	        if (!suspend_test(TEST_CORE))
> > > > > 	                error = suspend_ops->enter(state);
> > > > > 
> > > > > 	        device_power_up(PMSG_RESUME);
> > > > > 	 Done:
> > > > > 	        arch_suspend_enable_irqs();
> > > > > 
> > > > > and notice how the whole thing is surrounded by that 
> > > > > arch_suspend_disable/enable_irqs().
> > > > > 
> > > > > So it looks like some sysdev driver (device_power_up does the sysdev 
> > > > > drivers first, so it can't be the regular low-level PCI drivers) is 
> > > > > enabling interrupts in its resume function. Scary and very wrong.
> > > > > 
> > > > > It could easily be ACPI, of course. There was some other case where ACPI 
> > > > > did that, iirc.
> > > > 
> > > > There is a known bug in the USB controllers' suspend that enables interrupts
> > > > from within ->suspend_late().  It should be fixed by the next USB merge
> > > > AFAICS.
> > > 
> > > the patch from Andrew looks wrong, as it hides the only place in the 
> > > kernel that was able to report the resume bug. Nevertheless related to 
> > > that bug we've got a new debug check queued up in timers/urgent:
> > > 
> > >  void hres_timers_resume(void)
> > >  {
> > > -       /* Retrigger the CPU local events: */
> > > +       WARN_ONCE(!irqs_disabled(),
> > > +                 KERN_INFO "hres_timers_resume() called with IRQs enabled!");
> > > +
> > >         retrigger_next_event(NULL);
> > >  }
> > > 
> > > as the buggy 'irqs are enabled' condition was not detected reliably. (it 
> > > was only detected with certain lockdep options turned on - and even then 
> > > it did not seem to be 100% triggerable)
> > 
> > Yeah.
> > 
> > > i sent it to Linus earlier today.
> > 
> > OK, thanks.
> > 
> > I'll write a debug patch covering that more generally when I recover from the
> > flu a bit.
> > 
> > Rafael
> 
> note, we could reuse the ftrace/irqtrace callbacks as well to create an: 
> "enforce IRQs off and debug violations of that" facility.

Well, do we have to? ;-)

> It would work like this, you could mark IRQs as disabled 'permanently':
> 
>    force_irqs_off_start();
>    ...
>    force_irqs_off_end();
> 
> you could mark an arbitrarily complex code sequence that way, and ftrace 
> would emit a WARN_ONCE() if irqs are enable anytime during that sequence - 
> by using the irq-tracking facilities we have for the irqsoff tracer (and 
> which we also have for lockdep).
> 
> Would that be useful?

Not sure, I only know a little about ftrace, I really can't judge.

Anyway, I think that putting the checks directly into the code path in question
would be more reliable and would still work without ftrace.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-26 20:48           ` Rafael J. Wysocki
@ 2009-01-26 21:35             ` Ingo Molnar
  2009-01-27 15:28               ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2009-01-26 21:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Monday 26 January 2009, Ingo Molnar wrote:
[...]
> > It would work like this, you could mark IRQs as disabled 'permanently':
> > 
> >    force_irqs_off_start();
> >    ...
> >    force_irqs_off_end();
> > 
> > you could mark an arbitrarily complex code sequence that way, and ftrace 
> > would emit a WARN_ONCE() if irqs are enable anytime during that sequence - 
> > by using the irq-tracking facilities we have for the irqsoff tracer (and 
> > which we also have for lockdep).
> > 
> > Would that be useful?
> 
> Not sure, I only know a little about ftrace, I really can't judge.

The instrumentation is really simple, see kernel/tracing/trace_irqsoff.c:

we call this function if hardirqs are disabled anywhere in the kernel:

 static inline void
 start_critical_timing(unsigned long ip, unsigned long parent_ip)

and we call this function if hardirqs are enabled anywhere in the kernel:

 static inline void
 stop_critical_timing(unsigned long ip, unsigned long parent_ip)

that's all. You need a single line check in stop_critical_timing(), 
something like this:

   WARN_ON_ONCE(per_cpu(hardirqs_forced_off, this_cpu));

> Anyway, I think that putting the checks directly into the code path in 
> question would be more reliable and would still work without ftrace.

More reliable than a WARN() triggering right at the buggy place that 
erroneously enables IRQs? Regardless of how obscurely it's done - whether 
it's a side effect of something, etc. etc.?

With such a generic facility you'd not have to put in any explicit checks 
anywhere _at all_.

In fact whatever check you put in it's _always_ going to be fundamentally 
more fragile than direct instrumentation: you cannot possibly check all 
possible places that enable interrupts. (they could be disabling 
interrupts as a _restore_irqs() sequence for example)

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-26 21:35             ` Ingo Molnar
@ 2009-01-27 15:28               ` Rafael J. Wysocki
  2009-01-27 15:49                 ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2009-01-27 15:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner

On Monday 26 January 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > On Monday 26 January 2009, Ingo Molnar wrote:
> [...]
> > > It would work like this, you could mark IRQs as disabled 'permanently':
> > > 
> > >    force_irqs_off_start();
> > >    ...
> > >    force_irqs_off_end();
> > > 
> > > you could mark an arbitrarily complex code sequence that way, and ftrace 
> > > would emit a WARN_ONCE() if irqs are enable anytime during that sequence - 
> > > by using the irq-tracking facilities we have for the irqsoff tracer (and 
> > > which we also have for lockdep).
> > > 
> > > Would that be useful?
> > 
> > Not sure, I only know a little about ftrace, I really can't judge.
> 
> The instrumentation is really simple, see kernel/tracing/trace_irqsoff.c:
> 
> we call this function if hardirqs are disabled anywhere in the kernel:
> 
>  static inline void
>  start_critical_timing(unsigned long ip, unsigned long parent_ip)
> 
> and we call this function if hardirqs are enabled anywhere in the kernel:
> 
>  static inline void
>  stop_critical_timing(unsigned long ip, unsigned long parent_ip)
> 
> that's all. You need a single line check in stop_critical_timing(), 
> something like this:
> 
>    WARN_ON_ONCE(per_cpu(hardirqs_forced_off, this_cpu));
>
> > Anyway, I think that putting the checks directly into the code path in 
> > question would be more reliable and would still work without ftrace.
> 
> More reliable than a WARN() triggering right at the buggy place that 
> erroneously enables IRQs? Regardless of how obscurely it's done - whether 
> it's a side effect of something, etc. etc.?

Actually, yes.
 
> With such a generic facility you'd not have to put in any explicit checks 
> anywhere _at all_.

That's under the assumption that the traces we get are always useful.

> In fact whatever check you put in it's _always_ going to be fundamentally 
> more fragile than direct instrumentation: you cannot possibly check all 
> possible places that enable interrupts. (they could be disabling 
> interrupts as a _restore_irqs() sequence for example)

In this particular case, I'm not really interested in that.  What I'm
interested in is which driver's ->suspend_late() or ->resume_early() (or the
equivalents for sysdevs) has enabled interrupts, which is quite easy to
check directly.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-27 15:28               ` Rafael J. Wysocki
@ 2009-01-27 15:49                 ` Ingo Molnar
  2009-01-27 21:18                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2009-01-27 15:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> > In fact whatever check you put in it's _always_ going to be 
> > fundamentally more fragile than direct instrumentation: you cannot 
> > possibly check all possible places that enable interrupts. (they could 
> > be disabling interrupts as a _restore_irqs() sequence for example)
> 
> In this particular case, I'm not really interested in that.  What I'm 
> interested in is which driver's ->suspend_late() or ->resume_early() (or 
> the equivalents for sysdevs) has enabled interrupts, which is quite easy 
> to check directly.

But this is exactly what it does - without any need for debug checks 
spread around!

You'll get a _full stack dump_ from the very driver that is enabling 
interrupts! You dont get a trace - you get a stack dump of the very place 
that is buggy. It does not get any better than that.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-27 15:49                 ` Ingo Molnar
@ 2009-01-27 21:18                   ` Rafael J. Wysocki
  2009-01-29 15:07                     ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2009-01-27 21:18 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner

On Tuesday 27 January 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > > In fact whatever check you put in it's _always_ going to be 
> > > fundamentally more fragile than direct instrumentation: you cannot 
> > > possibly check all possible places that enable interrupts. (they could 
> > > be disabling interrupts as a _restore_irqs() sequence for example)
> > 
> > In this particular case, I'm not really interested in that.  What I'm 
> > interested in is which driver's ->suspend_late() or ->resume_early() (or 
> > the equivalents for sysdevs) has enabled interrupts, which is quite easy 
> > to check directly.
> 
> But this is exactly what it does - without any need for debug checks 
> spread around!
> 
> You'll get a _full stack dump_ from the very driver that is enabling 
> interrupts! You dont get a trace - you get a stack dump of the very place 
> that is buggy. It does not get any better than that.

I'm not going to argue.

Nevertheless, IMO something like the patch below should be sufficient to catch
these bugs.

Thanks,
Rafael


---
 drivers/base/power/main.c |   12 ++++++++++++
 drivers/base/sys.c        |   21 ++++++++++++++++-----
 include/linux/pm.h        |   18 ++++++++++++++++++
 3 files changed, 46 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -186,12 +186,14 @@ static int pm_noirq_op(struct device *de
 		if (ops->suspend_noirq) {
 			error = ops->suspend_noirq(dev);
 			suspend_report_result(ops->suspend_noirq, error);
+			pm_check_irqs_after(ops->suspend_noirq);
 		}
 		break;
 	case PM_EVENT_RESUME:
 		if (ops->resume_noirq) {
 			error = ops->resume_noirq(dev);
 			suspend_report_result(ops->resume_noirq, error);
+			pm_check_irqs_after(ops->resume_noirq);
 		}
 		break;
 #endif /* CONFIG_SUSPEND */
@@ -201,12 +203,14 @@ static int pm_noirq_op(struct device *de
 		if (ops->freeze_noirq) {
 			error = ops->freeze_noirq(dev);
 			suspend_report_result(ops->freeze_noirq, error);
+			pm_check_irqs_after(ops->freeze_noirq);
 		}
 		break;
 	case PM_EVENT_HIBERNATE:
 		if (ops->poweroff_noirq) {
 			error = ops->poweroff_noirq(dev);
 			suspend_report_result(ops->poweroff_noirq, error);
+			pm_check_irqs_after(ops->poweroff_noirq);
 		}
 		break;
 	case PM_EVENT_THAW:
@@ -214,12 +218,14 @@ static int pm_noirq_op(struct device *de
 		if (ops->thaw_noirq) {
 			error = ops->thaw_noirq(dev);
 			suspend_report_result(ops->thaw_noirq, error);
+			pm_check_irqs_after(ops->thaw_noirq);
 		}
 		break;
 	case PM_EVENT_RESTORE:
 		if (ops->restore_noirq) {
 			error = ops->restore_noirq(dev);
 			suspend_report_result(ops->restore_noirq, error);
+			pm_check_irqs_after(ops->restore_noirq);
 		}
 		break;
 #endif /* CONFIG_HIBERNATION */
@@ -292,6 +298,7 @@ static int resume_device_noirq(struct de
 	} else if (dev->bus->resume_early) {
 		pm_dev_dbg(dev, state, "legacy EARLY ");
 		error = dev->bus->resume_early(dev);
+		pm_check_irqs_after(dev->bus->resume_early);
 	}
  End:
 	TRACE_RESUME(error);
@@ -311,6 +318,8 @@ static void dpm_power_up(pm_message_t st
 {
 	struct device *dev;
 
+	pm_check_irqs();
+
 	list_for_each_entry(dev, &dpm_list, power.entry)
 		if (dev->power.status > DPM_OFF) {
 			int error;
@@ -551,6 +560,7 @@ static int suspend_device_noirq(struct d
 		pm_dev_dbg(dev, state, "legacy LATE ");
 		error = dev->bus->suspend_late(dev, state);
 		suspend_report_result(dev->bus->suspend_late, error);
+		pm_check_irqs_after(dev->bus->suspend_late);
 	}
 	return error;
 }
@@ -569,6 +579,8 @@ int device_power_down(pm_message_t state
 	struct device *dev;
 	int error = 0;
 
+	pm_check_irqs();
+
 	list_for_each_entry_reverse(dev, &dpm_list, power.entry) {
 		error = suspend_device_noirq(dev, state);
 		if (error) {
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include/linux/pm.h
+++ linux-2.6/include/linux/pm.h
@@ -396,6 +396,20 @@ extern void __suspend_report_result(cons
 		__suspend_report_result(__func__, fn, ret);		\
 	} while (0)
 
+#define pm_check_irqs()							\
+	do {								\
+		WARN_ONCE(!irqs_disabled(),				\
+			"PM: Interrupts enabled in %s\n",		\
+			__FUNCTION__);					\
+	} while (0)
+
+#define pm_check_irqs_after(callback)					\
+	do {								\
+		WARN_ONCE(!irqs_disabled(),				\
+			"PM: Interrupts enabled by %pF\n",		\
+			(callback));					\
+	} while (0)
+
 #else /* !CONFIG_PM_SLEEP */
 
 static inline int device_suspend(pm_message_t state)
@@ -405,6 +419,10 @@ static inline int device_suspend(pm_mess
 
 #define suspend_report_result(fn, ret)		do {} while (0)
 
+#define pm_check_irqs()				do {} while (0)
+
+#define pm_check_irqs_after(callback)		do {} while (0)
+
 #endif /* !CONFIG_PM_SLEEP */
 
 /*
Index: linux-2.6/drivers/base/sys.c
===================================================================
--- linux-2.6.orig/drivers/base/sys.c
+++ linux-2.6/drivers/base/sys.c
@@ -341,14 +341,17 @@ static void __sysdev_resume(struct sys_d
 	struct sysdev_driver *drv;
 
 	/* First, call the class-specific one */
-	if (cls->resume)
+	if (cls->resume) {
 		cls->resume(dev);
+		pm_check_irqs_after(cls->resume);
+	}
 
 	/* Call auxillary drivers next. */
-	list_for_each_entry(drv, &cls->drivers, entry) {
-		if (drv->resume)
+	list_for_each_entry(drv, &cls->drivers, entry)
+		if (drv->resume) {
 			drv->resume(dev);
-	}
+			pm_check_irqs_after(drv->resume);
+		}
 }
 
 /**
@@ -373,6 +376,8 @@ int sysdev_suspend(pm_message_t state)
 
 	pr_debug("Suspending System Devices\n");
 
+	pm_check_irqs();
+
 	list_for_each_entry_reverse(cls, &system_kset->list, kset.kobj.entry) {
 		pr_debug("Suspending type '%s':\n",
 			 kobject_name(&cls->kset.kobj));
@@ -384,6 +389,7 @@ int sysdev_suspend(pm_message_t state)
 			list_for_each_entry(drv, &cls->drivers, entry) {
 				if (drv->suspend) {
 					ret = drv->suspend(sysdev, state);
+					pm_check_irqs_after(drv->suspend);
 					if (ret)
 						goto aux_driver;
 				}
@@ -392,6 +398,7 @@ int sysdev_suspend(pm_message_t state)
 			/* Now call the generic one */
 			if (cls->suspend) {
 				ret = cls->suspend(sysdev, state);
+				pm_check_irqs_after(cls->suspend);
 				if (ret)
 					goto cls_driver;
 			}
@@ -411,8 +418,10 @@ aux_driver:
 	list_for_each_entry(err_drv, &cls->drivers, entry) {
 		if (err_drv == drv)
 			break;
-		if (err_drv->resume)
+		if (err_drv->resume) {
 			err_drv->resume(sysdev);
+			pm_check_irqs_after(err_drv->resume);
+		}
 	}
 
 	/* resume other sysdevs in current class */
@@ -449,6 +458,8 @@ int sysdev_resume(void)
 
 	pr_debug("Resuming System Devices\n");
 
+	pm_check_irqs();
+
 	list_for_each_entry(cls, &system_kset->list, kset.kobj.entry) {
 		struct sys_device * sysdev;
 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-27 21:18                   ` Rafael J. Wysocki
@ 2009-01-29 15:07                     ` Ingo Molnar
  2009-01-29 22:29                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2009-01-29 15:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Tuesday 27 January 2009, Ingo Molnar wrote:
> > 
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > 
> > > > In fact whatever check you put in it's _always_ going to be 
> > > > fundamentally more fragile than direct instrumentation: you cannot 
> > > > possibly check all possible places that enable interrupts. (they could 
> > > > be disabling interrupts as a _restore_irqs() sequence for example)
> > > 
> > > In this particular case, I'm not really interested in that.  What I'm 
> > > interested in is which driver's ->suspend_late() or ->resume_early() (or 
> > > the equivalents for sysdevs) has enabled interrupts, which is quite easy 
> > > to check directly.
> > 
> > But this is exactly what it does - without any need for debug checks 
> > spread around!
> > 
> > You'll get a _full stack dump_ from the very driver that is enabling 
> > interrupts! You dont get a trace - you get a stack dump of the very place 
> > that is buggy. It does not get any better than that.
> 
> I'm not going to argue.
> 
> Nevertheless, IMO something like the patch below should be sufficient to catch
> these bugs.
> 
> Thanks,
> Rafael
> 
> 
> ---
>  drivers/base/power/main.c |   12 ++++++++++++
>  drivers/base/sys.c        |   21 ++++++++++++++++-----
>  include/linux/pm.h        |   18 ++++++++++++++++++
>  3 files changed, 46 insertions(+), 5 deletions(-)

hm, so now you sprinkle debug checks all around the code, instead of 
putting in a single pair of:

    force_irqs_off_start();
    ...
    force_irqs_off_end();

which would catch everything that your checks would catch - and it would 
catch more. In what way is your approach better?

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-29 15:07                     ` Ingo Molnar
@ 2009-01-29 22:29                       ` Rafael J. Wysocki
  2009-01-30 14:06                         ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2009-01-29 22:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner

On Thursday 29 January 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > On Tuesday 27 January 2009, Ingo Molnar wrote:
> > > 
> > > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > 
> > > > > In fact whatever check you put in it's _always_ going to be 
> > > > > fundamentally more fragile than direct instrumentation: you cannot 
> > > > > possibly check all possible places that enable interrupts. (they could 
> > > > > be disabling interrupts as a _restore_irqs() sequence for example)
> > > > 
> > > > In this particular case, I'm not really interested in that.  What I'm 
> > > > interested in is which driver's ->suspend_late() or ->resume_early() (or 
> > > > the equivalents for sysdevs) has enabled interrupts, which is quite easy 
> > > > to check directly.
> > > 
> > > But this is exactly what it does - without any need for debug checks 
> > > spread around!
> > > 
> > > You'll get a _full stack dump_ from the very driver that is enabling 
> > > interrupts! You dont get a trace - you get a stack dump of the very place 
> > > that is buggy. It does not get any better than that.
> > 
> > I'm not going to argue.
> > 
> > Nevertheless, IMO something like the patch below should be sufficient to catch
> > these bugs.
> > 
> > Thanks,
> > Rafael
> > 
> > 
> > ---
> >  drivers/base/power/main.c |   12 ++++++++++++
> >  drivers/base/sys.c        |   21 ++++++++++++++++-----
> >  include/linux/pm.h        |   18 ++++++++++++++++++
> >  3 files changed, 46 insertions(+), 5 deletions(-)
> 
> hm, so now you sprinkle debug checks all around the code, instead of 
> putting in a single pair of:
> 
>     force_irqs_off_start();
>     ...
>     force_irqs_off_end();

And what debug options exactly would that require to be set to work?

> which would catch everything that your checks would catch - and it would 
> catch more.

Except that the checks trigger in specific places, so if a check triggers you
know precisely where the bug happened regardless of what garbage is in the call
trace.

> In what way is your approach better?

That depends on the answer to my question above.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-29 22:29                       ` Rafael J. Wysocki
@ 2009-01-30 14:06                         ` Ingo Molnar
  2009-01-30 21:30                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2009-01-30 14:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner

* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Thursday 29 January 2009, Ingo Molnar wrote:
> > 
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > 
> > > On Tuesday 27 January 2009, Ingo Molnar wrote:
> > > > 
> > > > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > > 
> > > > > > In fact whatever check you put in it's _always_ going to be 
> > > > > > fundamentally more fragile than direct instrumentation: you cannot 
> > > > > > possibly check all possible places that enable interrupts. (they could 
> > > > > > be disabling interrupts as a _restore_irqs() sequence for example)
> > > > > 
> > > > > In this particular case, I'm not really interested in that.  What I'm 
> > > > > interested in is which driver's ->suspend_late() or ->resume_early() (or 
> > > > > the equivalents for sysdevs) has enabled interrupts, which is quite easy 
> > > > > to check directly.
> > > > 
> > > > But this is exactly what it does - without any need for debug checks 
> > > > spread around!
> > > > 
> > > > You'll get a _full stack dump_ from the very driver that is enabling 
> > > > interrupts! You dont get a trace - you get a stack dump of the very place 
> > > > that is buggy. It does not get any better than that.
> > > 
> > > I'm not going to argue.
> > > 
> > > Nevertheless, IMO something like the patch below should be sufficient to catch
> > > these bugs.
> > > 
> > > Thanks,
> > > Rafael
> > > 
> > > 
> > > ---
> > >  drivers/base/power/main.c |   12 ++++++++++++
> > >  drivers/base/sys.c        |   21 ++++++++++++++++-----
> > >  include/linux/pm.h        |   18 ++++++++++++++++++
> > >  3 files changed, 46 insertions(+), 5 deletions(-)
> > 
> > hm, so now you sprinkle debug checks all around the code, instead of 
> > putting in a single pair of:
> > 
> >     force_irqs_off_start();
> >     ...
> >     force_irqs_off_end();
> 
> And what debug options exactly would that require to be set to work?

hm, if you worry about that aspect: we could make it seemlessly enabled if 
PM_DEBUG is enabled.

> > which would catch everything that your checks would catch - and it 
> > would catch more.
> 
> Except that the checks trigger in specific places, so if a check 
> triggers you know precisely where the bug happened regardless of what 
> garbage is in the call trace.

This argument is 100% mystery to me. Do you really not see the quality 
difference between a stack trace generated _right at the buggy piece of 
code_ and a warning later on that might (or might not) trigger?

Especially considering that your approach wont catch such bugs:

   ...
   spin_unlock_irq();
   ...
   spin_lock_irq();
   ...

Or such bugs:

   local_irq_enable();
   ...
   local_irq_disable();

Or such bugs:

   spin_lock_irq_save(&lock1, flags);
   ...
           spin_lock_irqsave(&lock2, flags);
           ...
           spin_unlock_irq(&lock2);          /* accidental bug */
   ...
   spin_unlock_irq_restore(&lock1, flags);

Such types of bugs might be especially hard to find in practice, if the 
window where irqs are enabled is small. There is no guarantee at all that 
accidental irq enabling survives a critical section - it can be 
re-disabled in the normal flow of things very easily.

And even if we are lucky and if the irqs stay enabled by the time the 
callback returns, what if your warning flags some big and complex driver, 
one line of which is buggy?

If you had the choice, what would you prefer - a stack dump done at the 
point of incident (pinpointing the driver, the subsystem and the buggy 
function with its full callframe), or your "this driver is buggy" generic 
warning with no specificity about where that bug might be?

Stacktraces _at the point of incident_, and a _guaranteed_ facility that 
_enforces_ that irqs are off during the whole resume cycle are just about 
the highest quality debug info and debug protection we can get in such 
situations.

I really dont understand your points about this.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible
  2009-01-30 14:06                         ` Ingo Molnar
@ 2009-01-30 21:30                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2009-01-30 21:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frédéric Weisbecker, Steven Rostedt, Linus Torvalds,
	Maciej Rutecki, Linux Kernel Mailing List, Andrew Morton,
	Thomas Gleixner

On Friday 30 January 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > On Thursday 29 January 2009, Ingo Molnar wrote:
> > > 
> > > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > 
> > > > On Tuesday 27 January 2009, Ingo Molnar wrote:
> > > > > 
> > > > > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > > > 
> > > > > > > In fact whatever check you put in it's _always_ going to be 
> > > > > > > fundamentally more fragile than direct instrumentation: you cannot 
> > > > > > > possibly check all possible places that enable interrupts. (they could 
> > > > > > > be disabling interrupts as a _restore_irqs() sequence for example)
> > > > > > 
> > > > > > In this particular case, I'm not really interested in that.  What I'm 
> > > > > > interested in is which driver's ->suspend_late() or ->resume_early() (or 
> > > > > > the equivalents for sysdevs) has enabled interrupts, which is quite easy 
> > > > > > to check directly.
> > > > > 
> > > > > But this is exactly what it does - without any need for debug checks 
> > > > > spread around!
> > > > > 
> > > > > You'll get a _full stack dump_ from the very driver that is enabling 
> > > > > interrupts! You dont get a trace - you get a stack dump of the very place 
> > > > > that is buggy. It does not get any better than that.
> > > > 
> > > > I'm not going to argue.
> > > > 
> > > > Nevertheless, IMO something like the patch below should be sufficient to catch
> > > > these bugs.
> > > > 
> > > > Thanks,
> > > > Rafael
> > > > 
> > > > 
> > > > ---
> > > >  drivers/base/power/main.c |   12 ++++++++++++
> > > >  drivers/base/sys.c        |   21 ++++++++++++++++-----
> > > >  include/linux/pm.h        |   18 ++++++++++++++++++
> > > >  3 files changed, 46 insertions(+), 5 deletions(-)
> > > 
> > > hm, so now you sprinkle debug checks all around the code, instead of 
> > > putting in a single pair of:
> > > 
> > >     force_irqs_off_start();
> > >     ...
> > >     force_irqs_off_end();
> > 
> > And what debug options exactly would that require to be set to work?
> 
> hm, if you worry about that aspect: we could make it seemlessly enabled if 
> PM_DEBUG is enabled.

That would be useful, but OTOH I'd rather not like PM_DEBUG to select multiple
tracing options.  Perhaps it's better to add PM_CHECK_IRQS or something similar
and make that depend on PM_DEBUG and whatever else is necessary.

> > > which would catch everything that your checks would catch - and it 
> > > would catch more.
> > 
> > Except that the checks trigger in specific places, so if a check 
> > triggers you know precisely where the bug happened regardless of what 
> > garbage is in the call trace.
> 
> This argument is 100% mystery to me. Do you really not see the quality 
> difference between a stack trace generated _right at the buggy piece of 
> code_ and a warning later on that might (or might not) trigger?
> 
> Especially considering that your approach wont catch such bugs:
> 
>    ...
>    spin_unlock_irq();
>    ...
>    spin_lock_irq();
>    ...
> 
> Or such bugs:
> 
>    local_irq_enable();
>    ...
>    local_irq_disable();
> 
> Or such bugs:
> 
>    spin_lock_irq_save(&lock1, flags);
>    ...
>            spin_lock_irqsave(&lock2, flags);
>            ...
>            spin_unlock_irq(&lock2);          /* accidental bug */
>    ...
>    spin_unlock_irq_restore(&lock1, flags);

I didn't think about that.

I see a value of having this kind of things trigger a warning, but also I see
a value of having some checks in the code, independent of any extra debug
options.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2009-01-30 21:31 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-17  8:58 [Linux 2.6.29-rc2] BUG: using smp_processor_id() in preemptible Maciej Rutecki
2009-01-25  2:37 ` Andrew Morton
2009-01-25 11:57   ` Thomas Gleixner
2009-01-25 14:19   ` Maciej Rutecki
2009-01-25 16:30   ` Maciej Rutecki
2009-01-26 16:07 ` Linus Torvalds
2009-01-26 16:53   ` Rafael J. Wysocki
2009-01-26 17:41     ` Ingo Molnar
2009-01-26 19:21       ` Rafael J. Wysocki
2009-01-26 20:35         ` Ingo Molnar
2009-01-26 20:48           ` Rafael J. Wysocki
2009-01-26 21:35             ` Ingo Molnar
2009-01-27 15:28               ` Rafael J. Wysocki
2009-01-27 15:49                 ` Ingo Molnar
2009-01-27 21:18                   ` Rafael J. Wysocki
2009-01-29 15:07                     ` Ingo Molnar
2009-01-29 22:29                       ` Rafael J. Wysocki
2009-01-30 14:06                         ` Ingo Molnar
2009-01-30 21:30                           ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox