public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* s2r badness
@ 2009-07-18 20:03 Peter Zijlstra
  2009-07-18 20:33 ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2009-07-18 20:03 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Gleixner, Len Brown, Greg Kroah-Hartman; +Cc: lkml

Hi,

s2r seems to resume again on my latest kernel build (used to be broken
for a long-ish while).

However it does generate the blow splats.

[   74.576953] ------------[ cut here ]------------
[   74.576953] WARNING: at /usr/src/linux-2.6/kernel/softirq.c:143 local_bh_enable_ip+0x8b/0xc0()
[   74.576953] Hardware name: 2242CTO
[   74.576953] Modules linked in: cryptd aes_x86_64 aes_generic ecryptfs
nfs lockd nfs_acl auth_rpcgss sunrpc binfmt_misc fbcon tileblit font
bitblit softcursor bridge stp llc bnep autofs4 acpi_cpufreq joydev btusb
coretemp sbp2 snd_hda_codec_conexant pcmcia ecb psmouse serio_raw
iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec snd_pcm_oss
pcspkr snd_mixer_oss iwlagn iwlcore snd_pcm mac80211 snd_seq_dummy
ricoh_mmc sdhci_pci sdhci snd_seq_oss snd_seq_midi snd_rawmidi
snd_seq_midi_event yenta_socket rsrc_nonstatic pcmcia_core cfg80211
snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc i915
intel_agp thinkpad_acpi led_class video nvram ohci1394 ieee1394 ehci_hcd
uhci_hcd fuse
[   74.576953] Pid: 5406, comm: bash Not tainted 2.6.31-rc3-tip #32
[   74.576953] Call Trace:
[   74.576953]  [<ffffffff8107401b>] ? local_bh_enable_ip+0x8b/0xc0
[   74.576953]  [<ffffffff8106c838>] warn_slowpath_common+0x78/0xd0
[   74.576953]  [<ffffffff8106c89f>] warn_slowpath_null+0xf/0x20
[   74.576953]  [<ffffffff8107401b>] local_bh_enable_ip+0x8b/0xc0
[   74.576953]  [<ffffffff814d08d9>] _spin_unlock_bh+0x19/0x20
[   74.576953]  [<ffffffff810d2c77>] load_debug_registers+0x67/0x80
[   74.576953]  [<ffffffff813cbcdb>] restore_processor_state+0x22b/0x280
[   74.576953]  [<ffffffff81286643>] acpi_suspend_enter+0x6a/0xeb
[   74.576953]  [<ffffffff810a217f>] suspend_devices_and_enter+0x22f/0x240
[   74.576953]  [<ffffffff810a22b9>] enter_state+0x129/0x150
[   74.576953]  [<ffffffff810a17da>] state_store+0xba/0x100
[   74.576953]  [<ffffffff81248a17>] kobj_attr_store+0x17/0x20
[   74.576953]  [<ffffffff81176155>] sysfs_write_file+0xc5/0x140
[   74.576953]  [<ffffffff8111402b>] vfs_write+0xcb/0x1a0
[   74.576953]  [<ffffffff811141f0>] sys_write+0x50/0x90
[   74.576953]  [<ffffffff8102e242>] system_call_fastpath+0x16/0x1b
[   74.576953] ---[ end trace 58ad491f88b81455 ]---


[   75.906997] Call Trace:
[   75.907001]  <IRQ>  [<ffffffff810b63c6>] __report_bad_irq+0x26/0xa0
[   75.907022]  [<ffffffff810b65c8>] note_interrupt+0x188/0x1d0
[   75.907029]  [<ffffffff810b6c85>] handle_fasteoi_irq+0xd5/0x100
[   75.907038]  [<ffffffff81030b7f>] handle_irq+0x1f/0x30
[   75.907043]  [<ffffffff8103029e>] do_IRQ+0x6e/0xf0
[   75.907052]  [<ffffffff8102ec53>] ret_from_intr+0x0/0x11
[   75.907055]  <EOI>  [<ffffffff81096130>] ? tick_nohz_stop_sched_tick+0x70/0x3a0
[   75.907072]  [<ffffffff8102ca0a>] ? cpu_idle+0x2a/0xd0
[   75.907080]  [<ffffffff814ba485>] ? rest_init+0x75/0x80
[   75.907090]  [<ffffffff817aed55>] ? start_kernel+0x37a/0x42a
[   75.907098]  [<ffffffff817ae299>] ? x86_64_start_reservations+0x99/0xb9
[   75.907106]  [<ffffffff817ae3be>] ? x86_64_start_kernel+0x105/0x120
[   75.907113]  [<ffffffff817ae140>] ? early_idt_handler+0x0/0x71
[   75.907117] handlers:
[   75.907121] [<ffffffff8137fe10>] (usb_hcd_irq+0x0/0x90)
[   75.907130] Disabling IRQ #18



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: s2r badness
  2009-07-18 20:03 s2r badness Peter Zijlstra
@ 2009-07-18 20:33 ` Thomas Gleixner
  2009-07-20 18:20   ` K.Prasad
  2009-07-20 21:00   ` Steven Rostedt
  0 siblings, 2 replies; 8+ messages in thread
From: Thomas Gleixner @ 2009-07-18 20:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman, lkml,
	Steven Rostedt

On Sat, 18 Jul 2009, Peter Zijlstra wrote:

> Hi,
> 
> s2r seems to resume again on my latest kernel build (used to be broken
> for a long-ish while).
> 
> However it does generate the blow splats.
> 
> [   74.576953] ------------[ cut here ]------------
> [   74.576953] WARNING: at /usr/src/linux-2.6/kernel/softirq.c:143 local_bh_enable_ip+0x8b/0xc0()
> [   74.576953] Hardware name: 2242CTO
> [   74.576953] Modules linked in: cryptd aes_x86_64 aes_generic ecryptfs
> nfs lockd nfs_acl auth_rpcgss sunrpc binfmt_misc fbcon tileblit font
> bitblit softcursor bridge stp llc bnep autofs4 acpi_cpufreq joydev btusb
> coretemp sbp2 snd_hda_codec_conexant pcmcia ecb psmouse serio_raw
> iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec snd_pcm_oss
> pcspkr snd_mixer_oss iwlagn iwlcore snd_pcm mac80211 snd_seq_dummy
> ricoh_mmc sdhci_pci sdhci snd_seq_oss snd_seq_midi snd_rawmidi
> snd_seq_midi_event yenta_socket rsrc_nonstatic pcmcia_core cfg80211
> snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc i915
> intel_agp thinkpad_acpi led_class video nvram ohci1394 ieee1394 ehci_hcd
> uhci_hcd fuse
> [   74.576953] Pid: 5406, comm: bash Not tainted 2.6.31-rc3-tip #32
> [   74.576953] Call Trace:
> [   74.576953]  [<ffffffff8107401b>] ? local_bh_enable_ip+0x8b/0xc0
> [   74.576953]  [<ffffffff8106c838>] warn_slowpath_common+0x78/0xd0
> [   74.576953]  [<ffffffff8106c89f>] warn_slowpath_null+0xf/0x20
> [   74.576953]  [<ffffffff8107401b>] local_bh_enable_ip+0x8b/0xc0
> [   74.576953]  [<ffffffff814d08d9>] _spin_unlock_bh+0x19/0x20
> [   74.576953]  [<ffffffff810d2c77>] load_debug_registers+0x67/0x80

That's the hw breakpoint support stuff which does the spin_unlock_bh
at a point where interrupts are disabled. So that's a -tip tree issue,
not mainline.

> [   74.576953]  [<ffffffff813cbcdb>] restore_processor_state+0x22b/0x280
> [   74.576953]  [<ffffffff81286643>] acpi_suspend_enter+0x6a/0xeb
> [   74.576953]  [<ffffffff810a217f>] suspend_devices_and_enter+0x22f/0x240
> [   74.576953]  [<ffffffff810a22b9>] enter_state+0x129/0x150
> [   74.576953]  [<ffffffff810a17da>] state_store+0xba/0x100
> [   74.576953]  [<ffffffff81248a17>] kobj_attr_store+0x17/0x20
> [   74.576953]  [<ffffffff81176155>] sysfs_write_file+0xc5/0x140
> [   74.576953]  [<ffffffff8111402b>] vfs_write+0xcb/0x1a0
> [   74.576953]  [<ffffffff811141f0>] sys_write+0x50/0x90
> [   74.576953]  [<ffffffff8102e242>] system_call_fastpath+0x16/0x1b
> [   74.576953] ---[ end trace 58ad491f88b81455 ]---
> 
> 
> [   75.906997] Call Trace:
> [   75.907001]  <IRQ>  [<ffffffff810b63c6>] __report_bad_irq+0x26/0xa0
> [   75.907022]  [<ffffffff810b65c8>] note_interrupt+0x188/0x1d0
> [   75.907029]  [<ffffffff810b6c85>] handle_fasteoi_irq+0xd5/0x100
> [   75.907038]  [<ffffffff81030b7f>] handle_irq+0x1f/0x30
> [   75.907043]  [<ffffffff8103029e>] do_IRQ+0x6e/0xf0
> [   75.907052]  [<ffffffff8102ec53>] ret_from_intr+0x0/0x11
> [   75.907055]  <EOI>  [<ffffffff81096130>] ? tick_nohz_stop_sched_tick+0x70/0x3a0
> [   75.907072]  [<ffffffff8102ca0a>] ? cpu_idle+0x2a/0xd0
> [   75.907080]  [<ffffffff814ba485>] ? rest_init+0x75/0x80
> [   75.907090]  [<ffffffff817aed55>] ? start_kernel+0x37a/0x42a
> [   75.907098]  [<ffffffff817ae299>] ? x86_64_start_reservations+0x99/0xb9
> [   75.907106]  [<ffffffff817ae3be>] ? x86_64_start_kernel+0x105/0x120
> [   75.907113]  [<ffffffff817ae140>] ? early_idt_handler+0x0/0x71
> [   75.907117] handlers:
> [   75.907121] [<ffffffff8137fe10>] (usb_hcd_irq+0x0/0x90)
> [   75.907130] Disabling IRQ #18

Does this one happen with mainline as well ?
 
Just to make the list more complete. If tracing is enabled across
suspend/resume you'll hit that one as well:

WARNING: at /home/tglx/work/kernel/git/linux-2.6/kernel/trace/ring_buffer.c:1392 rb_reserve_next_event+0x133/0x27c()

Negative timestamp delta which is probably due to sched_clock not yet
adjusted to the TSC which got reset.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: s2r badness
  2009-07-18 20:33 ` Thomas Gleixner
@ 2009-07-20 18:20   ` K.Prasad
  2009-07-20 18:50     ` Peter Zijlstra
  2009-07-20 21:00   ` Steven Rostedt
  1 sibling, 1 reply; 8+ messages in thread
From: K.Prasad @ 2009-07-20 18:20 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	lkml, Steven Rostedt

On Sat, Jul 18, 2009 at 10:33:08PM +0200, Thomas Gleixner wrote:
> On Sat, 18 Jul 2009, Peter Zijlstra wrote:
> 
> > Hi,
> > 
> > s2r seems to resume again on my latest kernel build (used to be broken
> > for a long-ish while).
> > 
> > However it does generate the blow splats.
> > 
> > [   74.576953] ------------[ cut here ]------------
> > [   74.576953] WARNING: at /usr/src/linux-2.6/kernel/softirq.c:143 local_bh_enable_ip+0x8b/0xc0()
> > [   74.576953] Hardware name: 2242CTO
> > [   74.576953] Modules linked in: cryptd aes_x86_64 aes_generic ecryptfs
> > nfs lockd nfs_acl auth_rpcgss sunrpc binfmt_misc fbcon tileblit font
> > bitblit softcursor bridge stp llc bnep autofs4 acpi_cpufreq joydev btusb
> > coretemp sbp2 snd_hda_codec_conexant pcmcia ecb psmouse serio_raw
> > iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec snd_pcm_oss
> > pcspkr snd_mixer_oss iwlagn iwlcore snd_pcm mac80211 snd_seq_dummy
> > ricoh_mmc sdhci_pci sdhci snd_seq_oss snd_seq_midi snd_rawmidi
> > snd_seq_midi_event yenta_socket rsrc_nonstatic pcmcia_core cfg80211
> > snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc i915
> > intel_agp thinkpad_acpi led_class video nvram ohci1394 ieee1394 ehci_hcd
> > uhci_hcd fuse
> > [   74.576953] Pid: 5406, comm: bash Not tainted 2.6.31-rc3-tip #32
> > [   74.576953] Call Trace:
> > [   74.576953]  [<ffffffff8107401b>] ? local_bh_enable_ip+0x8b/0xc0
> > [   74.576953]  [<ffffffff8106c838>] warn_slowpath_common+0x78/0xd0
> > [   74.576953]  [<ffffffff8106c89f>] warn_slowpath_null+0xf/0x20
> > [   74.576953]  [<ffffffff8107401b>] local_bh_enable_ip+0x8b/0xc0
> > [   74.576953]  [<ffffffff814d08d9>] _spin_unlock_bh+0x19/0x20
> > [   74.576953]  [<ffffffff810d2c77>] load_debug_registers+0x67/0x80
> 
> That's the hw breakpoint support stuff which does the spin_unlock_bh
> at a point where interrupts are disabled. So that's a -tip tree issue,
> not mainline.
>

load_debug_registers() uses spin_(un)lock_bh() and not a spin-lock() to
protect itself from flush_thread_hw_breakpoint() [which is invoked in
softIRQ context].

I'd like to recreate this traceback to investigate and fix this. What
steps, when done cause this message to appear?

Thanks,
K.Prasad


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: s2r badness
  2009-07-20 18:20   ` K.Prasad
@ 2009-07-20 18:50     ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2009-07-20 18:50 UTC (permalink / raw)
  To: prasad
  Cc: Thomas Gleixner, Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	lkml, Steven Rostedt

On Mon, 2009-07-20 at 23:50 +0530, K.Prasad wrote:

> > > [   74.576953] ------------[ cut here ]------------
> > > [   74.576953] WARNING: at /usr/src/linux-2.6/kernel/softirq.c:143 local_bh_enable_ip+0x8b/0xc0()
> > > [   74.576953] Hardware name: 2242CTO

> > > [   74.576953] Pid: 5406, comm: bash Not tainted 2.6.31-rc3-tip #32
> > > [   74.576953] Call Trace:
> > > [   74.576953]  [<ffffffff8107401b>] ? local_bh_enable_ip+0x8b/0xc0
> > > [   74.576953]  [<ffffffff8106c838>] warn_slowpath_common+0x78/0xd0
> > > [   74.576953]  [<ffffffff8106c89f>] warn_slowpath_null+0xf/0x20
> > > [   74.576953]  [<ffffffff8107401b>] local_bh_enable_ip+0x8b/0xc0
> > > [   74.576953]  [<ffffffff814d08d9>] _spin_unlock_bh+0x19/0x20
> > > [   74.576953]  [<ffffffff810d2c77>] load_debug_registers+0x67/0x80
> > 
> > That's the hw breakpoint support stuff which does the spin_unlock_bh
> > at a point where interrupts are disabled. So that's a -tip tree issue,
> > not mainline.
> >
> 
> load_debug_registers() uses spin_(un)lock_bh() and not a spin-lock() to
> protect itself from flush_thread_hw_breakpoint() [which is invoked in
> softIRQ context].
> 
> I'd like to recreate this traceback to investigate and fix this. What
> steps, when done cause this message to appear?

Close the lid and open it again?


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: s2r badness
  2009-07-18 20:33 ` Thomas Gleixner
  2009-07-20 18:20   ` K.Prasad
@ 2009-07-20 21:00   ` Steven Rostedt
  2009-07-20 23:02     ` Thomas Gleixner
  1 sibling, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2009-07-20 21:00 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	lkml



>  
> Just to make the list more complete. If tracing is enabled across
> suspend/resume you'll hit that one as well:
> 
> WARNING: at /home/tglx/work/kernel/git/linux-2.6/kernel/trace/ring_buffer.c:1392 rb_reserve_next_event+0x133/0x27c()
> 
> Negative timestamp delta which is probably due to sched_clock not yet
> adjusted to the TSC which got reset.

Perhaps we need to prevent tracing during a "blackout period" of suspend 
to ram?

-- Steve


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: s2r badness
  2009-07-20 21:00   ` Steven Rostedt
@ 2009-07-20 23:02     ` Thomas Gleixner
  2009-07-20 23:51       ` Steven Rostedt
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2009-07-20 23:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	lkml

On Mon, 20 Jul 2009, Steven Rostedt wrote:
> >  
> > Just to make the list more complete. If tracing is enabled across
> > suspend/resume you'll hit that one as well:
> > 
> > WARNING: at /home/tglx/work/kernel/git/linux-2.6/kernel/trace/ring_buffer.c:1392 rb_reserve_next_event+0x133/0x27c()
> > 
> > Negative timestamp delta which is probably due to sched_clock not yet
> > adjusted to the TSC which got reset.
> 
> Perhaps we need to prevent tracing during a "blackout period" of suspend 
> to ram?

Perhaps we need to figure out why this is happening and how we best
deal with it. Disabling functionality just because we can not deal
with it right now is not a solution.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: s2r badness
  2009-07-20 23:02     ` Thomas Gleixner
@ 2009-07-20 23:51       ` Steven Rostedt
  2009-07-21  6:38         ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2009-07-20 23:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	lkml


On Tue, 21 Jul 2009, Thomas Gleixner wrote:

> On Mon, 20 Jul 2009, Steven Rostedt wrote:
> > >  
> > > Just to make the list more complete. If tracing is enabled across
> > > suspend/resume you'll hit that one as well:
> > > 
> > > WARNING: at /home/tglx/work/kernel/git/linux-2.6/kernel/trace/ring_buffer.c:1392 rb_reserve_next_event+0x133/0x27c()
> > > 
> > > Negative timestamp delta which is probably due to sched_clock not yet
> > > adjusted to the TSC which got reset.
> > 
> > Perhaps we need to prevent tracing during a "blackout period" of suspend 
> > to ram?
> 
> Perhaps we need to figure out why this is happening and how we best
> deal with it. Disabling functionality just because we can not deal
> with it right now is not a solution.

Heh, suspend to ram is a black magic art. There's voodoo there that causes 
ulcers when you look the wrong way. For example, there's times that simply 
calling smp_processor_id() will reboot the box. Hence, the tracer is very 
intrusive, and we need to prevent it from doing things at certain areas. 
I'm not saying disabling functionality per say, I'm just saying that we 
need to make sure the tracer is not doing something in the guts of 
bringing the CPU back on line when it is not ready.

Are you saying that we need to move the initialization of sched_clock up, 
just to satisfy tracing? If that happens to be the issues here?

-- Steve


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: s2r badness
  2009-07-20 23:51       ` Steven Rostedt
@ 2009-07-21  6:38         ` Thomas Gleixner
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2009-07-21  6:38 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Rafael J. Wysocki, Len Brown, Greg Kroah-Hartman,
	lkml

On Mon, 20 Jul 2009, Steven Rostedt wrote:
> On Tue, 21 Jul 2009, Thomas Gleixner wrote:
> 
> > On Mon, 20 Jul 2009, Steven Rostedt wrote:
> > > >  
> > > > Just to make the list more complete. If tracing is enabled across
> > > > suspend/resume you'll hit that one as well:
> > > > 
> > > > WARNING: at /home/tglx/work/kernel/git/linux-2.6/kernel/trace/ring_buffer.c:1392 rb_reserve_next_event+0x133/0x27c()
> > > > 
> > > > Negative timestamp delta which is probably due to sched_clock not yet
> > > > adjusted to the TSC which got reset.
> > > 
> > > Perhaps we need to prevent tracing during a "blackout period" of suspend 
> > > to ram?
> > 
> > Perhaps we need to figure out why this is happening and how we best
> > deal with it. Disabling functionality just because we can not deal
> > with it right now is not a solution.
> 
> Heh, suspend to ram is a black magic art. There's voodoo there that causes 
> ulcers when you look the wrong way. For example, there's times that simply 
> calling smp_processor_id() will reboot the box. Hence, the tracer is very 
> intrusive, and we need to prevent it from doing things at certain areas. 
> I'm not saying disabling functionality per say, I'm just saying that we 
> need to make sure the tracer is not doing something in the guts of 
> bringing the CPU back on line when it is not ready.
> 
> Are you saying that we need to move the initialization of sched_clock up, 
> just to satisfy tracing? If that happens to be the issues here?

No, we need to figure out why the WARN_ON happens and what we can/must
do either to fixup the trace clock early enough or to prevent tracing
until the clock is usable again.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-07-21  6:39 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-18 20:03 s2r badness Peter Zijlstra
2009-07-18 20:33 ` Thomas Gleixner
2009-07-20 18:20   ` K.Prasad
2009-07-20 18:50     ` Peter Zijlstra
2009-07-20 21:00   ` Steven Rostedt
2009-07-20 23:02     ` Thomas Gleixner
2009-07-20 23:51       ` Steven Rostedt
2009-07-21  6:38         ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox