* [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 @ 2008-11-12 18:30 Maciej Rutecki 2008-11-12 18:56 ` Linus Torvalds 0 siblings, 1 reply; 5+ messages in thread From: Maciej Rutecki @ 2008-11-12 18:30 UTC (permalink / raw) To: Linus Torvalds Cc: Linux Kernel Mailing List, Rafael J. Wysocki, lenb@kernel.org, linux-acpi -rc3 works OK I have this bug during suspend to disk: [ 188.592151] Enabling non-boot CPUs ... [ 188.592151] SMP alternatives: switching to SMP code [ 188.666058] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 [ 188.666064] caller is native_sched_clock+0x2b/0x80 [ 188.666067] Pid: 2934, comm: suspend_to_disk Not tainted 2.6.28-rc4 #1 [ 188.666069] Call Trace: [ 188.666074] [<c02556ff>] debug_smp_processor_id+0xbf/0xd0 [ 188.666077] [<c010a2db>] native_sched_clock+0x2b/0x80 [ 188.666080] [<c03ae3a0>] init_idle+0x88/0x11a [ 188.666083] [<c03ad206>] native_cpu_up+0x4a6/0x75a [ 188.666086] [<c01de808>] sysfs_add_file_mode+0x58/0x90 [ 188.666089] [<c03ad4ba>] do_fork_idle+0x0/0x19 [ 188.666092] [<c014864e>] notifier_call_chain+0x3e/0x80 [ 188.666095] [<c03ae99a>] _cpu_up+0x6a/0x103 [ 188.666097] [<c03a1e75>] enable_nonboot_cpus+0x65/0xa0 [ 188.666101] [<c02cd240>] device_pm_unlock+0x0/0x10 [ 188.666105] [<c0159ba7>] hibernation_snapshot+0x137/0x200 [ 188.666107] [<c0158dff>] freeze_processes+0x4f/0xa0 [ 188.666110] [<c0159d4f>] hibernate+0xdf/0x190 [ 188.666112] [<c0158910>] state_store+0x0/0xd0 [ 188.666115] [<c01589c2>] state_store+0xb2/0xd0 [ 188.666117] [<c0158910>] state_store+0x0/0xd0 [ 188.666120] [<c024b1d4>] kobj_attr_store+0x24/0x30 [ 188.666122] [<c01de221>] sysfs_write_file+0xa1/0x110 [ 188.666124] [<c01de180>] sysfs_write_file+0x0/0x110 [ 188.666127] [<c0198e38>] vfs_write+0xa8/0x140 [ 188.666129] [<c0198fa1>] sys_write+0x41/0x80 [ 188.666131] [<c0103f49>] sysenter_do_call+0x12/0x21 [ 188.666133] [<c03b0000>] acpi_processor_start+0x690/0x69a [ 188.666136] Booting processor 1 APIC 0x1 ip 0x6000 [ 188.590718] Initializing CPU#1 dmesg and config: http://unixy.pl/maciek/download/kernel/2.6.28-rc4_smp/ -- Maciej Rutecki http://www.maciek.unixy.pl ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 2008-11-12 18:30 [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 Maciej Rutecki @ 2008-11-12 18:56 ` Linus Torvalds 2008-11-12 19:21 ` Ingo Molnar 0 siblings, 1 reply; 5+ messages in thread From: Linus Torvalds @ 2008-11-12 18:56 UTC (permalink / raw) To: Maciej Rutecki, Ingo Molnar Cc: Linux Kernel Mailing List, Rafael J. Wysocki, lenb@kernel.org, linux-acpi On Wed, 12 Nov 2008, Maciej Rutecki wrote: > -rc3 works OK > > I have this bug during suspend to disk: > > [ 188.592151] Enabling non-boot CPUs ... > [ 188.592151] SMP alternatives: switching to SMP code > [ 188.666058] BUG: using smp_processor_id() in preemptible [00000000] > code: suspend_to_disk/2934 > [ 188.666064] caller is native_sched_clock+0x2b/0x80 The cause seems to be commit 7cbaef9c83e58bbd4bdd534b09052b6c5ec457d5, "sched: optimize sched_clock() a bit" by Ingo. Which actually comments on the fact that a few callers may need to be updated. That wasn't good. Ingo - it's not acceptable for a latish-rc patch to introduce _known_ bugs and not fixing everything up. Of course, since Maciej has frame pointers disabled, the stack trace isn't entirely reliable, but it looks like the problem is "init_idle()". That thing needs to call sched_clock() with interrupts disabled. Looking at it, I'd also expect that it should have used "sched_clock_cpu()", but I'm leaving that to Ingo to sort out. Ingo? Linus ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 2008-11-12 18:56 ` Linus Torvalds @ 2008-11-12 19:21 ` Ingo Molnar 2008-11-12 20:34 ` Maciej Rutecki 0 siblings, 1 reply; 5+ messages in thread From: Ingo Molnar @ 2008-11-12 19:21 UTC (permalink / raw) To: Linus Torvalds Cc: Maciej Rutecki, Linux Kernel Mailing List, Rafael J. Wysocki, lenb@kernel.org, linux-acpi * Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, 12 Nov 2008, Maciej Rutecki wrote: > > > -rc3 works OK > > > > I have this bug during suspend to disk: > > > > [ 188.592151] Enabling non-boot CPUs ... > > [ 188.592151] SMP alternatives: switching to SMP code > > [ 188.666058] BUG: using smp_processor_id() in preemptible [00000000] > > code: suspend_to_disk/2934 > > [ 188.666064] caller is native_sched_clock+0x2b/0x80 > > The cause seems to be commit > 7cbaef9c83e58bbd4bdd534b09052b6c5ec457d5, "sched: optimize > sched_clock() a bit" by Ingo. > > Which actually comments on the fact that a few callers may need to > be updated. That wasn't good. Ingo - it's not acceptable for a > latish-rc patch to introduce _known_ bugs and not fixing everything > up. > > Of course, since Maciej has frame pointers disabled, the stack trace > isn't entirely reliable, but it looks like the problem is > "init_idle()". > > That thing needs to call sched_clock() with interrupts disabled. > Looking at it, I'd also expect that it should have used > "sched_clock_cpu()", but I'm leaving that to Ingo to sort out. Ingo? i've queued up the fix below in tip/sched/urgent - Maciej, could you please test it? I've build and boot tested it. I picked the simplest solution - to widen the (init only) use of the rq lock there, which is irqsafe already. Another question is: how/where does the hibernation code call init_idle() with interrupts and preemption enabled? All the normal init_idle() sequences are called with irqs off (and at least with preemption off), that's why my testing didnt catch this bug. [ I was under the distinct illusion that i had all callers covered both via review and via testing (there's no that many users of naked sched_clock() left). So there was certainly no bug known to me - my comment was directed at out-of-tree naked-sched_clock() users which do exist. The recommended in-kernel API is cpu_clock(cpu) - except for scheduler-internal use which was the issue here. ] But it was still a late change and the boot warning was unnecessary, sorry about that. Ingo -------------> >From 5cbd54ef470d880fc37fbe4b21eb514806d51e0d Mon Sep 17 00:00:00 2001 From: Ingo Molnar <mingo@elte.hu> Date: Wed, 12 Nov 2008 20:05:50 +0100 Subject: [PATCH] sched: fix init_idle()'s use of sched_clock() Maciej Rutecki reported: > I have this bug during suspend to disk: > > [ 188.592151] Enabling non-boot CPUs ... > [ 188.592151] SMP alternatives: switching to SMP code > [ 188.666058] BUG: using smp_processor_id() in preemptible > [00000000] > code: suspend_to_disk/2934 > [ 188.666064] caller is native_sched_clock+0x2b/0x80 Which, as noted by Linus, was caused by me, via: 7cbaef9c "sched: optimize sched_clock() a bit" Move the rq locking a bit earlier in the initialization sequence, that will make the sched_clock() call in init_idle() non-preemptible. Reported-by: Maciej Rutecki <maciej.rutecki@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/sched.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/kernel/sched.c b/kernel/sched.c index 3bafbe3..c94baf2 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -5870,6 +5870,8 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu) struct rq *rq = cpu_rq(cpu); unsigned long flags; + spin_lock_irqsave(&rq->lock, flags); + __sched_fork(idle); idle->se.exec_start = sched_clock(); @@ -5877,7 +5879,6 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu) idle->cpus_allowed = cpumask_of_cpu(cpu); __set_task_cpu(idle, cpu); - spin_lock_irqsave(&rq->lock, flags); rq->curr = rq->idle = idle; #if defined(CONFIG_SMP) && defined(__ARCH_WANT_UNLOCKED_CTXSW) idle->oncpu = 1; ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 2008-11-12 19:21 ` Ingo Molnar @ 2008-11-12 20:34 ` Maciej Rutecki 2008-11-12 20:40 ` Ingo Molnar 0 siblings, 1 reply; 5+ messages in thread From: Maciej Rutecki @ 2008-11-12 20:34 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Linux Kernel Mailing List, Rafael J. Wysocki, lenb@kernel.org, linux-acpi 2008/11/12 Ingo Molnar <mingo@elte.hu>: > > i've queued up the fix below in tip/sched/urgent - Maciej, could you > please test it? I've build and boot tested it. > Suspend to disk works OK now. Thanks. Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com> > > Reported-by: Maciej Rutecki <maciej.rutecki@gmail.com> > Signed-off-by: Ingo Molnar <mingo@elte.hu> > --- > kernel/sched.c | 3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/kernel/sched.c b/kernel/sched.c > index 3bafbe3..c94baf2 100644 > --- a/kernel/sched.c > +++ b/kernel/sched.c > @@ -5870,6 +5870,8 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu) > struct rq *rq = cpu_rq(cpu); > unsigned long flags; > > + spin_lock_irqsave(&rq->lock, flags); > + > __sched_fork(idle); > idle->se.exec_start = sched_clock(); > > @@ -5877,7 +5879,6 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu) > idle->cpus_allowed = cpumask_of_cpu(cpu); > __set_task_cpu(idle, cpu); > > - spin_lock_irqsave(&rq->lock, flags); > rq->curr = rq->idle = idle; > #if defined(CONFIG_SMP) && defined(__ARCH_WANT_UNLOCKED_CTXSW) > idle->oncpu = 1; > > Regards -- Maciej Rutecki http://www.maciek.unixy.pl ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 2008-11-12 20:34 ` Maciej Rutecki @ 2008-11-12 20:40 ` Ingo Molnar 0 siblings, 0 replies; 5+ messages in thread From: Ingo Molnar @ 2008-11-12 20:40 UTC (permalink / raw) To: Maciej Rutecki Cc: Linus Torvalds, Linux Kernel Mailing List, Rafael J. Wysocki, lenb@kernel.org, linux-acpi * Maciej Rutecki <maciej.rutecki@gmail.com> wrote: > 2008/11/12 Ingo Molnar <mingo@elte.hu>: > > > > > i've queued up the fix below in tip/sched/urgent - Maciej, could you > > please test it? I've build and boot tested it. > > > > Suspend to disk works OK now. Thanks. > > Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com> thanks Maciej! Ingo ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2008-11-12 20:40 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-11-12 18:30 [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 Maciej Rutecki 2008-11-12 18:56 ` Linus Torvalds 2008-11-12 19:21 ` Ingo Molnar 2008-11-12 20:34 ` Maciej Rutecki 2008-11-12 20:40 ` Ingo Molnar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox