From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"Rafael J. Wysocki" <rjw@sisk.pl>,
"lenb@kernel.org" <lenb@kernel.org>,
linux-acpi@vger.kernel.org
Subject: Re: [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934
Date: Wed, 12 Nov 2008 20:21:28 +0100 [thread overview]
Message-ID: <20081112192128.GA17667@elte.hu> (raw)
In-Reply-To: <alpine.LFD.2.00.0811121046360.3468@nehalem.linux-foundation.org>
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, 12 Nov 2008, Maciej Rutecki wrote:
>
> > -rc3 works OK
> >
> > I have this bug during suspend to disk:
> >
> > [ 188.592151] Enabling non-boot CPUs ...
> > [ 188.592151] SMP alternatives: switching to SMP code
> > [ 188.666058] BUG: using smp_processor_id() in preemptible [00000000]
> > code: suspend_to_disk/2934
> > [ 188.666064] caller is native_sched_clock+0x2b/0x80
>
> The cause seems to be commit
> 7cbaef9c83e58bbd4bdd534b09052b6c5ec457d5, "sched: optimize
> sched_clock() a bit" by Ingo.
>
> Which actually comments on the fact that a few callers may need to
> be updated. That wasn't good. Ingo - it's not acceptable for a
> latish-rc patch to introduce _known_ bugs and not fixing everything
> up.
>
> Of course, since Maciej has frame pointers disabled, the stack trace
> isn't entirely reliable, but it looks like the problem is
> "init_idle()".
>
> That thing needs to call sched_clock() with interrupts disabled.
> Looking at it, I'd also expect that it should have used
> "sched_clock_cpu()", but I'm leaving that to Ingo to sort out. Ingo?
i've queued up the fix below in tip/sched/urgent - Maciej, could you
please test it? I've build and boot tested it.
I picked the simplest solution - to widen the (init only) use of the
rq lock there, which is irqsafe already.
Another question is: how/where does the hibernation code call
init_idle() with interrupts and preemption enabled? All the normal
init_idle() sequences are called with irqs off (and at least with
preemption off), that's why my testing didnt catch this bug.
[ I was under the distinct illusion that i had all callers covered
both via review and via testing (there's no that many users of naked
sched_clock() left). So there was certainly no bug known to me - my
comment was directed at out-of-tree naked-sched_clock() users which
do exist. The recommended in-kernel API is cpu_clock(cpu) - except
for scheduler-internal use which was the issue here. ]
But it was still a late change and the boot warning was unnecessary,
sorry about that.
Ingo
------------->
>From 5cbd54ef470d880fc37fbe4b21eb514806d51e0d Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Wed, 12 Nov 2008 20:05:50 +0100
Subject: [PATCH] sched: fix init_idle()'s use of sched_clock()
Maciej Rutecki reported:
> I have this bug during suspend to disk:
>
> [ 188.592151] Enabling non-boot CPUs ...
> [ 188.592151] SMP alternatives: switching to SMP code
> [ 188.666058] BUG: using smp_processor_id() in preemptible
> [00000000]
> code: suspend_to_disk/2934
> [ 188.666064] caller is native_sched_clock+0x2b/0x80
Which, as noted by Linus, was caused by me, via:
7cbaef9c "sched: optimize sched_clock() a bit"
Move the rq locking a bit earlier in the initialization sequence,
that will make the sched_clock() call in init_idle() non-preemptible.
Reported-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/sched.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 3bafbe3..c94baf2 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5870,6 +5870,8 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu)
struct rq *rq = cpu_rq(cpu);
unsigned long flags;
+ spin_lock_irqsave(&rq->lock, flags);
+
__sched_fork(idle);
idle->se.exec_start = sched_clock();
@@ -5877,7 +5879,6 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu)
idle->cpus_allowed = cpumask_of_cpu(cpu);
__set_task_cpu(idle, cpu);
- spin_lock_irqsave(&rq->lock, flags);
rq->curr = rq->idle = idle;
#if defined(CONFIG_SMP) && defined(__ARCH_WANT_UNLOCKED_CTXSW)
idle->oncpu = 1;
next prev parent reply other threads:[~2008-11-12 19:21 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-11-12 18:30 [Linux 2.6.28-rc4] BUG: using smp_processor_id() in preemptible [00000000] code: suspend_to_disk/2934 Maciej Rutecki
2008-11-12 18:56 ` Linus Torvalds
2008-11-12 19:21 ` Ingo Molnar [this message]
2008-11-12 20:34 ` Maciej Rutecki
2008-11-12 20:40 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081112192128.GA17667@elte.hu \
--to=mingo@elte.hu \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maciej.rutecki@gmail.com \
--cc=rjw@sisk.pl \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox