From: Ingo Molnar <mingo@elte.hu>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Gregory Haskins <ghaskins@novell.com>,
Steven Rostedt <srostedt@redhat.com>
Subject: Re: [PATCH 1/1] sched: prevent divide by zero error in cpu_avg_load_per_task
Date: Thu, 27 Nov 2008 10:29:41 +0100
Message-ID: <20081127092941.GA630@elte.hu>
In-Reply-To: <20081127020554.533035163@goodmis.org>
* Steven Rostedt <rostedt@goodmis.org> wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Impact: fix to divide by zero
>
> While testing the branch profiler, I hit this crash:
>
> divide error: 0000 [#1] PREEMPT SMP
> [...]
> Call Trace:
> <IRQ> <0> [<ffffffff8024fd43>] find_busiest_group+0x3e5/0xcaa
> [<ffffffff8025da75>] rebalance_domains+0x2da/0xa21
> The code for cpu_avg_load_per_task has:
>
> if (rq->nr_running)
> rq->avg_load_per_task = rq->load.weight / rq->nr_running;
>
> The runqueue lock is not held here, and there is nothing that
> prevents the rq->nr_running from going to zero after it passes the
> if condition.
>
> The branch profiler simply made the race window bigger.
>
> This patch saves off the rq->nr_running to a local variable and uses
> that for both the condition and the division.
good catch! Applied to tip/sched/urgent, thanks Steve!
the rebalancer scans remote runqueues without holding the runqueue
lock for performance reasons, so nr_running indeed has to be loaded
into a local variable here.
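A minimal user-space sketch of that pattern follows. It is not the kernel patch itself: struct rq_sketch, its fields, and avg_load_per_task_sketch() are hypothetical stand-ins for struct rq and cpu_avg_load_per_task(). The point is only that the condition and the division use the same locally held value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the kernel's struct rq. */
struct rq_sketch {
	unsigned long nr_running;        /* may be changed by another CPU */
	unsigned long load_weight;       /* stands in for rq->load.weight */
	unsigned long avg_load_per_task; /* cached result */
};

/*
 * Copy nr_running into a local once, then test and divide using that
 * copy, so the source code never divides by a value it did not check.
 */
static unsigned long avg_load_per_task_sketch(struct rq_sketch *rq)
{
	unsigned long nr_running;

	assert(rq != NULL);
	nr_running = rq->nr_running;

	if (nr_running)
		rq->avg_load_per_task = rq->load_weight / nr_running;

	/* With nr_running == 0 the previously cached value is returned. */
	return rq->avg_load_per_task;
}
```

With nr_running = 4 and load_weight = 4096 this returns 1024; if nr_running then drops to 0, the call returns the cached 1024 instead of dividing.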
I think it could hit anywhere upstream as well, even without the
branch tracer: depends on the register pressure and GCC's choices for
reloading that register. If, say, FRAME_POINTER is enabled and we are
running with some more aggressive optimization or just an older/suckier
GCC that does a spurious reload, then this might happen too.
it's not a classic race, it's an SMP bug that will either trigger or
not trigger at all, depending on compiler behavior.
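To make the compiler dependence concrete: a plain C read of a shared field may legally be turned into one load or two, and only the two-load version can see nonzero and then zero. One way to insist on a single load is a volatile-qualified access, sketched below in user-space C. This is an assumption for illustration, not something this message specifies: LOAD_ONCE_UL, struct rq_once_sketch and load_per_task_once() are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: a volatile-qualified read makes the compiler
 * emit exactly one load for this expression, so it cannot re-read the
 * field later in place of the value held in the local.
 */
#define LOAD_ONCE_UL(p) (*(const volatile unsigned long *)(p))

/* Hypothetical, cut-down stand-in for the kernel's struct rq. */
struct rq_once_sketch {
	unsigned long nr_running;  /* may be changed by another CPU */
	unsigned long load_weight; /* stands in for rq->load.weight */
};

/*
 * Divide the load by a snapshot of nr_running that is loaded once.
 * Returns 'fallback' when the snapshot is zero.
 */
static unsigned long load_per_task_once(const struct rq_once_sketch *rq,
					unsigned long fallback)
{
	unsigned long nr_running;

	assert(rq != NULL);
	nr_running = LOAD_ONCE_UL(&rq->nr_running);

	return nr_running ? rq->load_weight / nr_running : fallback;
}
```

The snapshot can still be stale by the time the division runs; it just can never be a different value from the one that was tested.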
Ingo