From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Zijlstra
Subject: Re: [testcase] perf: yet another fuzzer triggered crash
Date: Mon, 1 Jul 2013 11:07:13 +0200
Message-ID: <20130701090713.GO6626@twins.programming.kicks-ass.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
To: Vince Weaver
Cc: linux-kernel@vger.kernel.org, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, trinity@vger.kernel.org

On Fri, Jun 28, 2013 at 05:07:38PM -0400, Vince Weaver wrote:
> On Fri, 28 Jun 2013, Vince Weaver wrote:
>
> > On Fri, 14 Jun 2013, Vince Weaver wrote:
> >
> > > OK, I haven't managed to get a small reproducible test case for the system
> > > crash yet
> >
> > I wasted the last 2 days bisecting a 10000 syscall trace, but below is a
> > 20-syscall testcase that rapidly makes a core2 machine running 3.10-rc7
> > unusable.
>
> and it turns out I might have bisected down too much, as though that
> crashes my core2 system it doesn't crash newer machines.
>
> I'm too lazy to re-bisect today, but the much longer program here:
>    http://web.eece.maine.edu/~vweaver/files/nmi_bug_snb.c
> reliably causes the same crash on a Sandybridge machine I have running 3.9

OK, so on my westmere it triggers that WARN in task_ctx_sched_out() a _lot_
(I removed the ONCE for easier debugging earlier -- still kinda stumped
there).

Then this thing causes an RCU stall and starts triggering NMI watchdog
msgs.. so YAY! :-)

I'll see what I can find.