From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S934118AbXCVRIN (ORCPT ); Thu, 22 Mar 2007 13:08:13 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
    id S934128AbXCVRIN (ORCPT ); Thu, 22 Mar 2007 13:08:13 -0400
Received: from hellhawk.shadowen.org ([80.68.90.175]:2814 "EHLO
    hellhawk.shadowen.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S934118AbXCVRIM (ORCPT ); Thu, 22 Mar 2007 13:08:12 -0400
Message-ID: <4602B7D3.4030108@shadowen.org>
Date: Thu, 22 Mar 2007 17:07:31 +0000
From: Andy Whitcroft
User-Agent: Icedove 1.5.0.9 (X11/20061220)
MIME-Version: 1.0
To: Con Kolivas
CC: Andrew Morton, linux-kernel@vger.kernel.org, Steve Fox, "Martin J. Bligh"
Subject: Re: 2.6.21-rc4-mm1
References: <20070319205623.299d0378.akpm@linux-foundation.org> <4602413C.6000504@shadowen.org> <46025100.7060103@shadowen.org> <200703222104.06507.kernel@kolivas.org>
In-Reply-To: <200703222104.06507.kernel@kolivas.org>
X-Enigmail-Version: 0.94.2.0
OpenPGP: url=http://www.shadowen.org/~apw/public-key
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Con Kolivas wrote:
> On Thursday 22 March 2007 20:48, Andy Whitcroft wrote:
>> Andy Whitcroft wrote:
>>> Andy Whitcroft wrote:
>>>> Andrew Morton wrote:
>>>>> Temporarily at
>>>>>
>>>>> http://userweb.kernel.org/~akpm/2.6.21-rc4-mm1/
>>>>>
>>>>> Will appear later at
>>>>>
>>>>>
>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc4/2.6.21-rc4-mm1/
>>>> [All of the below is from the pre hot-fix runs. The very few results
>>>> which are in for the hot-fix runs seem worse if anything. :( All
>>>> results should be out on TKO.]
>>>>
>>>>> - Restored the RSDL CPU scheduler (a new version thereof)
>>>> Unsure if the above is the culprit but there seems to be a smattering of
>>>> BUG's in kernbench from the schedular on several systems, and panics
>>>> which do not fully dump out.
>>>>
>>>> elm3b239 is about 2/4 kernbench being the test in progress when we
>>>> blammo in both failed tests, elm3b234 doesn't boot at all.
>>> Well I have one result through for backing RSDL out on elm3b239 and that
>>> does indeed seem to give us a successful boot and test. peterz has
>>> pointed me to an incremental patch from Con which I'll push through
>>> testing and see if that sorts it out.
>> Ok, tested the patch below on top of 2.6.21-rc4-mm1 and this seems to
>> fix the problem:
>>
>> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc4-mm1-rsdl-0.32.patch
>>
>> Hard to tell from that patch whether it will be fixed in the changes
>> already committed to the next -mm.
>>
>> Its possible that it may be fixed by the following patch:
>>
>> sched-rsdl-improvements.patch
>>
>> Which has the following slipped in at the end of the changelog:
>>
>> A tiny change checking for MAX_PRIO in normal_prio()
>> may prevent oopses on bootup on large SMP due to
>> forking off the idle task.
>>
>> Con, are all the changes in the 0.32 patch above with akpm?
>
> Yes he's queued everything in that patch you tested for the next -mm. Thanks
> very much for testing it.

No worries.

I've just got through the results on the other machine in the mix. That
machine seems to be fixed by backing out RSDL and not by the fixup 0.32
patch ... This second machine seems to hang hard very soon after user
space starts executing, but without a panic. I can't say that the
symptoms are very definitive, but I do have a good result from that
machine without RSDL and not with rsdl-0.32.

The machine is a dual-core x86_64 machine: Dual Core AMD Opteron(tm)
Processor 275.

I'll let you know if I find out anything else.
Shout if you want any information or have anything you want poked or
tested.

-apw