public inbox for linux-kernel@vger.kernel.org
From: Dave Haywood <tla@oak.selfip.net>
To: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
	David Miller <davem@davemloft.net>,
	mpm@selenic.com, rjw@sisk.pl, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, torvalds@linux-foundation.org,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [bug] SLOB crash, 2.6.24-rc2
Date: Thu, 15 Nov 2007 12:18:08 +0000
Message-ID: <473C3900.8010508@oak.selfip.net>
In-Reply-To: <20071115112820.GA18228@elte.hu>

Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
>   
>> On Thursday 15 November 2007 21:43, Ingo Molnar wrote:
>>     
>>> * David Miller <davem@davemloft.net> wrote:
>>>       
>>>> From: Matt Mackall <mpm@selenic.com>
>>>> Date: Wed, 14 Nov 2007 17:37:13 -0600
>>>>
>>>>         
>>>>> No, the usual strategy for debugging problems -outside- SLOB is to
>>>>> switch to another allocator with more extensive debugging facilities.
>>>>>           
>>>> Ok, so the thing we still can do is do a dump_stack() at the list
>>>> debugging assertion trigger points.
>>>>         
>>> ok, i'll first try to trigger it again.
>>>       
>> I had implemented SLOB in userspace, so I resynched and think I found 
>> your problem. Sorry for the attachment format -- this mailer isn't the 
>> best. I'm really computer illiterate when it comes to userspace...
>>     
>
> thx, i'll try your fix in a minute.
>
>   
>> Anyway, I'm really happy to see you're testing and using SLOB upstream
>> :) Is there any particular reason that you're using it?
>>     
>
> i sometimes test SLOB for -rt, but this time it's the result of my 
> "automated random QA" effort, as part of arch/x86 maintenance/QA.
>
> the main trick is to build and boot random "make randconfig" 
> bzImages. That finds build bugs and a good deal of boot hang and crash 
> bugs as well. (it also found a compiler bug already) I can build and 
> boot about 1000 random kernels in 24 hours, and it's all fully 
> automated. I usually run it overnight - when a kernel does not come up 
> due to a bootup hang or crash (or the kernel log signals any exception 
> condition) then the script stops and i can fix it in the morning.
>
> The first step towards this was to get allyesconfig bzImage kernels to 
> build and boot fine. That effort took months (we had many problems in 
> this area) - i think you saw bugreports and fixes from me about that on 
> lkml.
>
> Once that worked reasonably well i made a small Kconfig patch that 
> forcibly selects a "minimum set" of drivers and kernel subsystems that 
> are needed to boot up a testsystem. Once a "make allnoconfig" and a 
> "make allyesconfig" bzImage kernel boots up fine on the testbox all 
> randconfig configs "inbetween" are supposed to build and boot fine as 
> well.
>
> I also have a patch that adds all the x86 boot options like nosmp, 
> maxcpus=1, nohz=off, hpet=disable to be selectable as .config options - 
> so those boot options are randomized as well.
>
> I also have a small patch that disables half a dozen drivers/features 
> that are not expected to work out of box in a bzImage kernel. (such as 
> ISA drivers that assume the presence of hardware, or root filesystem 
> features such as NFSROOT)
>
> the resulting make randconfig kernel still has 99% of the degrees of 
> freedom that a stock make randconfig kernel has, so for all practical 
> purposes it's a fully random kernel - it just happens to boot on my 
> testsystem all the time.
>
> A successful bootup means the test system is able to boot up into a 
> stock Fedora 8 userspace and is able to bring up its network interfaces 
> and ssh out (automatically) to the build box to signal the completion of 
> a successful test cycle. The logs are also analyzed for lockdep 
> assertions (if lockdep is enabled - which it is in about 20% of the 
> randconfig kernels) and other kernel bugs.
>
> (just in case you were wondering about one of the reasons why the 
> arch/x86 unification merge went so smoothly, with nary a regression ;-) 
> Thomas is doing other types of automated QA of the x86 queue as well.)
>
> this method found the SG-list corruption bugs the following night after 
> Linus committed Jens' SG-list changes, so it's pretty good at finding 
> regressions as early as possible.
>
> 	Ingo
>   
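
For anyone wanting to try something similar, the cycle described above
(build a random config, boot it, stop at the first failure) might be
sketched roughly as follows.  This is only an illustration and not Ingo's
actual script: the function names, the boot_and_fetch_log hook and the
list of log markers are all assumptions made up for the example.  Only
the "make randconfig" and "make bzImage" targets come from the thread.

```python
import subprocess
from typing import Callable, Optional

# Assumed markers of trouble in a kernel log; a real setup would tune this list.
PROBLEM_MARKERS = ("BUG:", "Oops", "WARNING:")


def log_has_problem(log_text: str) -> bool:
    """Return True if any assumed problem marker appears in the log."""
    return any(marker in log_text for marker in PROBLEM_MARKERS)


def build_random_kernel(src_dir: str) -> bool:
    """Run 'make randconfig' then 'make bzImage'; True if both succeed."""
    for target in ("randconfig", "bzImage"):
        if subprocess.run(["make", "-C", src_dir, target]).returncode != 0:
            return False
    return True


def qa_loop(build: Callable[[], bool],
            boot_and_fetch_log: Callable[[], Optional[str]],
            max_cycles: int) -> Optional[int]:
    """Return the index of the first failing cycle, or None if all pass.

    boot_and_fetch_log is a hypothetical hook: it boots the freshly built
    kernel on the test box and returns its log, or None if the box never
    reported back (hang or crash).
    """
    for cycle in range(max_cycles):
        if not build():
            return cycle          # build bug: stop so it can be looked at
        log = boot_and_fetch_log()
        if log is None or log_has_problem(log):
            return cycle          # boot hang, crash, or suspicious log
    return None
```

Passing the build and boot steps in as callables keeps the loop itself
independent of any particular test box, e.g.
qa_loop(lambda: build_random_kernel("/path/to/linux"), my_boot_hook, 1000),
where my_boot_hook is whatever drives the real hardware.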

  How complete is the QA testing?  I was reading this interesting thread
and it occurred to me that this sounds like a useful distributed
computing application, i.e. a central server holding all valid Kconfig
combinations (how many are there?) for a particular release (-rc or
otherwise) across all architectures.  These would be allocated to clients
on request to be built, booted, etc., and any errors fed back to the
central server.  I guess this would be a useful resource for
developers.  More importantly (and I don't know if this is already the
case!), a new Linux release (2.6.x) could be "certified" with some
level of testing on known hardware / architectures.
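
To make the idea a little more concrete, the server side could be as
simple as a work queue.  The sketch below is purely hypothetical: the
ConfigServer class, its method names and the config identifiers are
invented for illustration, and it ignores networking, timeouts and
clients that vanish mid-build.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConfigServer:
    """Hands out configs to test clients and collects failure reports."""
    pending: deque = field(default_factory=deque)     # configs not yet handed out
    in_progress: dict = field(default_factory=dict)   # config -> client id
    failures: list = field(default_factory=list)      # (config, log) pairs

    def request_work(self, client_id: str) -> Optional[str]:
        """Give the next untested config to a client, or None if none remain."""
        if not self.pending:
            return None
        config = self.pending.popleft()
        self.in_progress[config] = client_id
        return config

    def report(self, config: str, ok: bool, log: str = "") -> None:
        """Record a client's result; failures are kept for developers."""
        self.in_progress.pop(config, None)
        if not ok:
            self.failures.append((config, log))
```

A client would call request_work(), build and boot the config it was
given, then call report() with the outcome; the failures list is what
developers would look at.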

  tbh, I feel sorry for Ingo's machine compiling 1000 random kernels in
24h!  I'm surprised it hasn't called the Samaritans...

Dave.

