From: Thomas Gleixner <tglx@linutronix.de>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>,
LKML <linux-kernel@vger.kernel.org>,
linux-next@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
linux-scsi@vger.kernel.org, David Miller <davem@davemloft.net>,
Jens Axboe <jens.axboe@oracle.com>,
Mike Anderson <andmike@linux.vnet.ibm.com>
Subject: Re: next-20081119: general protection fault: get_next_timer_interrupt()
Date: Mon, 24 Nov 2008 20:31:08 +0100 (CET)
Message-ID: <alpine.LFD.2.00.0811242018370.3235@localhost.localdomain>
In-Reply-To: <1227554117.25499.46.camel@localhost.localdomain>
On Mon, 24 Nov 2008, James Bottomley wrote:
> On Mon, 2008-11-24 at 18:43 +0100, Thomas Gleixner wrote:
> > > scsi0 : LSI SAS based MegaRAID driver
> > > Driver 'sd' needs updating - please use bus_type methods
> > > scsi 0:0:0:0: Direct-Access ATA SAMSUNG HE160HJ 0-24 PQ: 0 ANSI: 5
> > > ------------[ cut here ]------------
> > > WARNING: at lib/debugobjects.c:215 debug_print_object+0x4f/0x57()
> > > ODEBUG: free active object type: timer_list
> >
> > That's the cause for your boot crash. The scsi/blk code is freeing a
> > page which contains an active timer, so the timer code references gone
> > memory. You triggered it because DEBUG_PAGEALLOC unmaps the page when
> > it's freed.
> >
> > James, or other scsi experts please.
>
> Well, not sure. Most likely candidate is the new block timer code.
> What seems to be happening is that the queue is being released with
> either an outstanding request (refcounting problem) or ticking timer
> with no work (block timer problem). The way scanning works is that we
> create a request queue for each device we probe and then delete it again
> if nothing appears after the bus settle time. The argument against
> this is that it should show up on every scanned bus. However, these are
> getting rarer; I was just about to write that I hadn't seen it when I
> remembered that all my SCSI testing systems are currently running
> hotplug reporting busses (i.e. don't do scanning). However,
> fortunately, I've also booted voyager recently which does use parallel
> SCSI and doesn't see this either, so it could also be megaraid_sas
> specific.
Yeah, it could be block as well. Jens, Mike?
One note about not seeing it: We have had such bugs before where the
page was freed but not touched and the timer survived w/o tripping the
system over. Alexander noticed because of DEBUG_PAGEALLOC and you can
also see it by enabling debugobjects, which will give you the nice
backtrace.
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_TIMERS=y
and add "debug_objects" to the kernel command line.
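To make the failure mode concrete, here is a rough userspace model of what the "free active object" check catches. It is only a sketch: it is not the code in lib/debugobjects.c, and every model_* name below is made up for illustration. The idea is that a queue-like structure embeds a timer, and a hook that runs just before the memory is freed reports any timer in that range that is still armed.

```c
/* Userspace model only; NOT the kernel's lib/debugobjects.c. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct timer_list. */
struct model_timer {
	bool active;               /* true while the timer is armed */
	struct model_timer *next;  /* link in the global tracking list */
};

/* Hypothetical stand-in for a request queue that embeds a timeout timer. */
struct model_queue {
	int id;
	struct model_timer timeout;
};

static struct model_timer *tracked;  /* every timer the tracker knows about */

static void model_timer_init(struct model_timer *t)
{
	t->active = false;
	t->next = tracked;
	tracked = t;
}

static void model_timer_arm(struct model_timer *t)    { t->active = true; }
static void model_timer_cancel(struct model_timer *t) { t->active = false; }

/*
 * Run before [start, start + len) is freed. Returns how many tracked
 * timers in that range are still active, and stops tracking every
 * timer found in the range because its memory is about to go away.
 */
static int model_check_no_obj_freed(const void *start, size_t len)
{
	uintptr_t lo = (uintptr_t)start, hi = lo + len;
	struct model_timer **pp = &tracked;
	int hits = 0;

	while (*pp) {
		struct model_timer *t = *pp;
		uintptr_t p = (uintptr_t)t;

		if (p >= lo && p < hi) {
			if (t->active) {
				fprintf(stderr, "MODEL: free active object type: timer\n");
				hits++;
			}
			*pp = t->next;  /* unlink from the tracking list */
		} else {
			pp = &t->next;
		}
	}
	return hits;
}

static struct model_queue *model_queue_alloc(int id)
{
	struct model_queue *q = malloc(sizeof(*q));

	if (!q)
		return NULL;
	q->id = id;
	model_timer_init(&q->timeout);
	return q;
}

/* Frees the queue; returns the number of active timers that were inside it. */
static int model_queue_free(struct model_queue *q)
{
	int hits = model_check_no_obj_freed(q, sizeof(*q));

	free(q);
	return hits;
}
```

In this model, freeing a queue whose timer is still armed returns 1 and prints the warning, while calling model_timer_cancel() first makes the free return 0. In the kernel the corresponding fix is to make sure the timer is stopped (for example with del_timer_sync()) before the containing memory is released; which call site is missing that here is exactly what still needs to be tracked down.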
> Could you turn on SCSI logging so we can see the sequences. Probably
> since this is boot time, just enable all logging:
>
> echo 0xffffffff > /sys/module/scsi_mod/parameters/scsi_logging_level
>
> (kernel must be compiled with CONFIG_SCSI_LOGGING=y)
>
> James
>
>