public inbox for linux-kernel@vger.kernel.org
From: Greg KH <greg@kroah.com>
To: John Sigler <linux.kernel@free.fr>
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
	linux-pci@atrey.karlin.mff.cuni.cz
Subject: Re: How to debug complete kernel lock-ups
Date: Wed, 24 Oct 2007 08:56:37 -0700
Message-ID: <20071024155637.GA19062@kroah.com>
In-Reply-To: <471F0DB4.1080709@free.fr>

On Wed, Oct 24, 2007 at 11:17:40AM +0200, John Sigler wrote:
> John Sigler wrote:
>
>> I have an x86 system with two PCI slots, in which I inserted two
>> specialized output cards (Dektec DTA-105).
>> http://www.dektec.com/Products/DTA-105/
>> (They provide an open source driver.)
>> My problem is: when I write to the 4 ports (each card has 2 ports) "at the 
>> same time" (not really "at the same time" because I have a uni-processor 
>> system, so "within a short time frame" is more accurate) the system 
>> *completely* locks up.
>> The manufacturer told me they had seen the problem in their lab. I'm just 
>> trying to provide some helpful debug output to speed up the process of 
>> fixing the problem :-)
>> I've built a debug 2.6.22.1-rt9 kernel, hoping to get the kernel to dump 
>> something, anything.
>> +CONFIG_KALLSYMS_ALL=y
>> +CONFIG_PCI_DEBUG=y
>> +CONFIG_DEBUG_DRIVER=y
>> +CONFIG_PRINTK_TIME=y
>> +CONFIG_MAGIC_SYSRQ=y
>> +CONFIG_DEBUG_KERNEL=y
>> +CONFIG_DEBUG_SHIRQ=y
>> +CONFIG_DETECT_SOFTLOCKUP=y
>> +CONFIG_DEBUG_SLAB=y
>> +CONFIG_DEBUG_SLAB_LEAK=y
>> +CONFIG_DEBUG_PREEMPT=y
>> +CONFIG_DEBUG_RT_MUTEXES=y
>> +CONFIG_DEBUG_PI_LIST=y
>> +CONFIG_RT_MUTEX_TESTER=y
>> +CONFIG_DEBUG_SPINLOCK=y
>> +CONFIG_DEBUG_MUTEXES=y
>> +CONFIG_DEBUG_LOCK_ALLOC=y
>> +CONFIG_PROVE_LOCKING=y
>> +CONFIG_LOCKDEP=y
>> +CONFIG_TRACE_IRQFLAGS=y
>> +CONFIG_DEBUG_SPINLOCK_SLEEP=y
>> +CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
>> +CONFIG_STACKTRACE=y
>> +CONFIG_PREEMPT_TRACE=y
>> +CONFIG_DEBUG_BUGVERBOSE=y
>> +CONFIG_DEBUG_INFO=y
>> +CONFIG_FRAME_POINTER=y
>> +CONFIG_FORCED_INLINING=y
>> +CONFIG_DEBUG_STACKOVERFLOW=y
>> +CONFIG_DEBUG_RODATA=y
>> +CONFIG_4KSTACKS=y
>> I've enabled the serial console, and used SysRq to bump the console level 
>> to 9 (I want everything, even KERN_DEBUG output).
>> I've enabled the IO-APIC watchdog (nmi_watchdog=1).
>> Once the system locks up, I get no output, no panic, no oops.
>> The serial console is frozen, my ssh sessions are frozen.
>> Suppose the PCI bus "crashes" (whatever that means) or locks up.
>> Would that make the system completely unresponsive? The I/O does have to 
>> get to/from the south bridge, through the PCI bus AFAIU. I can imagine 
>> that a locked PCI bus would be slightly problematic.
>> Does this mean I need some kind of PCI bus analyzer (i.e. hardware) at 
>> this point? Is there anything more I can try?
>
> I've tested with a vanilla 2.6.22.10 kernel (no PREEMPT_RT patch).
> That system also locks up and remains completely unresponsive (I can't open 
> new ssh sessions, the system won't answer ICMP echo requests).
>
> How do driver writers deal with complete kernel hangs?

We slowly go crazy :)

Seriously, try to add debugging messages for where you think things
might be dying and slowly start working from there.  It's not a quick
thing to do at times...
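
A minimal sketch of that approach, written as user-space C so it can be
compiled and run outside the kernel. Every name here (CHECKPOINT,
last_checkpoint, fake_port_write, write_all_ports) is hypothetical and
not taken from the Dektec driver. The idea is only that each suspect
step is bracketed by a message that is pushed out before the next step
runs, so the last message seen on the console narrows down where things
stopped.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; not taken from the DTA-105 driver. */

static char last_checkpoint[128];	/* most recent checkpoint reached */

/*
 * Record a checkpoint and emit it immediately. stderr is normally
 * unbuffered; the explicit fflush() covers the case where it has been
 * given a buffer, so the message is out before the next step runs.
 */
#define CHECKPOINT(label)						\
	do {								\
		snprintf(last_checkpoint, sizeof(last_checkpoint),	\
			 "%s:%d %s", __func__, __LINE__, (label));	\
		fprintf(stderr, "TRACE %s\n", last_checkpoint);		\
		fflush(stderr);						\
	} while (0)

/* Stand-in for a suspect hardware access; returns 0 on success. */
static int fake_port_write(int port)
{
	return (port >= 0 && port < 4) ? 0 : -1;
}

/* Write to all four ports, bracketing each access with checkpoints. */
static int write_all_ports(void)
{
	int port;

	for (port = 0; port < 4; port++) {
		CHECKPOINT("before port write");
		if (fake_port_write(port) != 0)
			return -1;
		CHECKPOINT("after port write");
	}
	CHECKPOINT("all ports written");
	return 0;
}
```

In a driver the same pattern would presumably use printk() at
KERN_DEBUG with the serial console attached, rather than stderr. If the
last line on the console is a "before" message with no matching
"after", the access in between is the first place to look.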

Oh, and try using kdb; it sometimes works for people, depending on
your hardware and the problem.

good luck,

greg k-h

