From: Greg KH <greg@kroah.com>
To: John Sigler <linux.kernel@free.fr>
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
	linux-pci@atrey.karlin.mff.cuni.cz
Subject: Re: How to debug complete kernel lock-ups
Date: Wed, 24 Oct 2007 08:56:37 -0700
Message-ID: <20071024155637.GA19062@kroah.com>
In-Reply-To: <471F0DB4.1080709@free.fr>

On Wed, Oct 24, 2007 at 11:17:40AM +0200, John Sigler wrote:
> John Sigler wrote:
>
>> I have an x86 system with two PCI slots, in which I inserted two
>> specialized output cards (Dektec DTA-105).
>> http://www.dektec.com/Products/DTA-105/
>> (They provide an open source driver.)
>> My problem is: when I write to the 4 ports (each card has 2 ports) "at the 
>> same time" (not really "at the same time" because I have a uni-processor 
>> system, so "within a short time frame" is more accurate) the system 
>> *completely* locks up.
>> The manufacturer told me they had seen the problem in their lab. I'm just 
>> trying to provide some helpful debug output to speed up the process of 
>> fixing the problem :-)
>> I've built a debug 2.6.22.1-rt9 kernel, hoping to get the kernel to dump 
>> something, anything.
>> +CONFIG_KALLSYMS_ALL=y
>> +CONFIG_PCI_DEBUG=y
>> +CONFIG_DEBUG_DRIVER=y
>> +CONFIG_PRINTK_TIME=y
>> +CONFIG_MAGIC_SYSRQ=y
>> +CONFIG_DEBUG_KERNEL=y
>> +CONFIG_DEBUG_SHIRQ=y
>> +CONFIG_DETECT_SOFTLOCKUP=y
>> +CONFIG_DEBUG_SLAB=y
>> +CONFIG_DEBUG_SLAB_LEAK=y
>> +CONFIG_DEBUG_PREEMPT=y
>> +CONFIG_DEBUG_RT_MUTEXES=y
>> +CONFIG_DEBUG_PI_LIST=y
>> +CONFIG_RT_MUTEX_TESTER=y
>> +CONFIG_DEBUG_SPINLOCK=y
>> +CONFIG_DEBUG_MUTEXES=y
>> +CONFIG_DEBUG_LOCK_ALLOC=y
>> +CONFIG_PROVE_LOCKING=y
>> +CONFIG_LOCKDEP=y
>> +CONFIG_TRACE_IRQFLAGS=y
>> +CONFIG_DEBUG_SPINLOCK_SLEEP=y
>> +CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
>> +CONFIG_STACKTRACE=y
>> +CONFIG_PREEMPT_TRACE=y
>> +CONFIG_DEBUG_BUGVERBOSE=y
>> +CONFIG_DEBUG_INFO=y
>> +CONFIG_FRAME_POINTER=y
>> +CONFIG_FORCED_INLINING=y
>> +CONFIG_DEBUG_STACKOVERFLOW=y
>> +CONFIG_DEBUG_RODATA=y
>> +CONFIG_4KSTACKS=y
>> I've enabled the serial console, and used SysRq to bump the console level 
>> to 9 (I want everything, even KERN_DEBUG output).
>> I've enabled the IO-APIC watchdog (nmi_watchdog=1).
>> Once the system locks up, I get no output, no panic, no oops.
>> The serial console is frozen, my ssh sessions are frozen.
>> Suppose the PCI bus "crashes" (whatever that means) or locks up.
>> Would that make the system completely unresponsive? The I/O does have to 
>> get to/from the south bridge, through the PCI bus AFAIU. I can imagine 
>> that a locked PCI bus would be slightly problematic.
>> Does this mean I need some kind of PCI bus analyzer (i.e. hardware) at 
>> this point? Is there anything more I can try?
>
> I've tested with a vanilla 2.6.22.10 kernel (no PREEMPT_RT patch).
> That system also locks up and remains completely unresponsive (I can't open 
> new ssh sessions, the system won't answer ICMP echo requests).
>
> How do driver writers deal with complete kernel hangs?

We slowly go crazy :)

Seriously, try to add debugging messages for where you think things
might be dying and slowly start working from there.  It's not a quick
thing to do at times...

Oh, try using kdb, that sometimes will work for people, depending on
your hardware and problem.

good luck,

greg k-h
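
To illustrate the "add debugging messages" suggestion above, here is a
minimal sketch of checkpoint-style tracing. It is written as user-space C
so it can be compiled and run anywhere; in a real driver the fprintf()
call would be replaced by printk(KERN_DEBUG ...). The names CHECKPOINT,
last_checkpoint and demo_write_path are hypothetical and do not come from
the Dektec driver or from this thread.

```c
#include <stdio.h>

/* Hypothetical helper: remembers the last checkpoint that was reached. */
static int last_checkpoint;

/*
 * Print a marker and flush immediately, so the message is visible even
 * if execution stops right afterwards. In kernel code this would be a
 * printk(KERN_DEBUG ...) call instead of fprintf()/fflush().
 */
#define CHECKPOINT(id)                                            \
    do {                                                          \
        last_checkpoint = (id);                                   \
        fprintf(stderr, "checkpoint %d at %s:%d\n",               \
                (id), __func__, __LINE__);                        \
        fflush(stderr);                                           \
    } while (0)

/*
 * Stand-in for a driver code path (illustrative only). fail_at selects
 * the step after which the function bails out, simulating the point
 * where things "die".
 */
static int demo_write_path(int fail_at)
{
    CHECKPOINT(1);              /* entered the write path */
    if (fail_at == 1)
        return -1;

    CHECKPOINT(2);              /* after the second (hypothetical) step */
    if (fail_at == 2)
        return -1;

    CHECKPOINT(3);              /* path completed */
    return 0;
}
```

The last marker that shows up on the console brackets the failure: the
problem lies between that checkpoint and the next one, and further
markers can then be added inside that window to narrow it down.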

