* Re: PROBLEM: 2.6.7 Linux Kernel Crash While Detecting PCI Devices
2004-08-06 18:22 PROBLEM: 2.6.7 Linux Kernel Crash While Detecting PCI Devices John Riggs
@ 2004-08-17 20:22 ` Jonathan Sambrook
2004-08-17 20:55 ` PROBLEM: 2.6.7 Linux Kernel Crash While Detecting PCI Devices [ahem] Jonathan Sambrook
1 sibling, 0 replies; 4+ messages in thread
From: Jonathan Sambrook @ 2004-08-17 20:22 UTC (permalink / raw)
To: linux-kernel; +Cc: John Riggs, greg
At 12:22 on Fri 06/08/04, jriggs@altiris.com masquerading as 'John Riggs' wrote:
> Summary: 2.6.7 Linux Kernel Crash While Detecting PCI Devices
> Please CC me on any replies.
>
> Hi, I am responsible for maintaining a pre-boot Linux environment, for
> which we use a 2.6.7 linux kernel, booted with the freeloader boot
> loader. Our environment works well on most systems, but on this
> particular model of laptop the kernel crashes before I get a shell
> prompt. From the stack trace, it appears to be crashing during the PCI
> device detection. The root filesystem is loaded into a ramdisk. The
> crash doesn't always reproduce, and I'm not sure what changes that it
> does or does not reproduce. But I see the crash on more than 50% of the
> reboots.
>
> Oops output from ksymoops:
> Unable to handle kernel NULL pointer dereference at virtual address
> 00000008
> c015f846
> *pde = 00000000
> Oops: 0000 [#1]
> CPU: 0
> EIP: 0060:[<c015f846>] Not tainted
> Using defaults from ksymoops -t elf32-i386 -a i386
> EFLAGS: 00010246 (2.6.7)
> eax: 0000000f ebx: df7ab1b8 ecx: c0270970 edx: 00007782
> esi: df7ab178 edi: 00000000 ebp: c0276554 esp: df775ef8
> ds: 007b es: 007b ss: 0068
> Stack: c024ade6 df7ab1b8 df7ab178 df7ab378 df602c00 c019045f 00000000
> c0276554
> c01733b5 df7ab1f0 c0276554 df7ab1f0 df7ab238 c024e1d6 00000000
> 00000009
> df602c00 df7ab378 00000009 00000001 c01734f4 df7ab378 df602c00
> 00000009
> Call Trace:
> [<c019045f>] [<c01733b5>] [<c01734f4>] [<c0173b6c>] [<c0173cb9>]
> [<c023b34
> Code: 8b 47 08 5e 8d 48 68 ff 48 68 0f 88 64 01 00 00 8b 5d 00 53
>
>
> >>EIP; c015f846 <sysfs_add_file+16/a0> <=====
>
> >>ebx; df7ab1b8 <pg0+1f4e51b8/3fd38000>
> >>ecx; c0270970 <console_sem+0/10>
> >>esi; df7ab178 <pg0+1f4e5178/3fd38000>
> >>ebp; c0276554 <class_device_attr_cpuaffinity+0/14>
> >>esp; df775ef8 <pg0+1f4afef8/3fd38000>
>
> Trace; c019045f <class_device_create_file+1f/30>
> Trace; c01733b5 <pci_alloc_child_bus+75/c0>
> Trace; c01734f4 <pci_scan_bridge+b4/200>
> Trace; c0173b6c <pci_scan_child_bus+8c/a0>
> Trace; c0173cb9 <pci_scan_bus_parented+119/140>
>
> Code; c015f846 <sysfs_add_file+16/a0>
> 00000000 <_EIP>:
> Code; c015f846 <sysfs_add_file+16/a0> <=====
> 0: 8b 47 08 mov 0x8(%edi),%eax <=====
> Code; c015f849 <sysfs_add_file+19/a0>
> 3: 5e pop %esi
> Code; c015f84a <sysfs_add_file+1a/a0>
> 4: 8d 48 68 lea 0x68(%eax),%ecx
> Code; c015f84d <sysfs_add_file+1d/a0>
> 7: ff 48 68 decl 0x68(%eax)
> Code; c015f850 <sysfs_add_file+20/a0>
> a: 0f 88 64 01 00 00 js 174 <_EIP+0x174>
> Code; c015f856 <sysfs_add_file+26/a0>
> 10: 8b 5d 00 mov 0x0(%ebp),%ebx
> Code; c015f859 <sysfs_add_file+29/a0>
> 13: 53 push %ebx
>
> <0>Kernel panic: Attempted to kill init!
>
>
>
> Output from lspci -vvv: (Note: lspci seemed to be stuck in an infinite
> loop, printing the last two lines over and over)
> 00:00.0 Class f000: 0001:0000 (rev c3) (prog-if e2)
> Subsystem: 69d5:f000
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV+ VGASnoop-
> ParErr+ Stepping+ SERR+ FastB2B-
> Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort+ <MAbort+ >SERR+ <PERR+
> Latency: 105 (60000ns max), cache line size d5
> Interrupt: pin (c) routed to IRQ 0
> BIST is running
> Region 0: I/O ports at <ignored>
> Region 1: Memory at <ignored> (64-bit, non-prefetchable)
> [disabled]
> Region 3: I/O ports at <ignored>
> Region 4: Memory at <ignored> (64-bit, prefetchable) [disabled]
> Expansion ROM at f0006800 [disabled] [size=2K]
>
> 00:0d.0 Class c024: 0068:24cf (rev 60) (prog-if cf)
> Subsystem: 1c24:44c7
> Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
> ParErr+ Stepping+ SERR+ FastB2B+
> Status: Cap+ 66Mhz- UDF+ FastB2B+ ParErr- DEVSEL=fast >TAbort+
> <TAbort- <MAbort+ >SERR+ <PERR-
> Interrupt: pin ^[,C ^[(Brouted to IRQ 0
> Region 0: Memory at <ignored> (32-bit, non-prefetchable)
> [disabled]
> Region 1: Memory at <ignored> (low-1M, prefetchable) [disabled]
> Region 2: Memory at <ignored> (low-1M, non-prefetchable)
> [disabled]
> Region 3: I/O ports at <ignored> [disabled]
> Region 4: I/O ports at <ignored> [disabled]
> Region 5: Memory at <ignored> (type 3, non-prefetchable)
> [disabled]
> Expansion ROM at <unassigned> [disabled] [size=2K]
>
> 00:0e.0 Class 27bc: 6ce9:ffff (prog-if 8d)
> Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV+ VGASnoop+
> ParErr+ Stepping+ SERR+ FastB2B-
> Status: Cap+ 66Mhz+ UDF+ FastB2B- ParErr- DEVSEL=?? >TAbort-
> <TAbort- <MAbort+ >SERR- <PERR-
> Latency: 0
> Region 0: I/O ports at <ignored>
> Region 1: Memory at <ignored> (32-bit, non-prefetchable)
> Region 2: Memory at <ignored> (32-bit, non-prefetchable)
> Region 3: Memory at <ignored> (32-bit, non-prefetchable)
> Region 4: Memory at <ignored> (32-bit, non-prefetchable)
> Region 5: I/O ports at <ignored>
> Expansion ROM at 0001f800 [disabled] [size=2K]
>
> 00:0f.0 Class 26b4: c483:e910 (rev 90) (prog-if 8d)
> Subsystem: 748d:0026
> Control: I/O- Mem- BusMaster+ SpecCycle+ MemWINV+ VGASnoop+
> ParErr- Stepping- SERR+ FastB2B+
> Status: Cap+ 66Mhz+ UDF+ FastB2B+ ParErr+ DEVSEL=?? >TAbort+
> <TAbort+ <MAbort+ >SERR+ <PERR+
> Latency: 0 (34750ns min, 31000ns max)
> Interrupt: pin P routed to IRQ 0
> Region 0: I/O ports at <ignored> [disabled]
> Region 1: Memory at <ignored> (32-bit, non-prefetchable)
> [disabled]
> Region 2: I/O ports at <ignored> [disabled]
> Region 3: Memory at <ignored> (64-bit, non-prefetchable)
> [disabled]
> Region 5: Memory at <ignored> (32-bit, prefetchable) [disabled]
> Expansion ROM at e000b800 [disabled] [size=2K]
> Capabilities: [fc] #c0 [c8a1]
> Capabilities: [c0] #e8 [f9d4]
> Capabilities: [d8] #85 [2274]
> Capabilities: [c0] #e8 [f9d4]
> Capabilities: [d8] #85 [2274]
> Capabilities: [c0] #e8 [f9d4]
> Capabilities: [d8] #85 [2274]
> Capabilities: [c0] #e8 [f9d4]
> Capabilities: [d8] #85 [2274]
> Capabilities: [c0] #e8 [f9d4]
> Capabilities: [d8] #85 [2274]
> Capabilities: [c0] #e8 [f9d4]
> Capabilities: [d8] #85 [2274]
> C
>
> Listing of /proc/cpuinfo:
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 13
> model name : Intel(R) Pentium(R) M processor 1.73GHz
> stepping : 6
> cpu MHz : 1734.122
> cache size : 64 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca
> cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe tm2 est
> bogomips : 3416.06
>
>
> Listing of /proc/ioports and /proc/iomem:
> 0000-001f : dma1
> 0020-0021 : pic1
> 0040-005f : timer
> 0060-006f : keyboard
> 0080-008f : dma page reg
> 00a0-00a1 : pic2
> 00c0-00df : dma2
> 00f0-00ff : fpu
> 01f0-01f7 : ide0
> 03c0-03df : vga+
> 03f6-03f6 : ide0
> c000-efff : PCI Bus #4d
>
> 00000000-0009fbff : System RAM
> 0009fc00-0009ffff : reserved
> 000a0000-000bffff : Video RAM area
> 000c0000-000c7fff : Video ROM
> 000d0000-000d0fff : Adapter ROM
> 000d1000-000d2fff : Adapter ROM
> 000f0000-000fffff : System ROM
> 00100000-1f7cffff : System RAM
> 00100000-0023de35 : Kernel code
> 0023de36-0028a3ff : Kernel data
> 1f7d0000-1f7efbff : reserved
> 1f7efc00-1f7fafff : ACPI Non-volatile Storage
> 1f7fb000-1f7fffff : reserved
> 40800000-8b6fffff : PCI Bus #28
> e0000000-efffffff : reserved
> fec00000-fec01fff : reserved
> fed20000-fed9afff : reserved
> feda0000-fedbffff : reserved
> ffb00000-ffbfffff : reserved
> fff00000-ffffffff : reserved
>
> I don't have /proc/scsi/scsi on my system, nor /proc/modules
>
> Thank you,
> John
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Jonathan Sambrook
Software Developer
Designer Servers
* Re: PROBLEM: 2.6.7 Linux Kernel Crash While Detecting PCI Devices [ahem]
From: Jonathan Sambrook @ 2004-08-17 20:55 UTC (permalink / raw)
To: linux-kernel; +Cc: John Riggs, greg
At 12:22 on Fri 06/08/04, jriggs@altiris.com masquerading as 'John Riggs' wrote:
> Summary: 2.6.7 Linux Kernel Crash While Detecting PCI Devices
This looks similar to a problem I am seeing here. Using kgdb, I get the
following from 2.6.8.1:
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 253876k/262064k available (1680k kernel code, 7336k reserved, 1017k data, 164k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 3022.84 BogoMIPS
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: CLK_CTL MSR was 60031223. Reprogramming to 20031223
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Athlon(TM) XP 1800+ stepping 01
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
testing NMI watchdog ... OK.
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 1529.0101 MHz.
..... host bus clock speed is 265.0930 MHz.
checking if image is initramfs...spurious 8259A interrupt: IRQ7.
it isn't (no cpio magic); looks like an initrd
Freeing initrd memory: 1793k freed
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xf1b20, last bus=2
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
i8k: not running on a Dell system
i8k: vendor=System Manufacturer, model=System Name, version=ASU
i8k: unable to get SMM Dell signature
i8k: unable to get SMM BIOS version
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
Scanning bus 00
Found 00:00 [10de/01a4] 000600 00
Found 00:01 [10de/01ac] 000500 00
Found 00:02 [10de/01ad] 000500 00
Found 00:03 [10de/01aa] 000500 00
Found 00:08 [10de/01b2] 000601 00
Found 00:09 [10de/01b4] 000c05 00
Found 00:10 [10de/01c2] 000c03 00
Found 00:18 [10de/01c2] 000c03 00
Found 00:20 [10de/01c3] 000200 00
Found 00:28 [10de/01b0] 000401 00
Found 00:30 [10de/01b1] 000401 00
Found 00:40 [10de/01b8] 000604 01
Found 00:48 [10de/01bc] 000101 00
Found 00:f0 [10de/01b7] 000604 01
Fixups for bus 00
Scanning behind PCI bridge 0000:00:08.0, config 010100, pass 0
Scanning bus 01
Found 01:30 [104c/ac50] 000607 02
Fixups for bus 01
Scanning behind PCI bridge 0000:01:06.0, config 000000, pass 0
Scanning behind PCI bridge 0000:01:06.0, config 000000, pass 1
Bus scan for 01 returning with max=05
Scanning behind PCI bridge 0000:00:1e.0, config 020200, pass 0
[New Thread 1]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1]
sysfs_add_file (dir=0x0, attr=0xc02f7a9c) at semaphore.h:115
115 {
(gdb) bt
#0 sysfs_add_file (dir=0x0, attr=0xc02f7a9c) at semaphore.h:115
#1 0xc01e64e4 in class_device_create_file (class_dev=0xcffdbcc0, attr=0x0)
at drivers/base/class.c:171
#2 0xc01c02ad in pci_alloc_child_bus (parent=0x0, bridge=0xcfdfd400, busnr=2)
at drivers/pci/probe.c:299
#3 0xc01c0542 in pci_scan_bridge (bus=0xcffda160, dev=0xcfdfd400, max=5,
pass=0) at drivers/pci/probe.c:368
#4 0xc01c0baa in pci_scan_child_bus (bus=0x0) at drivers/pci/probe.c:718
#5 0xc01c0d2c in pci_scan_bus_parented (parent=0x0, bus=0, ops=0x0,
sysdata=0x0) at drivers/pci/probe.c:790
#6 0xc024acd0 in pcibios_scan_root (busnum=0) at pci.h:702
#7 0xc03c0b79 in pci_legacy_init () at arch/i386/pci/legacy.c:47
#8 0xc03a682c in do_initcalls () at init/main.c:571
#9 0xc010041d in init (unused=0x0) at init/main.c:677
(gdb) p dir
$1 = (struct dentry *) 0x0
(gdb) up
#1 0xc01e64e4 in class_device_create_file (class_dev=0xcffdbcc0, attr=0x0)
at drivers/base/class.c:171
171 error = sysfs_create_file(&class_dev->kobj, &attr->attr);
(gdb) p class_dev
$2 = (struct class_device *) 0xcffdbcc0
(gdb) p *class_dev
$3 = {node = {next = 0xcffdbcc4, prev = 0x30303030}, kobj = {
k_name = 0x32303a <Address 0x32303a out of bounds>,
name = '\0' <repeats 12 times>, "\001\0\0\0Ü??ýÏ", refcount = {
counter = -805454628}, entry = {next = 0xc02f7a34, prev = 0xc032e4e0},
parent = 0x0, kset = 0x0, ktype = 0xc02f7a20, dentry = 0x0}, class = 0x0,
dev = 0x30303030, class_data = 0x32303a,
class_id = '\0' <repeats 12 times>, "Îÿ{ÛNmÎÿ"}
(gdb) up
#2 0xc01c02ad in pci_alloc_child_bus (parent=0x0, bridge=0xcfdfd400, busnr=2)
at drivers/pci/probe.c:299
299 class_device_create_file(&child->class_dev, &class_device_attr_cpuaffinity);
(gdb)
#3 0xc01c0542 in pci_scan_bridge (bus=0xcffda160, dev=0xcfdfd400, max=5,
pass=0) at drivers/pci/probe.c:368
368 child = pci_alloc_child_bus(bus, dev, busnr);
(gdb)
#4 0xc01c0baa in pci_scan_child_bus (bus=0x0) at drivers/pci/probe.c:718
718 max = pci_scan_bridge(bus, dev, max, pass);
(gdb) p bus
$8 = (struct pci_bus *) 0x0
(gdb) up
#5 0xc01c0d2c in pci_scan_bus_parented (parent=0x0, bus=0, ops=0x0,
sysdata=0x0) at drivers/pci/probe.c:790
790 b->subordinate = pci_scan_child_bus(b);
(gdb) p b
$7 = (struct pci_bus *) 0xcffda160
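The dir=0x0 argument in frame #0 appears consistent with the fault address in
the original 2.6.7 oops, where the faulting instruction in sysfs_add_file was
"mov 0x8(%edi),%eax" and edi was 00000000. A minimal sketch of that
arithmetic follows. The helper name effective_address and the operand parser
are hypothetical and only handle the simple disp(%reg) form; the register
values are copied from the quoted oops.

```python
import re

# Parses an AT&T-syntax memory operand of the simple form disp(%reg),
# e.g. "0x8(%edi)". Scaled-index forms are deliberately not handled.
OPERAND_RE = re.compile(r"(?P<disp>-?0x[0-9a-f]+|-?\d+)?\(%(?P<reg>[a-z]+)\)")


def effective_address(operand, registers):
    """Return the 32-bit address that `operand` refers to (hypothetical helper)."""
    m = OPERAND_RE.fullmatch(operand)
    if m is None:
        raise ValueError("unsupported operand: %r" % operand)
    disp = int(m.group("disp"), 0) if m.group("disp") else 0
    # Wrap to 32 bits, since the oops comes from an i386 kernel.
    return (registers[m.group("reg")] + disp) & 0xFFFFFFFF


# Register values copied from the 2.6.7 oops quoted earlier in the thread.
oops_registers = {
    "eax": 0x0000000F, "ebx": 0xDF7AB1B8, "ecx": 0xC0270970, "edx": 0x00007782,
    "esi": 0xDF7AB178, "edi": 0x00000000, "ebp": 0xC0276554, "esp": 0xDF775EF8,
}

fault = effective_address("0x8(%edi)", oops_registers)
print("%08x" % fault)
```

With edi at zero the load targets address 00000008, which matches the
"NULL pointer dereference at virtual address 00000008" line of the oops. That
suggests both kernels fail the same way: a NULL directory pointer reaches
sysfs_add_file.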
Rebooting into 2.4.27 (which boots, although 2.4 kernels since 2.4.23 have
not worked with the TI PCI1410 CardBus controller) lets me extract the
following info:
$ lspci
00:00.0 Host bridge: nVidia Corporation nForce CPU bridge (rev b2)
00:00.1 RAM memory: nVidia Corporation nForce 220/420 Memory Controller (rev b2)
00:00.2 RAM memory: nVidia Corporation nForce 220/420 Memory Controller (rev b2)
00:00.3 RAM memory: nVidia Corporation: Unknown device 01aa (rev b2)
00:01.0 ISA bridge: nVidia Corporation nForce ISA Bridge (rev c3)
00:01.1 SMBus: nVidia Corporation nForce PCI System Management (rev c1)
00:02.0 USB Controller: nVidia Corporation nForce USB Controller (rev c3)
00:03.0 USB Controller: nVidia Corporation nForce USB Controller (rev c3)
00:04.0 Ethernet controller: nVidia Corporation nForce Ethernet Controller (rev c2)
00:05.0 Multimedia audio controller: nVidia Corporation: Unknown device 01b0 (rev c2)
00:06.0 Multimedia audio controller: nVidia Corporation nForce Audio (rev c2)
00:08.0 PCI bridge: nVidia Corporation nForce PCI-to-PCI bridge (rev c2)
00:09.0 IDE interface: nVidia Corporation nForce IDE (rev c3)
00:1e.0 PCI bridge: nVidia Corporation nForce AGP to PCI Bridge (rev b2)
01:06.0 CardBus bridge: Texas Instruments PCI1410 PC card Cardbus Controller (rev 01)
02:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 Pro Ultra TF
$ lspci -t
-[00]-+-00.0
+-00.1
+-00.2
+-00.3
+-01.0
+-01.1
+-02.0
+-03.0
+-04.0
+-05.0
+-06.0
+-08.0-[01]----06.0
+-09.0
\-1e.0-[02]----00.0
$ lspci -n
00:00.0 Class 0600: 10de:01a4 (rev b2)
00:00.1 Class 0500: 10de:01ac (rev b2)
00:00.2 Class 0500: 10de:01ad (rev b2)
00:00.3 Class 0500: 10de:01aa (rev b2)
00:01.0 Class 0601: 10de:01b2 (rev c3)
00:01.1 Class 0c05: 10de:01b4 (rev c1)
00:02.0 Class 0c03: 10de:01c2 (rev c3)
00:03.0 Class 0c03: 10de:01c2 (rev c3)
00:04.0 Class 0200: 10de:01c3 (rev c2)
00:05.0 Class 0401: 10de:01b0 (rev c2)
00:06.0 Class 0401: 10de:01b1 (rev c2)
00:08.0 Class 0604: 10de:01b8 (rev c2)
00:09.0 Class 0101: 10de:01bc (rev c3)
00:1e.0 Class 0604: 10de:01b7 (rev b2)
01:06.0 Class 0607: 104c:ac50 (rev 01)
02:00.0 Class 0300: 1002:5446
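To cross-check the "Found bb:dd [vvvv/dddd]" lines from the 2.6.8.1 scan
against the lspci -n output, here is a rough sketch. It assumes the second
field of each "Found" line is the raw devfn byte in hex, which the listings
above seem to bear out. The names FOUND_RE and decode_found_line are
hypothetical; the shift and mask are the same arithmetic as the kernel's
PCI_SLOT() and PCI_FUNC() macros.

```python
import re

# Matches the kernel's PCI debug lines, e.g. "Found 00:f0 [10de/01b7] 000604 01".
# Field meanings (bus:devfn, vendor/device, class, header type) are assumed
# from the log, not taken from kernel documentation.
FOUND_RE = re.compile(
    r"Found (?P<bus>[0-9a-f]{2}):(?P<devfn>[0-9a-f]{2}) "
    r"\[(?P<vendor>[0-9a-f]{4})/(?P<device>[0-9a-f]{4})\] "
    r"(?P<cls>[0-9a-f]{6}) (?P<hdr>[0-9a-f]{2})"
)


def decode_found_line(line):
    """Return (lspci-style address, "vendor:device"), or None if no match."""
    m = FOUND_RE.search(line)
    if m is None:
        return None
    devfn = int(m.group("devfn"), 16)
    slot = (devfn >> 3) & 0x1F  # same arithmetic as PCI_SLOT(devfn)
    func = devfn & 0x07         # same arithmetic as PCI_FUNC(devfn)
    addr = "%s:%02x.%x" % (m.group("bus"), slot, func)
    ids = "%s:%s" % (m.group("vendor"), m.group("device"))
    return addr, ids


for line in (
    "Found 00:40 [10de/01b8] 000604 01",
    "Found 00:f0 [10de/01b7] 000604 01",
    "Found 01:30 [104c/ac50] 000607 02",
):
    print(decode_found_line(line))
```

Under that assumption, 00:40 and 00:f0 decode to 00:08.0 and 00:1e.0, the two
PCI bridges in the lspci listing, and 01:30 decodes to 01:06.0, the CardBus
bridge. The fault occurs while scanning behind 0000:00:1e.0.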
I can do more debugging; what would you like to know?
Regards,
Jonathan
--
Jonathan Sambrook
Software Developer
Designer Servers