From mboxrd@z Thu Jan 1 00:00:00 1970
From: PATROL Quality Assurance Account
Subject: (unknown)
Date: Wed, 19 Jun 2002 05:08:40 -0500
Sender: linux-smp-owner@vger.kernel.org
Message-ID: <200206191008.g5JA8eP07291@vml3.bmc.com>
Return-path:
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-smp@vger.kernel.org

As you will see in the text below (dmesg.out), there is a request to send this to you ladies and gentlemen. If you figure anything out, or if I can provide you with any further information, please email me: pstalder@bmc.com

This is the architecture of the machine:

Dual PIII - 866
1024 MB RAM
ABIT VP6II Motherboard
2 - 40GB WD Caviar 7200 RPM Drives (Software RAID 0)
3com 905b ethernet card
MATROX Millennium graphics card
Creative Labs 52X cdrom
Sony 1.44 MB floppy

Pretty basic stuff, really. The system crashes at a command prompt with no X running, or will run for several days with 100% CPU usage, and over 100 applications running. We had a power outage a few weeks ago, and I suspect that has something to do with this particular problem. The other 3 very rarely go down (one of them has not crashed, yet).

I hope this helps you, and I hope you will be able to help me. I have not found a diagnostic tool yet that can figure this one out.

Paul Stalder

0000100000 (usable)
BIOS-e820: 000000000000d000 @ 000000003fff3000 (ACPI data)
BIOS-e820: 0000000000003000 @ 000000003fff0000 (ACPI NVS)
127MB HIGHMEM available.
hm, page 00001000 reserved twice.
Scan SMP from c0000000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f0000 for 65536 bytes.
found SMP MP-table at 000f5770
hm, page 000f5000 reserved twice.
hm, page 000f6000 reserved twice.
hm, page 000f7000 reserved twice.
hm, page 000f1000 reserved twice.
hm, page 000f2000 reserved twice.
hm, page 000f3000 reserved twice.
On node 0 totalpages: 262128
zone(0): 4096 pages.
zone DMA has max 32 cached pages.
zone(1): 225280 pages.
zone Normal has max 1024 cached pages.
zone(2): 32752 pages.
zone HighMem has max 255 cached pages.
Intel MultiProcessor Specification v1.1
Virtual Wire compatibility mode.
OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000
Processor #0 Pentium(tm) Pro APIC version 17
Floating point unit present.
Machine Exception supported.
64 bit compare & exchange supported.
Internal APIC present.
SEP present.
MTRR present.
PGE present.
MCA present.
CMOV present.
Bootup CPU
Processor #1 Pentium(tm) Pro APIC version 17
Floating point unit present.
Machine Exception supported.
64 bit compare & exchange supported.
Internal APIC present.
SEP present.
MTRR present.
PGE present.
MCA present.
CMOV present.
Bus #0 is PCI
Bus #1 is PCI
Bus #2 is ISA
I/O APIC #2 Version 17 at 0xFEC00000.
Int: type 3, pol 0, trig 0, bus 2, IRQ 00, APIC ID 2, APIC INT 00
Int: type 0, pol 0, trig 0, bus 2, IRQ 01, APIC ID 2, APIC INT 01
Int: type 0, pol 0, trig 0, bus 2, IRQ 00, APIC ID 2, APIC INT 02
Int: type 0, pol 0, trig 0, bus 2, IRQ 03, APIC ID 2, APIC INT 03
Int: type 0, pol 0, trig 0, bus 2, IRQ 04, APIC ID 2, APIC INT 04
Int: type 0, pol 0, trig 0, bus 2, IRQ 05, APIC ID 2, APIC INT 05
Int: type 0, pol 0, trig 0, bus 2, IRQ 06, APIC ID 2, APIC INT 06
Int: type 0, pol 0, trig 0, bus 2, IRQ 07, APIC ID 2, APIC INT 07
Int: type 0, pol 1, trig 1, bus 2, IRQ 08, APIC ID 2, APIC INT 08
Int: type 0, pol 0, trig 0, bus 2, IRQ 09, APIC ID 2, APIC INT 09
Int: type 0, pol 0, trig 0, bus 2, IRQ 0c, APIC ID 2, APIC INT 0c
Int: type 0, pol 0, trig 0, bus 2, IRQ 0d, APIC ID 2, APIC INT 0d
Int: type 0, pol 0, trig 0, bus 2, IRQ 0e, APIC ID 2, APIC INT 0e
Int: type 0, pol 0, trig 0, bus 2, IRQ 0f, APIC ID 2, APIC INT 0f
Int: type 0, pol 3, trig 3, bus 2, IRQ 0b, APIC ID 2, APIC INT 0b
Int: type 0, pol 3, trig 3, bus 2, IRQ 0a, APIC ID 2, APIC INT 0a
Lint: type 3, pol 0, trig 0, bus 2, IRQ 00, APIC ID ff, APIC LINT 00
Lint: type 1, pol 0, trig 0, bus 2, IRQ 00, APIC ID ff, APIC LINT 01
Processors: 2
mapped APIC to ffffe000 (fee00000)
mapped IOAPIC to ffffd000 (fec00000)
hm, page 01000000 reserved twice.
Kernel command line: auto BOOT_IMAGE=linux ro root=301 BOOT_FILE=/boot/vmlinuz-2.4.2-2smp
Initializing CPU#0
Detected 865.248 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 1723.59 BogoMIPS
Memory: 1027868k/1048512k available (1500k kernel code, 20256k reserved, 103k data, 252k init, 131008k highmem)
Dentry-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Buffer-cache hash table entries: 65536 (order: 6, 262144 bytes)
Page-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
VFS: Diskquotas version dquot_6.5.0 initialized
CPU: Before vendor init, caps: 0387fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0387fbff 00000000 00000000 00000000
CPU serial number disabled.
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.37 (20001109) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: Intel
CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
CPU0: Intel Pentium III (Coppermine) stepping 03
per-CPU timeslice cutoff: 730.79 usecs.
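The zone and memory figures reported so far can be cross-checked with simple arithmetic. The following is a minimal Python sketch, assuming 4 KiB pages (the usual i386 page size, not stated in the log itself); the variable and function names are hypothetical and chosen only for illustration.

```python
# Cross-check the page counts printed in the log against the "Memory:" line.
# Assumption: 4 KiB pages (typical for i386); the log does not state this.
PAGE_SIZE_KIB = 4

# Page counts copied from the zone(0), zone(1) and zone(2) lines of the log.
zone_pages = {"DMA": 4096, "Normal": 225280, "HighMem": 32752}


def pages_to_kib(pages, page_kib=PAGE_SIZE_KIB):
    """Convert a page count to KiB under the assumed page size."""
    return pages * page_kib


total_pages = sum(zone_pages.values())
total_kib = pages_to_kib(total_pages)
highmem_kib = pages_to_kib(zone_pages["HighMem"])

if __name__ == "__main__":
    print("total pages:", total_pages)
    print("total KiB:", total_kib)
    print("highmem KiB:", highmem_kib)
```

Under that assumption the three zones add up to the 262128 pages reported as "totalpages", and the totals match the 1048512k and 131008k figures on the "Memory:" line.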
Getting VERSION: 40011
Getting VERSION: 40011
Getting ID: 0
Getting ID: f000000
Getting LVT0: 700
Getting LVT1: 400
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
CPU present map: 3
Booting processor 1/1 eip 3000
Setting warm reset code and vector.
1.
2.
3.
Asserting INIT.
Waiting for send to finish...
+Deasserting INIT.
Waiting for send to finish...
+#startup loops: 2.
Sending STARTUP #1.
After apic_write.
Initializing CPU#1
CPU#1 (phys ID: 1) waiting for CALLOUT
Startup point 1.
Waiting for send to finish...
+Sending STARTUP #2.
After apic_write.
Startup point 1.
Waiting for send to finish...
+After Startup.
Before Callout 1.
After Callout 1.
CALLIN, before setup_local_APIC().
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 1730.15 BogoMIPS
Stack at about c212dfb8
CPU: Before vendor init, caps: 0387fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check reporting enabled on CPU#1.
CPU: After vendor init, caps: 0387fbff 00000000 00000000 00000000
CPU serial number disabled.
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
OK.
CPU1: Intel Pentium III (Coppermine) stepping 03
CPU has booted.
Before bogomips.
Total of 2 processors activated (3453.74 BogoMIPS).
Before bogocount - setting activated=1.
Boot done.
ENABLING IO-APIC IRQs
...changing IO-APIC physical APIC ID to 2 ... ok.
Synchronizing Arb IDs.
init IO_APIC IRQs
IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=49 pin1=2 pin2=0
number of MP IRQ sources: 16.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
IO APIC #2......
.... register #00: 02000000
....... : physical APIC id: 02
.... register #01: 00178011
....... : max redirection entries: 0017
....... : IO APIC version: 0011
WARNING: unexpected IO-APIC, please mail to linux-smp@vger.kernel.org
.... register #02: 00000000
....... : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 000 00  1    0    0   0   0    0    0    00
 01 003 03  0    0    0   0   0    1    1    39
 02 003 03  0    0    0   0   0    1    1    31
 03 003 03  0    0    0   0   0    1    1    41
 04 003 03  0    0    0   0   0    1    1    49
 05 003 03  0    0    0   0   0    1    1    51
 06 003 03  0    0    0   0   0    1    1    59
 07 003 03  0    0    0   0   0    1    1    61
 08 003 03  0    0    0   0   0    1    1    69
 09 003 03  0    0    0   0   0    1    1    71
 0a 003 03  1    1    0   1   0    1    1    79
 0b 003 03  1    1    0   1   0    1    1    81
 0c 003 03  0    0    0   0   0    1    1    89
 0d 003 03  0    0    0   0   0    1    1    91
 0e 003 03  0    0    0   0   0    1    1    99
 0f 003 03  0    0    0   0   0    1    1    A1
 10 000 00  1    0    0   0   0    0    0    00
 11 000 00  1    0    0   0   0    0    0    00
 12 000 00  1    0    0   0   0    0    0    00
 13 000 00  1    0    0   0   0    0    0    00
 14 000 00  1    0    0   0   0    0    0    00
 15 000 00  1    0    0   0   0    0    0    00
 16 000 00  1    0    0   0   0    0    0    00
 17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
.................................... done.
calibrating APIC timer ...
..... CPU clock speed is 865.2644 MHz.
..... host bus clock speed is 133.1174 MHz.
cpu: 0, clocks: 1331174, slice: 443724
CPU0
cpu: 1, clocks: 1331174, slice: 443724
CPU1
checking TSC synchronization across CPUs: passed.
Setting commenced=1, go go go
mtrr: your CPUs had inconsistent variable MTRR settings
mtrr: probably your BIOS does not setup all CPUs
PCI: PCI BIOS revision 2.10 entry at 0xfb3a0, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
Unknown bridge resource 0: assuming transparent
Unknown bridge resource 1: assuming transparent
Unknown bridge resource 2: assuming transparent
PCI: Using IRQ router VIA [1106/0686] at 00:07.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.14)
apm: disabled - APM is not SMP safe.
Starting kswapd v1.8
pty: 256 Unix98 ptys configured
block: queued sectors max/low 682541kB/551469kB, 2048 slots per queue
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:07.1
ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:pio, hdd:pio
hda: WDC WD400BB-00AUA1, ATA DISK drive
hdc: WDC WD400BB-00AUA1, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 78165360 sectors (40021 MB) w/2048KiB Cache, CHS=4865/255/63, UDMA(33)
hdc: 78165360 sectors (40021 MB) w/2048KiB Cache, CHS=77545/16/63, UDMA(33)
Partition check:
hda: hda1 hda2 < hda5 hda6 >
hdc: [PTBL] [4865/255/63] hdc1 hdc2 < hdc5 >
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 248k freed
Serial driver version 5.02 (2000-08-09) with MANY_PORTS MULTIPORT SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.10d
md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md.c: sizeof(mdp_super_t) = 4096
autodetecting RAID arrays
(read) hda5's sb offset: 36869056 [events: 00000013]
(read) hdc1's sb offset: 36869056 [events: 00000013]
autorun ...
considering hdc1 ...
adding hdc1 ...
adding hda5 ...
created md0
bind
bind
running:
hdc1's event counter: 00000013
hda5's event counter: 00000013
request_module[md-personality-2]: Root fs not mounted
md.c: personality 2 is not loaded!
do_md_run() returned -22
md0 stopped.
unbind
export_rdev(hdc1)
unbind
export_rdev(hda5)
... autorun DONE.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 8192 buckets, 64Kbytes
TCP: Hash tables configured (established 262144 bind 65536)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
VFS: Mounted root (ext2 filesystem).
raid0 personality registered as nr 2
autodetecting RAID arrays
(read) hdc1's sb offset: 36869056 [events: 00000013]
(read) hda5's sb offset: 36869056 [events: 00000013]
autorun ...
considering hda5 ...
adding hda5 ...
adding hdc1 ...
created md0
bind
bind
running:
hda5's event counter: 00000013
hdc1's event counter: 00000013
md0: max total readahead window set to 2048k
md0: 2 data-disks, max readahead per data-disk: 1024k
raid0: looking at hdc1
raid0: comparing hdc1(36869056) with hdc1(36869056)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at hda5
raid0: comparing hda5(36869056) with hdc1(36869056)
raid0: EQUAL
raid0: FINAL 1 zones
zone 0
checking hdc1 ... contained as device 0
(36869056) is smallest!.
checking hda5 ... contained as device 1
zone->nb_dev: 2, size: 73738112
current zone offset: 36869056
done.
raid0 : md_size is 73738112 blocks.
raid0 : conf->smallest->size is 73738112 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
md: updating md0 RAID superblock on device
hda5 [events: 00000014](write) hda5's sb offset: 36869056
hdc1 [events: 00000014](write) hdc1's sb offset: 36869056
.
... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
change_root: old root has d_count=3
Trying to unmount old root ... okay
Freeing unused kernel memory: 252k freed
Adding Swap: 1542200k swap-space (priority -1)
Winbond Super-IO detection, now testing ports 3F0,370,250,4E,2E ...
SMSC Super-IO detection, now testing Ports 2F0, 370 ...
parport0: PC-style at 0x378, irq 7 [PCSPP,EPP]
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport_pc: Via 686A parallel port: io=0x378, irq=7
ip_conntrack (8191 buckets, 65528 max)
3c59x.c:LK1.1.13 27 Jan 2001 Donald Becker and others. http://www.scyld.com/network/vortex.html
See Documentation/networking/vortex.txt
eth0: 3Com PCI 3c905C Tornado at 0xec00, 00:01:03:1f:6a:c5, IRQ 11
product code 484e rev 00.3 date 09-16-00
8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
MII transceiver found at address 24, status 782d.
Enabling bus-master transmits and whole-frame receives.
eth0: scatter/gather disabled. h/w checksums enabled
eth0: using NWAY device table, not 8
eth0: using NWAY device table, not 8
APIC error on CPU1: 00(08)
APIC error on CPU0: 00(04)
/dev/vmmon: Module vmmon: registered with major=10 minor=165 tag=$Name: build-1790 $
/dev/vmmon: Module vmmon: initialized
Winbond Super-IO detection, now testing ports 3F0,370,250,4E,2E ...
SMSC Super-IO detection, now testing Ports 2F0, 370 ...
parport0: PC-style at 0x378, irq 7 [PCSPP,EPP]
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport_pc: Via 686A parallel port: io=0x378, irq=7
/dev/vmnet: open called by PID 688 (vmnet-bridge)
/dev/vmnet: hub 0 does not exist, allocating memory.
/dev/vmnet: port on hub 0 successfully opened
bridge-eth0: up
bridge-eth0: attached
/dev/vmnet: open called by PID 706 (vmnet-natd)
/dev/vmnet: hub 8 does not exist, allocating memory.
/dev/vmnet: port on hub 8 successfully opened
APIC error on CPU0: 04(01)
APIC error on CPU0: 01(04)
APIC error on CPU1: 08(02)
APIC error on CPU0: 04(04)
eth0: Too much work in interrupt, status e401.
APIC error on CPU0: 04(01)
APIC error on CPU1: 02(02)
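
For anyone trying to read the "APIC error" lines in this log, here is a minimal Python sketch that decodes the two hex values on each line. It rests on two assumptions that are not stated in the log: that a 2.4-era kernel prints two successive reads of the local APIC Error Status Register (ESR) as `first(second)`, and that the bits follow the commonly documented ESR layout (bit 0 send checksum, bit 1 receive checksum, bit 2 send accept, bit 3 receive accept, bit 5 send illegal vector, bit 6 received illegal vector, bit 7 illegal register address). The names `decode_esr` and `parse_apic_errors` are hypothetical helpers, not part of any kernel tool.

```python
import re

# Assumed ESR bit meanings (bit 4 is treated as reserved and ignored).
ESR_BITS = {
    0: "send checksum error",
    1: "receive checksum error",
    2: "send accept error",
    3: "receive accept error",
    5: "send illegal vector",
    6: "received illegal vector",
    7: "illegal register address",
}

# Matches lines such as "APIC error on CPU1: 00(08)".
APIC_ERROR_RE = re.compile(
    r"APIC error on CPU(\d+): ([0-9a-fA-F]{2})\(([0-9a-fA-F]{2})\)"
)


def decode_esr(value):
    """Return the names of all assumed ESR bits set in value."""
    return [name for bit, name in sorted(ESR_BITS.items()) if value & (1 << bit)]


def parse_apic_errors(log_text):
    """Return (cpu, first_read_errors, second_read_errors) per matching line."""
    results = []
    for match in APIC_ERROR_RE.finditer(log_text):
        cpu = int(match.group(1))
        first = int(match.group(2), 16)
        second = int(match.group(3), 16)
        results.append((cpu, decode_esr(first), decode_esr(second)))
    return results


if __name__ == "__main__":
    sample = (
        "APIC error on CPU1: 00(08)\n"
        "APIC error on CPU0: 00(04)\n"
        "APIC error on CPU0: 04(01)\n"
    )
    for cpu, first, second in parse_apic_errors(sample):
        print("CPU", cpu, "first:", first, "second:", second)
```

Under those assumptions, every value in this log (01, 02, 04, 08) falls in bits 0 to 3, that is, checksum or accept errors; no illegal-vector or illegal-register bits are set.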