From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Eric Webber"
Subject: SMP Issues with IBM X Series 230 eServer Model 61Y
Date: Tue, 13 Aug 2002 16:11:17 -0400
Sender: linux-smp-owner@vger.kernel.org
Message-ID: <000f01c24305$97cf06e0$4400a8c0@Eric>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
List-Id:
Content-Type: text/plain; charset="us-ascii"
To: linux-smp@vger.kernel.org

Here is our dmesg output. We are using an IBM X Series 230 E-Server
Model 61Y. The dmesg output below indicates that there might be SMP
[Symmetric Multiprocessing] issues with our configuration.

Do these SMP issues affect us? We occasionally have system lockups,
which are VERY rare for Linux.

If we select the non-SMP option at boot time in GRUB, what are the
ramifications? How can we tell if we are hanging because of SMP issues?
Are there any clear symptoms?

Warmest regards,

Eric Sean Webber
ewebber@SiliconBeachSystems.com

-----Original Message-----
From: Eric Sean Webber [mailto:eric@localhost.localdomain]
Sent: Tuesday, August 13, 2002 2:16 PM
To: ewebber@SiliconBeachSystems.com
Subject:

Linux version 2.4.7-10smp (bhcompile@stripples.devel.redhat.com) (gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98)) #1 SMP Thu Sep 6 17:09:31 EDT 2001
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009d000 (usable)
BIOS-e820: 000000000009d000 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fffc300 (usable)
BIOS-e820: 000000003fffc300 - 0000000040000000 (ACPI data)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
Scanning bios EBDA for MXT signature
127MB HIGHMEM available.
found SMP MP-table at 0009e140
hm, page 0009e000 reserved twice.
hm, page 0009f000 reserved twice.
hm, page 0009e000 reserved twice.
hm, page 0009f000 reserved twice.
WARNING: MP table in the EBDA can be UNSAFE, contact linux-smp@vger.kernel.org if you experience SMP problems!
On node 0 totalpages: 262140
zone(0): 4096 pages.
zone(1): 225280 pages.
zone(2): 32764 pages.
Intel MultiProcessor Specification v1.4
Virtual Wire compatibility mode.
OEM ID: IBM GNK Product ID: Teton SMP APIC at: 0xFEE00000
Processor #1 Pentium(tm) Pro APIC version 17
Processor #0 Pentium(tm) Pro APIC version 17
I/O APIC #14 Version 17 at 0xFEC00000.
I/O APIC #15 Version 17 at 0xFEC01000.
Processors: 2
Kernel command line: ro root=/dev/sdb1
Initializing CPU#0
Detected 996.893 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 1985.74 BogoMIPS
Memory: 1026736k/1048560k available (1396k kernel code, 20408k reserved, 102k data, 240k init, 131056k highmem)
Dentry-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Mount-cache hash table entries: 16384 (order: 5, 131072 bytes)
Buffer-cache hash table entries: 65536 (order: 6, 262144 bytes)
Page-cache hash table entries: 262144 (order: 9, 2097152 bytes)
CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
IBM machine detected. Enabling interrupts during APM calls.
mtrr: v1.40 (20010327) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: Intel
CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
CPU0: Intel Pentium III (Coppermine) stepping 06
per-CPU timeslice cutoff: 731.68 usecs.
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Booting processor 1/0 eip 2000
Initializing CPU#1
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 1992.29 BogoMIPS
CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check reporting enabled on CPU#1.
CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
CPU1: Intel Pentium III (Coppermine) stepping 06
Total of 2 processors activated (3978.03 BogoMIPS).
ENABLING IO-APIC IRQs
...changing IO-APIC physical APIC ID to 14 ... ok.
BIOS bug, IO-APIC#1 ID is 15 in the MPC table!...
... fixing up to 15. (tell your hw vendor)
...changing IO-APIC physical APIC ID to 15 ... ok.
init IO_APIC IRQs
IO-APIC (apicid-pin) 14-0, 14-5, 15-2, 15-3, 15-5, 15-6, 15-7, 15-8, 15-9, 15-14, 15-15 not connected.
..TIMER: vector=0x31 pin1=2 pin2=0
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...
..... (found pin 0) ...works.
number of MP IRQ sources: 22.
number of IO-APIC #14 registers: 16.
number of IO-APIC #15 registers: 16.
testing the IO APIC.......................

IO APIC #14......
.... register #00: 0E000000
....... : physical APIC id: 0E
.... register #01: 000F0011
....... : max redirection entries: 000F
....... : IO APIC version: 0011
.... register #02: 00000000
....... : arbitration: 00
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 003 03  0    0    0   0   0    1    1    31
01 003 03  0    0    0   0   0    1    1    39
02 000 00  1    0    0   0   0    0    0    00
03 003 03  0    0    0   0   0    1    1    41
04 003 03  0    0    0   0   0    1    1    49
05 000 00  1    0    0   0   0    0    0    00
06 003 03  0    0    0   0   0    1    1    51
07 003 03  0    0    0   0   0    1    1    59
08 003 03  0    0    0   0   0    1    1    61
09 003 03  1    1    0   1   0    1    1    69
0a 003 03  0    0    0   0   0    1    1    71
0b 003 03  0    0    0   0   0    1    1    79
0c 003 03  0    0    0   0   0    1    1    81
0d 003 03  0    0    0   0   0    1    1    89
0e 003 03  0    0    0   0   0    1    1    91
0f 003 03  0    0    0   0   0    1    1    99

IO APIC #15......
.... register #00: 0F000000
....... : physical APIC id: 0F
.... register #01: 000F0011
....... : max redirection entries: 000F
....... : IO APIC version: 0011
.... register #02: 0A000000
....... : arbitration: 0A
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 003 03  1    1    0   1   0    1    1    A1
01 003 03  1    1    0   1   0    1    1    A9
02 000 00  1    0    0   0   0    0    0    00
03 000 00  1    0    0   0   0    0    0    00
04 003 03  1    1    0   1   0    1    1    B1
05 000 00  1    0    0   0   0    0    0    00
06 000 00  1    0    0   0   0    0    0    00
07 000 00  1    0    0   0   0    0    0    00
08 000 00  1    0    0   0   0    0    0    00
09 000 00  1    0    0   0   0    0    0    00
0a 003 03  1    1    0   1   0    1    1    B9
0b 003 03  1    1    0   1   0    1    1    C1
0c 003 03  1    1    0   1   0    1    1    C9
0d 003 03  1    1    0   1   0    1    1    D1
0e 000 00  1    0    0   0   0    0    0    00
0f 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 1:10
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 1:0
IRQ17 -> 1:1
IRQ20 -> 1:4
IRQ27 -> 1:11
IRQ28 -> 1:12
IRQ29 -> 1:13
.................................... done.
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 996.7242 MHz.
..... host bus clock speed is 132.8964 MHz.
cpu: 0, clocks: 1328964, slice: 442988
CPU0
cpu: 1, clocks: 1328964, slice: 442988
CPU1
checking TSC synchronization across CPUs: passed.
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs
PCI: PCI BIOS revision 2.10 entry at 0xfd2fc, last bus=5
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Discovered peer bus 02
PCI->APIC IRQ transform: (B0,I2,P0) -> 27
PCI->APIC IRQ transform: (B0,I15,P0) -> 9
PCI->APIC IRQ transform: (B1,I4,P0) -> 16
PCI->APIC IRQ transform: (B1,I5,P0) -> 17
PCI->APIC IRQ transform: (B2,I3,P0) -> 28
PCI->APIC IRQ transform: (B2,I3,P1) -> 29
PCI->APIC IRQ transform: (B2,I5,P0) -> 20
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS not found.
mxt_scan_bios: enter
Starting kswapd v1.8
allocated 64 pages and 64 bhs reserved for the highmem bounces
VFS: Diskquotas version dquot_6.5.0 initialized
pty: 2048 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS MULTIPORT SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.10d
block: queued sectors max/low 681482kB/550410kB, 2048 slots per queue
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz PCI bus speed for PIO modes; override with idebus=xx
ServerWorks OSB4: IDE controller on PCI bus 00 dev 79
ServerWorks OSB4: chipset revision 0
ServerWorks OSB4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x0840-0x0847, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0x0848-0x084f, BIOS settings: hdc:pio, hdd:pio
hda: LG CD-ROM CRD-8484B, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide-floppy driver 0.97
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
ide-floppy driver 0.97
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 8192 buckets, 64Kbytes
TCP: Hash tables configured (established 262144 bind 65536)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 468k freed
VFS: Mounted root (ext2 filesystem).
SCSI subsystem driver Revision: 1.00
(scsi0) found at PCI 2/3/0
(scsi0) Wide Channel A, SCSI ID=7, 32/255 SCBs
(scsi0) Downloading sequencer code... 396 instructions downloaded
(scsi1) found at PCI 2/3/1
(scsi1) Wide Channel B, SCSI ID=7, 32/255 SCBs
(scsi1) Downloading sequencer code... 396 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.4/5.2.0
scsi1 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.4/5.2.0
Vendor: SEAGATE Model: DAT 06240-XXX Rev: 8160
Type: Sequential-Access ANSI SCSI revision: 03
scsi2 : IBM PCI ServeRAID 4.72.00
Vendor: IBM Model: SERVERAID Rev: 1.0
Type: Direct-Access ANSI SCSI revision: 01
Vendor: IBM Model: SERVERAID Rev: 1.0
Type: Direct-Access ANSI SCSI revision: 01
Vendor: IBM Model: SERVERAID Rev: 1.0
Type: Processor ANSI SCSI revision: 01
Vendor: IBM Model: CaVv3 S2 Rev: 0
Type: Processor ANSI SCSI revision: 02
Attached scsi disk sda at scsi2, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi2, channel 0, id 1, lun 0
SCSI device sda: 11464704 512-byte hdwr sectors (5870 MB)
Partition check:
sda: sda1
SCSI device sdb: 201818112 512-byte hdwr sectors (103331 MB)
sdb: sdb1 sdb2
Journalled Block Device driver loaded
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: sd(8,17): orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 966732
EXT3-fs: sd(8,17): 1 orphan inode deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
Freeing unused kernel memory: 240k freed
Adding Swap: 2040244k swap-space (priority -1)
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
usb-ohci.c: USB OHCI at membase 0xf897b000, IRQ 9
usb-ohci.c: usb-00:0f.2, ServerWorks OSB4/CSB5 OHCI USB Controller
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 2 ports detected
usb-ohci.c: v5.2:USB OHCI Host Controller Driver
EXT3 FS 2.4-0.9.8, 25 Aug 2001 on sd(8,17), internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.8, 25 Aug 2001 on sd(8,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
st: Version 20010812, bufsize 32768, wrt 30720, max init. bufs 4, s/g segs 16
Attached scsi tape st0 at scsi0, channel 0, id 2, lun 0
parport0: PC-style at 0x378 [PCSPP]
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
ip_conntrack (8191 buckets, 65528 max)
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin and others
eth0: Intel Corporation 82557 [Ethernet Pro 100], 00:03:47:42:3B:29, IRQ 16.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 711269-006, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x24c9f043).
Receiver lock-up workaround activated.
pcnet32_probe_pci: found device 0x001022.0x002000
ioaddr=0x002000 resource_flags=0x000101
eth%d: PCnet/FAST III 79C975 at 0x2000, 00 02 55 91 f3 9f
pcnet32: pcnet32_private lp=d50fe000 lp_dma_addr=0x150fe000 assigned IRQ 27.
pcnet32.c:v1.25kf 26.9.1999 tsbogend@alpha.franken.de
mtrr: your processor doesn't support write-combining
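
The BIOS-related complaints are scattered through the dmesg output above, so a small filter may make them easier to review in one place. What follows is only a minimal sketch in Python: the pattern list is an assumption drawn from the messages in this particular log (it is not an exhaustive set of SMP-related kernel messages), and the function name find_bios_warnings is hypothetical.

```python
import re

# Substrings copied from warnings that appear in the dmesg output above.
# Assumption: this list is illustrative only, not a complete catalogue.
PATTERNS = [
    "MP table in the EBDA can be UNSAFE",
    "BIOS bug",
    "inconsistent fixed MTRR settings",
    "reserved twice",
]


def find_bios_warnings(dmesg_text):
    """Return the lines of dmesg_text that contain any pattern, in order."""
    regex = re.compile("|".join(re.escape(p) for p in PATTERNS))
    return [
        line.strip()
        for line in dmesg_text.splitlines()
        if regex.search(line)
    ]


# A few lines taken from the log above, one message per line.
SAMPLE = """\
found SMP MP-table at 0009e140
hm, page 0009e000 reserved twice.
WARNING: MP table in the EBDA can be UNSAFE, contact linux-smp@vger.kernel.org if you experience SMP problems!
Processors: 2
BIOS bug, IO-APIC#1 ID is 15 in the MPC table!...
..MP-BIOS bug: 8254 timer not connected to IO-APIC
checking TSC synchronization across CPUs: passed.
mtrr: your CPUs had inconsistent fixed MTRR settings
"""

if __name__ == "__main__":
    for line in find_bios_warnings(SAMPLE):
        print(line)
```

The same function could be run on a saved copy of the full log, provided it has one message per line. A match only shows that the kernel noticed something odd in the BIOS tables; by itself it does not show that the lockups are SMP related.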