From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Kofler
Subject: Re: Aieee! CPU0 is toast...
Date: Tue, 30 Aug 2005 15:32:32 +0200
Message-ID: <1125408752.43145ff04a637@mail.devcon.cc>
References: <1125406535.431457479958d@mail.devcon.cc> <8e49d9a63629fddb747d4ab8ffcc5cd3@cl.cam.ac.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
In-Reply-To: <8e49d9a63629fddb747d4ab8ffcc5cd3@cl.cam.ac.uk>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Quoting Keir Fraser:

> Can you give full boot log of Xen crash? How quickly does the crash
> occur?

We have no serial device. The Xen kernel reports some kind of
"clearing/preparing" of the memory with a progress bar (....). Once that
finishes, it carries on, and maybe half a second later the kernel crashes.

>
> If you cannot get a serial log from the Xen boot, perhaps you can send
> the dmesg output from booting native Linux on the box.
>

Native booting:

Linux version 2.6.12-1.1398_FC4smp (bhcompile@tweety.build.redhat.com) (gcc version 4.0.0 20050519 (Red Hat 4.0.0-8)) #1 SMP Fri Jul 15 01:30:13 EDT 2005
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000005fffb000 (usable)
BIOS-e820: 000000005fffb000 - 0000000060000000 (ACPI data)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
639MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 0009c1d0
Using x86 segment limits to approximate NX protection
On node 0 totalpages: 393211
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 225280 pages, LIFO batch:31
HighMem zone: 163835 pages, LIFO batch:31
DMI 2.1 present.
Using APIC driver default
ACPI: RSDP (v000 IBM ) @ 0x000fdfd0
ACPI: RSDT (v001 IBM SERKQUAD 0x00001000 IBM 0x00000000) @ 0x5fffff80
ACPI: FADT (v001 IBM SERKQUAD 0x00001000 IBM 0x00000000) @ 0x5fffff00
ACPI: MADT (v001 IBM SERKQUAD 0x00001000 IBM 0x00000000) @ 0x5ffffe80
ACPI: DSDT (v001 IBM SERKQUAD 0x00001000 MSFT 0x0100000b) @ 0x00000000
ACPI: BIOS age (2000) fails cutoff (2001), acpi=force is required to enable ACPI
ACPI: Disabling ACPI support
Intel MultiProcessor Specification v1.4
Virtual Wire compatibility mode.
OEM ID: IBM ENSW Product ID: NF 5500 SMP APIC at: 0xFEE00000
Processor #3 6:7 APIC version 17
Processor #0 6:7 APIC version 17
I/O APIC #14 Version 17 at 0xFEC00000.
Enabling APIC mode: Flat. Using 1 I/O APICs
Processors: 2
Allocating PCI resources starting at 60000000 (gap: 60000000:9ec00000)
Built 1 zonelists
Kernel command line: ro root=/dev/VolGroup00/LogVol00
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Initializing CPU#0
CPU 0 irqstacks, hard=c0428000 soft=c0408000
PID hash table entries: 4096 (order: 12, 65536 bytes)
Detected 547.719 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1551812k/1572844k available (2078k kernel code, 19776k reserved, 768k data, 232k init, 655340k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 1081.34 BogoMIPS (lpj=540672)
Security Framework v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383f3ff 00000000 00000000 00000040 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
CPU0: Intel Pentium III (Katmai) stepping 03
Booting processor 1/0 eip 3000
CPU 1 irqstacks, hard=c0429000 soft=c0409000
Initializing CPU#1
Calibrating delay loop... 1093.63 BogoMIPS (lpj=546816)
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383f3ff 00000000 00000000 00000040 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel Pentium III (Katmai) stepping 03
Total of 2 processors activated (2174.97 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
checking if image is initramfs... it is
Freeing initrd memory: 1851k freed
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfd4fc, last bus=5
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
ACPI: Subsystem revision 20050309
ACPI: Interpreter disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI: disabled
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:00:0f.0
PCI: Discovered peer bus 01
PCI->APIC IRQ transform: 0000:00:03.0[A] -> IRQ 11
PCI->APIC IRQ transform: 0000:00:0e.0[A] -> IRQ 10
PCI->APIC IRQ transform: 0000:00:0f.0[A] -> IRQ 9
PCI->APIC IRQ transform: 0000:00:13.2[D] -> IRQ 10
PCI->APIC IRQ transform: 0000:01:01.0[A] -> IRQ 11
PCI->APIC IRQ transform: 0000:01:02.0[A] -> IRQ 9
IBM machine detected. Enabling interrupts during APM calls.
apm: BIOS not found.
audit: initializing netlink socket (disabled)
audit(1125352844.768:1): initialized
highmem bounce pool size: 64 pages
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux: Registering netfilter hooks
Initializing Cryptographic API
ksign: Installing public key data
Loading keyring
- Added public key 57652F2F3358E32D
- User ID: Red Hat, Inc. (Kernel Module GPG key)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Real Time Clock Driver v1.12
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: unable to determine aperture size.
agpgart: agp_backend_initialize() failed.
agpgart-serverworks: probe of 0000:00:00.0 failed with error -22
agpgart: unable to determine aperture size.
agpgart: agp_backend_initialize() failed.
agpgart-serverworks: probe of 0000:00:11.0 failed with error -22
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 76 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller at PCI slot 0000:00:13.1
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio
Probing IDE interface ide0...
hda: CRD-8400B, ATAPI CD/DVD-ROM drive
hda: Disabling (U)DMA for CRD-8400B (blacklisted)
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hda: ATAPI 40X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.20
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.01:USB HID core driver
mice: PS/2 mouse device common for all mice
md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27
NET: Registered protocol family 2
IP: routing cache hash table of 8192 buckets, 128Kbytes
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 7, 786432 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
CPU0 attaching sched-domain:
domain 0: span 00000001
groups: 00000001
domain 1: span 00000003
groups: 00000001 00000002
CPU1 attaching sched-domain:
domain 0: span 00000002
groups: 00000002
domain 1: span 00000003
groups: 00000002 00000001
Freeing unused kernel memory: 232k freed
input: AT Translated Set 2 keyboard on isa0060/serio0
SCSI subsystem initialized
ips 0000:00:03.0: Warning ! ! ! ServeRAID Version Mismatch
ips 0000:00:03.0: Bios = 7.00.14, Firmware = 2.88.13, Device Driver = 7.10.18
ips 0000:00:03.0: These levels should match to avoid possible compatibility problems.
logips2pp: Detected unknown logitech mouse model 1
ips 0000:01:02.0: Warning ! ! ! ServeRAID Version Mismatch
ips 0000:01:02.0: Bios = 7.00.14, Firmware = 6.10.24, Device Driver = 7.10.18
ips 0000:01:02.0: These levels should match to avoid possible compatibility problems.
scsi0 : IBM PCI ServeRAID 7.10.18 Build 731
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Processor ANSI SCSI revision: 02
scsi1 : IBM PCI ServeRAID 7.10.18 Build 731
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
SCSI device sda: 71094272 512-byte hdwr sectors (36400 MB)
SCSI device sda: drive cache: write through
SCSI device sda: 71094272 512-byte hdwr sectors (36400 MB)
SCSI device sda: drive cache: write through
sda: sda1 sda2
Attached scsi disk sda at scsi1, channel 0, id 0, lun 0
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Processor ANSI SCSI revision: 02
Vendor: SDR Model: GEM200 Rev: 2
Type: Processor ANSI SCSI revision: 02
input: PS/2 Logitech Mouse on isa0060/serio1
QLogic Fibre Channel HBA Driver
qla2200 0000:01:01.0: Found an ISP2200, irq 11, iobase 0xf881c000
qla2200 0000:01:01.0: Configuring PCI space...
qla2200 0000:01:01.0: Configure NVRAM parameters...
qla2200 0000:01:01.0: Verifying loaded RISC code...
qla2200 0000:01:01.0: LIP reset occured (f8f7).
qla2200 0000:01:01.0: Waiting for LIP to complete...
qla2200 0000:01:01.0: LIP occured (f8f7).
qla2200 0000:01:01.0: LOOP UP detected (1 Gbps).
qla2200 0000:01:01.0: Topology - (Loop), Host Loop address 0x7d
scsi2 : qla2xxx
qla2200 0000:01:01.0:
QLogic Fibre Channel HBA Driver: 8.00.02b5-k
QLogic QLA22xx -
ISP2200: PCI (33 MHz) @ 0000:01:01.0 hdma+, host#=2, fw=2.02.06 TP
Vendor: IBM Model: 3526 Rev: 0401
Type: Direct-Access ANSI SCSI revision: 03
Vendor: IBM Model: 3526 Rev: 0401
Type: Direct-Access ANSI SCSI revision: 03
device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel@redhat.com
SCSI device sdb: 1132462080 512-byte hdwr sectors (579821 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 1132462080 512-byte hdwr sectors (579821 MB)
SCSI device sdb: drive cache: write back
sdb: unknown partition table
Attached scsi disk sdb at scsi2, channel 0, id 0, lun 5
Vendor: IBM Model: 3526 Rev: 0401
Type: Direct-Access ANSI SCSI revision: 03
SCSI device sdc: 318767104 512-byte hdwr sectors (163209 MB)
SCSI device sdc: drive cache: write back
SCSI device sdc: 318767104 512-byte hdwr sectors (163209 MB)
SCSI device sdc: drive cache: write back
sdc: unknown partition table
Attached scsi disk sdc at scsi2, channel 0, id 0, lun 6
Vendor: IBM Model: 3526 Rev: 0401
Type: Direct-Access ANSI SCSI revision: 03
SCSI device sdd: 283115520 512-byte hdwr sectors (144955 MB)
SCSI device sdd: drive cache: write back
SCSI device sdd: 283115520 512-byte hdwr sectors (144955 MB)
SCSI device sdd: drive cache: write back
sdd: unknown partition table
Attached scsi disk sdd at scsi2, channel 0, id 0, lun 7
Vendor: IBM Model: Universal Xport Rev: 0401
Type: Direct-Access ANSI SCSI revision: 03
SCSI device sde: 40960 512-byte hdwr sectors (21 MB)
SCSI device sde: drive cache: write through
SCSI device sde: 40960 512-byte hdwr sectors (21 MB)
SCSI device sde: drive cache: write through
sde: unknown partition table
Attached scsi disk sde at scsi2, channel 0, id 0, lun 31
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
cfq: depth 4 reached, tagging now on
SELinux: Disabled at runtime.
SELinux: Unregistering netfilter hooks
Attached scsi generic sg0 at scsi0, channel 0, id 15, lun 0, type 3
Attached scsi generic sg1 at scsi1, channel 0, id 0, lun 0, type 0
Attached scsi generic sg2 at scsi1, channel 0, id 15, lun 0, type 3
Attached scsi generic sg3 at scsi1, channel 3, id 15, lun 0, type 3
Attached scsi generic sg4 at scsi2, channel 0, id 0, lun 0, type 0
Attached scsi generic sg5 at scsi2, channel 0, id 0, lun 5, type 0
Attached scsi generic sg6 at scsi2, channel 0, id 0, lun 6, type 0
Attached scsi generic sg7 at scsi2, channel 0, id 0, lun 7, type 0
Attached scsi generic sg8 at scsi2, channel 0, id 0, lun 31, type 0
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
pcnet32.c:v1.30j 29.04.2005 tsbogend@alpha.franken.de
pcnet32: PCnet/FAST 79C971 at 0x2180, 00 06 29 38 4f 39
tx_start_pt(0x0c00):~220 bytes, BCR18(9861):BurstWrEn BurstRdEn NoUFlow
SRAMSIZE=0x7f00, SRAM_BND=0x0800, assigned IRQ 10.
eth0: registered as PCnet/FAST 79C971
pcnet32: 1 cards_found.
piix4_smbus 0000:00:13.3: Found 0000:00:13.3 device
piix4_smbus 0000:00:13.3: IBM Laptop detected; this module may corrupt your serial eeprom! Refusing to load module!
piix4_smbus: probe of 0000:00:13.3 failed with error -1
USB Universal Host Controller Interface driver v2.2
uhci_hcd 0000:00:13.2: UHCI Host Controller
uhci_hcd 0000:00:13.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:13.2: irq 10, io base 0x0000ff00
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
EXT3 FS on dm-0, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 2031608k swap on /dev/VolGroup00/LogVol01. Priority:-1 extents:1
eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Bluetooth: Core ver 2.7
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.7
Bluetooth: L2CAP socket layer initialized
Bluetooth: RFCOMM ver 1.5
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
parport0: PC-style at 0x378 [PCSPP]
lp0: using parport0 (polling).
lp0: console ready
NET: Registered protocol family 10
Disabled Privacy Extensions on device c037b660(lo)
IPv6 over IPv4 tunneling driver
eth0: no IPv6 routers present
qla2200 0000:01:01.0: LIP reset occured (f8e1).
Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
in_atomic():1, irqs_disabled():1
[] target_block+0x0/0x23 [scsi_mod]
[] device_for_each_child+0x1d/0x6f
[] fc_remote_port_block+0x27/0x4c [scsi_transport_fc]
[] qla2x00_mark_all_devices_lost+0x54/0x58 [qla2xxx]
[] qla2x00_async_event+0x63b/0xa92 [qla2xxx]
[] ips_next+0x185/0x6ac [ips]
[] cdrom_timer_expiry+0x0/0x57
[] elv_queue_empty+0x12/0x1b
[] ide_do_request+0xb1/0x3a6
[] qla2100_intr_handler+0x13e/0x19c [qla2xxx]
[] handle_IRQ_event+0x2e/0x5a
[] __do_IRQ+0xbf/0x117
[] do_IRQ+0x4e/0x86
=======================
[] smp_apic_timer_interrupt+0xcc/0xce
[] common_interrupt+0x1a/0x20
[] default_idle+0x0/0x29
[] default_idle+0x26/0x29
[] cpu_idle+0x4e/0x63
[] start_kernel+0x176/0x1cd
[] unknown_bootoption+0x0/0x1b6
qla2200 0000:01:01.0: LOOP DOWN detected.
rport-2:0-2: blocked FC remote port time out: removing target
Synchronizing SCSI cache for disk sdb:
FAILED
status = 0, message = 00, host = 1, driver = 00
<5>Synchronizing SCSI cache for disk sdc:
FAILED
status = 0, message = 00, host = 1, driver = 00
<5>Synchronizing SCSI cache for disk sdd:
FAILED
status = 0, message = 00, host = 1, driver = 00
<4>qla2200 0000:01:01.0: Loop down - aborting ISP.
qla2200 0000:01:01.0: Performing ISP error recovery - ha= c1fec220.
qla2200 0000:01:01.0: Cable is unplugged...