* Re: random freezes B2000 running debian hppa lenny
       [not found] <49FB108B.9030803@ieee.org>
@ 2009-05-03 11:25 ` Grant Grundler
  2009-05-03 23:07   ` Dirk Van Hertem
  2009-05-15 22:40   ` Dirk Van Hertem
  0 siblings, 2 replies; 6+ messages in thread
From: Grant Grundler @ 2009-05-03 11:25 UTC (permalink / raw)
  To: Dirk Van Hertem; +Cc: linux-parisc

[ moved debian-hppa to BCC and added linux-parisc to CC ]

On Fri, May 01, 2009 at 05:08:59PM +0200, Dirk Van Hertem wrote:
> hello,
>
> My hppa box (B2000) experiences some problems: it freezes after a few
> (2-6) hours.
>
> On the led display I get the following error codes:
>
> FLT CBFC: SYS BD
> bus timeout
> OS HPMC bz err
> Bad OS HPMC len
> HPMC initiated

Hi Dirk,
Given the PCI listing you gave below, I agree the HPMC is likely caused by
the Promise SATA card.

> I don't have screen nor keyboard attached to it so debugging is a bit
> difficult.

AFAIK, the only way to debug this is to capture the HPMC dump.
The HPMC dump can only be captured via serial console. :(
(i.e. run "ser pim" at the PDC prompt from a terminal emulator like minicom)

>
> System:
> Debian lenny (stable), rather clean install
>
> $ uname -a
> Linux coulomb 2.6.26-2-parisc #1 Fri Mar 27 03:29:17 UTC 2009 parisc
> GNU/Linux
>
>
> The machine has the following lspci:
>
> 00:0c.0 Ethernet controller: Digital Equipment Corporation DECchip
> 21142/43 (rev 41)
> 00:0d.0 Multimedia audio controller: Analog Devices AD1889 sound chip
> 00:0e.0 IDE interface: National Semiconductor Corporation 87415/87560
> IDE (rev 03)
> 00:0e.1 Bridge: National Semiconductor Corporation 87560 Legacy I/O (rev 01)
> 00:0e.2 USB Controller: National Semiconductor Corporation USB
> Controller (rev 02)
> 00:0f.0 SCSI storage controller: LSI Logic / Symbios Logic 53c895a (rev 01)
> 01:00.0 3D controller: Hewlett-Packard Company Visualize FXe (rev 03)
> 01:04.0 Mass storage controller: Promise Technology, Inc.
PDC40718 (SATA
> 300 TX4) (rev 02)
>
> Of which the last entry might well be the problem.
>
> This is a promise card for my 3* 1TB sata disks. They seem to be
> initialized correctly, I made software raid with mdadm, but now I get
> the following:
>
> $ cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10]
> md0 : active raid5 sdc1[0] sde1[3] sdd1[1]
> 1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
> [=>...................] recovery = 5.8% (56795292/976759936)
> finish=451.9min speed=33920K/sec
>
> unused devices: <none>
>
> This keeps on running for a few hours, until the machine gets unresponsive.
>
> I don't think I get anything strange in syslog, kernel.log, messages,...
>
> So, my questions:
> * Is this sata promise card the fault?

Likely, yes.

> * on the web, the errors on the display seemed to indicate hardware
> problems, but any insight on that?

HW caught the error. Unless this happens w/o Promise card present,
I'm not inclined to believe this is a HW problem.

> * Best ways to solve this?

Capture "ser pim" output (aka PIM dump).

> * Did I forget something?
>
> Dirk
>
> PS: Next thing I'll try is to remove the promise card to see if that was
> the problem
> PPS: dmesg in attach

thanks!
grant

>
> --
> Dirk Van Hertem Dirk.VanHertem@esat.kuleuven.be
> Electrical Engineering Department http://www.esat.kuleuven.be/electa
> K.U. Leuven, ESAT-ELECTA tel: +32-16-32.18.95
> 10, Kasteelpark Arenberg, B-3001 Heverlee fax: +32-16-32.19.85

> [ 0.000000] Initializing cgroup subsys cpu
> [ 0.000000] Linux version 2.6.26-2-parisc (Debian 2.6.26-15) (dannf@debian.org) (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 Fri Mar 27 03:29:17 UTC 2009
> [ 0.000000] FP[0] enabled: Rev 1 Model 16
> [ 0.000000] The 32-bit Kernel has started...
> [ 0.000000] console [ttyB0] enabled
> [ 0.000000] Initialized PDC Console for debugging.
> [ 0.000000] Determining PDC firmware type: System Map.
> [ 0.000000] model 00005d00 00000481 00000000 00000002 782d3480 100000f0 00000008 000000b2 000000b2 > [ 0.000000] vers 00000301 > [ 0.000000] CPUID vers 17 rev 11 (0x0000022b) > [ 0.000000] capabilities 0x3 > [ 0.000000] model 9000/785/B2000 > [ 0.000000] Total Memory: 1024 MB > [ 0.000000] initrd: 4f8ce000-4ffedfb5 > [ 0.000000] initrd: reserving 3f8ce000-3ffedfb5 (mem_max 40000000) > [ 0.000000] On node 0 totalpages: 262144 > [ 0.000000] Normal zone: 2048 pages used for memmap > [ 0.000000] Normal zone: 0 pages reserved > [ 0.000000] Normal zone: 260096 pages, LIFO batch:31 > [ 0.000000] Movable zone: 0 pages used for memmap > [ 0.000000] LCD display at f05d0008,f05d0000 registered > [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260096 > [ 0.000000] Kernel command line: root=/dev/sdb5 HOME=/ console=ttyS0 TERM=vt102 palo_kernel=2/vmlinux > [ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes) > [17179569.184000] Console: colour dummy device 160x64 > [17179569.248000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) > [17179569.348000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) > [17179569.532000] Memory: 1026560k/1048576k available (1961k kernel code, 21792k reserved, 882k data, 224k init) > [17179569.660000] virtual kernel memory layout: > [17179569.660000] vmalloc : 0x00008000 - 0x0f000000 ( 239 MB) > [17179569.660000] memory : 0x10000000 - 0x50000000 (1024 MB) > [17179569.660000] .init : 0x10410000 - 0x10448000 ( 224 kB) > [17179569.660000] .data : 0x102ea6b4 - 0x103c7000 ( 882 kB) > [17179569.660000] .text : 0x10100000 - 0x102ea6b4 (1961 kB) > [17179570.112000] Calibrating delay loop... 798.72 BogoMIPS (lpj=1597440) > [17179570.204000] Security Framework initialized > [17179570.260000] SELinux: Disabled at boot. 
> [17179570.312000] Capability LSM initialized > [17179570.368000] Mount-cache hash table entries: 512 > [17179570.428000] Initializing cgroup subsys ns > [17179570.484000] Initializing cgroup subsys cpuacct > [17179570.548000] Initializing cgroup subsys devices > [17179570.612000] net_namespace: 648 bytes > [17179570.660000] NET: Registered protocol family 16 > [17179570.724000] EISA bus registered > [17179570.768000] Searching for devices... > [17179571.020000] Found devices: > [17179571.060000] 1. Astro BC Runway Port at 0xfed00000 [10] { 12, 0x0, 0x582, 0x0000b } > [17179571.160000] 2. Elroy PCI Bridge at 0xfed30000 [10/0] { 13, 0x0, 0x782, 0x0000a } > [17179571.264000] 3. Elroy PCI Bridge at 0xfed32000 [10/1] { 13, 0x0, 0x782, 0x0000a } > [17179571.364000] 4. Kazoo W+ at 0xfffa0000 [32] { 0, 0x0, 0x5d0, 0x00004 } > [17179571.452000] 5. Memory at 0xfed10200 [49] { 1, 0x0, 0x09d, 0x00009 } > [17179571.536000] Enabling regular chassis codes support v0.05 > [17179571.736000] CPU(s): 1 x PA8600 (PCX-W+) at 400.000000 MHz > [17179571.812000] Whole cache flush 115727 cycles, flushing 3440640 bytes 467519 cycles > [17179571.812000] Setting cache flush threshold to 1980 (1 CPUs online) > [17179571.920000] SBA found Astro 2.1 at 0xfed00000 > [17179571.984000] Elroy version TR4.0 (0x5) found at 0xfed30000 > [17179572.060000] PCI: Enabled native mode for NS87415 (pif=0x8f) > [17179572.140000] Elroy version TR4.0 (0x5) found at 0xfed32000 > [17179572.232000] powersw: Soft power switch at 0xf0400804 enabled. 
> [17179572.324000] NET: Registered protocol family 2 > [17179572.424000] IP route cache hash table entries: 32768 (order: 5, 131072 bytes) > [17179572.520000] TCP established hash table entries: 131072 (order: 8, 1048576 bytes) > [17179572.624000] TCP bind hash table entries: 65536 (order: 6, 262144 bytes) > [17179572.716000] TCP: Hash tables configured (established 131072 bind 65536) > [17179572.808000] TCP reno registered > [17179572.864000] NET: Registered protocol family 1 > [17179572.924000] checking if image is initramfs... it is > [17179575.908000] Freeing initrd memory: 7295k freed > [17179575.976000] Enabling PDC chassis warnings support v0.05 > [17179576.048000] unwind_init: start = 0x1035de10, end = 0x103850f0, entries = 10030 > [17179576.144000] WARNING: Out of order unwind entry! 1035f810 and 1035f820 > [17179576.232000] WARNING: Out of order unwind entry! 1035f820 and 1035f830 > [17179576.324000] audit: initializing netlink socket (disabled) > [17179576.396000] type=2000 audit(1241188257.212:1): initialized > [17179576.472000] VFS: Disk quotas dquot_6.5.1 > [17179576.528000] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) > [17179576.620000] msgmni has been set to 2019 > [17179576.676000] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254) > [17179576.776000] io scheduler noop registered > [17179576.832000] io scheduler anticipatory registered > [17179576.896000] io scheduler deadline registered > [17179576.956000] io scheduler cfq registered (default) > [17179577.020000] SuperIO: Found NS87560 Legacy I/O device at 0000:00:0e.1 (IRQ 67) > [17179577.120000] SuperIO: Serial port 1 at 0x3f8 > [17179577.176000] SuperIO: Serial port 2 at 0x2f8 > [17179577.236000] SuperIO: Parallel port at 0x378 > [17179577.292000] SuperIO: Floppy controller at 0x3f0 > [17179577.356000] SuperIO: ACPI at 0x7e0 > [17179577.404000] SuperIO: USB regulator enabled > [17179577.464000] PDC Stable Storage facility v0.30 > [17179577.836000] STI GSC/PCI 
core graphics driver Version 0.9a > [17179577.912000] sti 0000:01:00.0: enabling SERR and PARITY (0046 -> 0146) > [17179578.000000] STI PCI graphic ROM found at f4840000 (128 kB), fb at fb000000 (16 MB) > [17179578.228000] id 35acda16-9a02587, conforms to spec rev. 8.0c > [17179578.312000] graphics card name: HPA4982A > [17179578.368000] sticon: Initializing STI text console. > [17179578.436000] Console: switching to colour STI console 160x64 > [17179578.788000] stifb: 'HPA4982A' (id: 0x35acda16) not supported. > [17179578.876000] Generic RTC Driver v1.07 > [17179578.928000] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled > [17179579.036000] serial8250: ttyS0 at I/O 0x3f8 (irq = 3) is a 16550A > [17179579.116000] console handover: boot [ttyB0] -> real [ttyS0] > [17179579.192000] serial8250: ttyS1 at I/O 0x2f8 (irq = 4) is a 16550A > [17179579.276000] brd: module loaded > [17179579.316000] mice: PS/2 mouse device common for all mice > [17179579.384000] TCP cubic registered > [17179579.424000] NET: Registered protocol family 17 > [17179579.484000] registered taskstats version 1 > [17179579.536000] Freeing unused kernel memory: 224k freed > [17179580.348000] SCSI subsystem initialized > [17179581.996000] Linux Tulip driver version 1.1.15-NAPI (Feb 27, 2007) > [17179582.080000] tulip0: no phy info, aborting mtable build > [17179582.144000] tulip0: MII transceiver #1 config 1000 status 782d advertising 01e1. > [17179582.248000] eth0: Digital DS21142/43 Tulip rev 65 at MMIO 0xf4005000, 00:30:6e:08:0a:7f, IRQ 65. > [17179582.544000] usbcore: registered new interface driver usbfs > [17179582.612000] usbcore: registered new interface driver hub > [17179582.704000] sym0: <895a> rev 0x1 at pci 0000:00:0f.0 irq 68 > [17179582.804000] libata version 3.00 loaded. > [17179582.824000] sym0: PA-RISC Firmware, ID 7, Fast-40, LVD, parity checking > [17179582.904000] sym0: SCSI BUS has been reset. 
> [17179582.964000] scsi0 : sym-2.2.3 > [17179583.008000] usbcore: registered new device driver usb > [17179583.096000] ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver > [17179583.096000] ohci_hcd: block sizes: ed 64 td 64 > [17179583.100000] ohci_hcd 0000:00:0e.2: OHCI Host Controller > [17179583.168000] ohci_hcd 0000:00:0e.2: new USB bus registered, assigned bus number 1 > [17179583.260000] ohci_hcd 0000:00:0e.2: Using NSC SuperIO setup > [17179583.260000] ohci_hcd 0000:00:0e.2: created debug files > [17179583.260000] ohci_hcd 0000:00:0e.2: irq 1, io mem 0xf4004000 > [17179583.384000] ohci_hcd 0000:00:0e.2: OHCI controller state > [17179583.384000] ohci_hcd 0000:00:0e.2: OHCI 1.0, NO legacy support registers > [17179583.384000] ohci_hcd 0000:00:0e.2: control 0x083 HCFS=operational CBSR=3 > [17179583.384000] ohci_hcd 0000:00:0e.2: cmdstatus 0x00000 SOC=0 > [17179583.384000] ohci_hcd 0000:00:0e.2: intrstatus 0x00000000 > [17179583.384000] ohci_hcd 0000:00:0e.2: intrenable 0x8000001a MIE UE RD WDH > [17179583.384000] ohci_hcd 0000:00:0e.2: hcca frame #0000 > [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.a 00001003 POTPGT=0 NOCP NDP=3(3) > [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.b 000e0000 PPCM=000e DR=0000 > [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.status 00008000 DRWE > [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.portstatus [0] 0x00000100 PPS > [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.portstatus [1] 0x00000100 PPS > [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.portstatus [2] 0x00000100 PPS > [17179583.384000] usb usb1: default language 0x0409 > [17179583.384000] usb usb1: uevent > [17179583.384000] usb usb1: usb_probe_device > [17179583.384000] usb usb1: configuration #1 chosen from 1 choice > [17179583.452000] usb usb1: adding 1-0:1.0 (config #1, interface 0) > [17179583.452000] usb 1-0:1.0: uevent > [17179583.452000] hub 1-0:1.0: usb_probe_interface > [17179583.452000] hub 1-0:1.0: usb_probe_interface - 
got id > [17179583.452000] hub 1-0:1.0: USB hub found > [17179583.500000] hub 1-0:1.0: 3 ports detected > [17179583.552000] hub 1-0:1.0: standalone hub > [17179583.552000] hub 1-0:1.0: ganged power switching > [17179583.552000] hub 1-0:1.0: no over-current protection > [17179583.552000] hub 1-0:1.0: power on to power good time: 0ms > [17179583.552000] hub 1-0:1.0: local power source is good > [17179583.552000] hub 1-0:1.0: enabling power on all ports > [17179583.656000] usb usb1: New USB device found, idVendor=1d6b, idProduct=0001 > [17179583.740000] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 > [17179583.828000] usb usb1: Product: OHCI Host Controller > [17179583.892000] usb usb1: Manufacturer: Linux 2.6.26-2-parisc ohci_hcd > [17179583.968000] usb usb1: SerialNumber: 0000:00:0e.2 > [17179584.032000] sata_promise 0000:01:04.0: version 2.12 > [17179584.032000] scsi1 : sata_promise > [17179584.112000] Uniform Multi-Platform E-IDE driver > [17179584.168000] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx > [17179584.268000] scsi2 : sata_promise > [17179584.312000] scsi3 : sata_promise > [17179584.352000] scsi4 : sata_promise > [17179584.396000] ata1: SATA max UDMA/133 mmio m4096@0xf4820000 ata 0xf4820380 irq 70 > [17179584.484000] ata2: SATA max UDMA/133 mmio m4096@0xf4820000 ata 0xf4820280 irq 70 > [17179584.576000] ata3: SATA max UDMA/133 mmio m4096@0xf4820000 ata 0xf4820200 irq 70 > [17179584.668000] ata4: SATA max UDMA/133 mmio m4096@0xf4820000 ata 0xf4820300 irq 70 > [17179584.760000] hub 1-0:1.0: state 7 ports 3 chg 0000 evt 0000 > [17179584.792000] NS87415: IDE controller (0x100b:0x0002 rev 0x03) at PCI slot 0000:00:0e.0 > [17179584.892000] NS87415: 100% native mode on irq 7 > [17179584.948000] ide0: BM-DMA at 0x0900-0x0907 > [17179585.008000] ide1: BM-DMA at 0x0908-0x090f > [17179585.060000] Probing IDE interface ide0... 
> [17179585.184000] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > [17179585.284000] ata1.00: ATA-8: Hitachi HDT721010SLA360, ST6OA31B, max UDMA/133 > [17179585.368000] ata1.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 0/32) > [17179585.468000] ata1.00: configured for UDMA/133 > [17179585.692000] hda: FX4830T, ATAPI CD/DVD-ROM drive > [17179585.856000] ata2: SATA link down (SStatus 0 SControl 300) > [17179586.240000] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > [17179586.340000] ata3.00: ATA-8: Hitachi HDT721010SLA360, ST6OA31B, max UDMA/133 > [17179586.424000] ata3.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 0/32) > [17179586.524000] ata3.00: configured for UDMA/133 > [17179586.692000] Probing IDE interface ide1... > [17179586.896000] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) > [17179586.996000] ata4.00: ATA-8: Hitachi HDT721010SLA360, ST6OA31B, max UDMA/133 > [17179587.080000] ata4.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 0/32) > [17179587.180000] ata4.00: configured for UDMA/133 > [17179587.232000] scsi: waiting for bus probes to complete ... > [17179587.524000] ide0 at 0xe00-0xe07,0xd02 on irq 7 > [17179587.580000] ide1 at 0xb00-0xb07,0xa02 on irq 7 > [17179587.636000] scsi 0:0:5:0: Direct-Access IBM IC35L073UCDY10-0 S27T PQ: 0 ANSI: 3 > [17179587.736000] target0:0:5: tagged command queuing enabled, command queue depth 16. > [17179587.828000] target0:0:5: Beginning Domain Validation > [17179587.892000] target0:0:5: asynchronous > [17179587.944000] target0:0:5: wide asynchronous > [17179587.996000] target0:0:5: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 31) > [17179588.084000] target0:0:5: Domain Validation skipping write tests > [17179588.160000] target0:0:5: Ending Domain Validation > [17179588.220000] scsi 0:0:6:0: Direct-Access QUANTUM ATLAS5-9LVD HP04 PQ: 0 ANSI: 3 > [17179588.320000] target0:0:6: tagged command queuing enabled, command queue depth 16. 
> [17179588.412000] target0:0:6: Beginning Domain Validation > [17179588.476000] target0:0:6: asynchronous > [17179588.528000] target0:0:6: wide asynchronous > [17179588.580000] target0:0:6: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 31) > [17179588.668000] target0:0:6: Domain Validation skipping write tests > [17179588.744000] target0:0:6: Ending Domain Validation > [17179591.132000] scsi 1:0:0:0: Direct-Access ATA Hitachi HDT72101 ST6O PQ: 0 ANSI: 5 > [17179591.304000] scsi 3:0:0:0: Direct-Access ATA Hitachi HDT72101 ST6O PQ: 0 ANSI: 5 > [17179591.440000] Driver 'sd' needs updating - please use bus_type methods > [17179591.608000] scsi 4:0:0:0: Direct-Access ATA Hitachi HDT72101 ST6O PQ: 0 ANSI: 5 > [17179591.732000] sd 0:0:5:0: [sda] 143374650 512-byte hardware sectors (73408 MB) > [17179592.108000] sd 0:0:5:0: [sda] Write Protect is off > [17179592.168000] sd 0:0:5:0: [sda] Mode Sense: cb 00 00 08 > [17179592.296000] hda: ATAPI 48X CD-ROM drive, 128kB Cache > [17179592.356000] Uniform CD-ROM driver Revision: 3.20 > [17179592.524000] sd 0:0:5:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA > [17179592.640000] sd 0:0:5:0: [sda] 143374650 512-byte hardware sectors (73408 MB) > [17179592.728000] sd 0:0:5:0: [sda] Write Protect is off > [17179592.788000] sd 0:0:5:0: [sda] Mode Sense: cb 00 00 08 > [17179592.788000] sd 0:0:5:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA > [17179592.904000] sda: sda1 > [17179592.964000] sd 0:0:5:0: [sda] Attached SCSI disk > [17179593.044000] sd 0:0:6:0: [sdb] 17773524 512-byte hardware sectors (9100 MB) > [17179593.140000] sd 0:0:6:0: [sdb] Write Protect is off > [17179593.204000] sd 0:0:6:0: [sdb] Mode Sense: e3 00 10 08 > [17179593.232000] sd 0:0:6:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA > [17179593.348000] sd 0:0:6:0: [sdb] 17773524 512-byte hardware sectors (9100 MB) > [17179593.432000] sd 0:0:6:0: [sdb] Write Protect is off > 
[17179593.496000] sd 0:0:6:0: [sdb] Mode Sense: e3 00 10 08 > [17179593.496000] sd 0:0:6:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA > [17179593.600000] sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 sdb7 sdb8 > sdb4 > [17179593.724000] sd 0:0:6:0: [sdb] Attached SCSI disk > [17179593.792000] sd 1:0:0:0: [sdc] 1953525168 512-byte hardware sectors (1000205 MB) > [17179593.884000] sd 1:0:0:0: [sdc] Write Protect is off > [17179593.944000] sd 1:0:0:0: [sdc] Mode Sense: 00 3a 00 00 > [17179593.944000] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [17179594.056000] sd 1:0:0:0: [sdc] 1953525168 512-byte hardware sectors (1000205 MB) > [17179594.148000] sd 1:0:0:0: [sdc] Write Protect is off > [17179594.208000] sd 1:0:0:0: [sdc] Mode Sense: 00 3a 00 00 > [17179594.208000] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [17179594.320000] sdc: sdc1 > [17179594.364000] sd 1:0:0:0: [sdc] Attached SCSI disk > [17179594.432000] sd 3:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB) > [17179594.520000] sd 3:0:0:0: [sdd] Write Protect is off > [17179594.584000] sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 > [17179594.584000] sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [17179594.696000] sd 3:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB) > [17179594.784000] sd 3:0:0:0: [sdd] Write Protect is off > [17179594.844000] sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 > [17179594.844000] sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [17179594.956000] sdd: sdd1 > [17179595.004000] sd 3:0:0:0: [sdd] Attached SCSI disk > [17179595.068000] sd 4:0:0:0: [sde] 1953525168 512-byte hardware sectors (1000205 MB) > [17179595.160000] sd 4:0:0:0: [sde] Write Protect is off > [17179595.220000] sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00 > [17179595.220000] sd 4:0:0:0: [sde] Write cache: enabled, 
read cache: enabled, doesn't support DPO or FUA > [17179595.332000] sd 4:0:0:0: [sde] 1953525168 512-byte hardware sectors (1000205 MB) > [17179595.424000] sd 4:0:0:0: [sde] Write Protect is off > [17179595.484000] sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00 > [17179595.484000] sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [17179595.596000] sde: sde1 > [17179595.640000] sd 4:0:0:0: [sde] Attached SCSI disk > [17179596.980000] md: linear personality registered for level -1 > [17179597.088000] md: multipath personality registered for level -4 > [17179597.200000] md: raid0 personality registered for level 0 > [17179597.312000] md: raid1 personality registered for level 1 > [17179597.416000] xor: measuring software checksum speed > [17179597.496000] 8regs : 943.000 MB/sec > [17179597.568000] 8regs_prefetch: 933.000 MB/sec > [17179597.644000] 32regs : 960.000 MB/sec > [17179597.716000] 32regs_prefetch: 945.000 MB/sec > [17179597.772000] xor: using function: 32regs (960.000 MB/sec) > [17179597.848000] async_tx: api initialized (sync-only) > [17179598.012000] raid6: int32x1 183 MB/s > [17179598.128000] raid6: int32x2 230 MB/s > [17179598.244000] raid6: int32x4 272 MB/s > [17179598.360000] raid6: int32x8 213 MB/s > [17179598.408000] raid6: using algorithm int32x4 (272 MB/s) > [17179598.468000] md: raid6 personality registered for level 6 > [17179598.536000] md: raid5 personality registered for level 5 > [17179598.604000] md: raid4 personality registered for level 4 > [17179598.896000] md: raid10 personality registered for level 10 > [17179599.136000] md: bind<sdd1> > [17179599.172000] md: bind<sde1> > [17179599.208000] md: bind<sdc1> > [17179599.324000] raid5: device sdc1 operational as raid disk 0 > [17179599.392000] raid5: device sdd1 operational as raid disk 1 > [17179599.460000] raid5: allocated 3176kB for md0 > [17179599.512000] raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2 > [17179599.608000] RAID5 
conf printout: > [17179599.648000] --- rd:3 wd:2 > [17179599.684000] disk 0, o:1, dev:sdc1 > [17179599.728000] disk 1, o:1, dev:sdd1 > [17179600.672000] EXT3-fs: INFO: recovery required on readonly filesystem. > [17179600.752000] EXT3-fs: write access will be enabled during recovery. > [17179601.128000] kjournald starting. Commit interval 5 seconds > [17179601.196000] EXT3-fs: recovery complete. > [17179601.244000] EXT3-fs: mounted filesystem with ordered data mode. > [17179603.920000] udevd version 125 started > [17179604.124000] usb usb1: uevent > [17179604.124000] usb 1-0:1.0: uevent > [17179609.280000] Adding 489940k swap on /dev/sdb7. Priority:-1 extents:1 across:489940k > [17179609.764000] EXT3 FS on sdb5, internal journal > [17179610.860000] LASI 82596 driver - Revision: 1.30 > [17179610.980000] loop: module loaded > [17179616.900000] RAID5 conf printout: > [17179616.944000] --- rd:3 wd:2 > [17179616.980000] disk 0, o:1, dev:sdc1 > [17179617.024000] disk 1, o:1, dev:sdd1 > [17179617.072000] disk 2, o:1, dev:sde1 > [17179617.116000] md: recovery of RAID array md0 > [17179617.168000] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. > [17179617.240000] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery. > [17179617.360000] md: using 128k window, over a total of 976759936 blocks. > [17179632.056000] kjournald starting. Commit interval 5 seconds > [17179632.140000] EXT3 FS on sdb8, internal journal > [17179632.196000] EXT3-fs: mounted filesystem with ordered data mode. > [17179632.312000] kjournald starting. Commit interval 5 seconds > [17179632.392000] EXT3 FS on sdb4, internal journal > [17179632.448000] EXT3-fs: mounted filesystem with ordered data mode. > [17179632.556000] kjournald starting. Commit interval 5 seconds > [17179632.632000] EXT3 FS on sdb6, internal journal > [17179632.688000] EXT3-fs: mounted filesystem with ordered data mode. > [17179632.816000] kjournald starting. 
Commit interval 5 seconds > [17179632.920000] EXT3 FS on sda1, internal journal > [17179632.976000] EXT3-fs: mounted filesystem with ordered data mode. > [17179633.408000] kjournald starting. Commit interval 5 seconds > [17179633.512000] EXT3 FS on md0, internal journal > [17179633.564000] EXT3-fs: mounted filesystem with ordered data mode. > [17179639.712000] eth0: Setting full-duplex based on MII#1 link partner capability of 45e1. ^ permalink raw reply [flat|nested] 6+ messages in thread
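
Editorial aside: the recovery line in the quoted /proc/mdstat output can be sanity-checked with a few lines of Python. This is only a sketch, assuming Python 3 is available and that the progress line keeps the layout shown in the message; the names `RECOVERY_RE`, `parse_recovery` and `SAMPLE` are hypothetical and not part of mdadm or the kernel. It recomputes the completed percentage and the remaining time from the block counts and the reported speed.

```python
import re

# Matches an md recovery progress line such as:
#   recovery = 5.8% (56795292/976759936) finish=451.9min speed=33920K/sec
# \s* also tolerates the mail client's line wrap before "finish=".
RECOVERY_RE = re.compile(
    r"recovery\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s*"
    r"speed=(?P<speed>\d+)K/sec"
)

# Progress text copied from the message above (wrapped as in the mail).
SAMPLE = """md0 : active raid5 sdc1[0] sde1[3] sdd1[1]
1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
[=>...................] recovery = 5.8% (56795292/976759936)
finish=451.9min speed=33920K/sec
"""


def parse_recovery(mdstat_text):
    """Return recovery progress as a dict, or None if no recovery line is found."""
    match = RECOVERY_RE.search(mdstat_text)
    if match is None:
        return None
    done = int(match.group("done"))      # blocks (1 KiB) already rebuilt
    total = int(match.group("total"))    # blocks to rebuild in total
    speed = int(match.group("speed"))    # KiB per second
    remaining_min = (total - done) / speed / 60.0 if speed else None
    return {
        "percent": 100.0 * done / total,
        "remaining_min": remaining_min,
        "reported_finish_min": float(match.group("finish")),
    }


if __name__ == "__main__":
    print(parse_recovery(SAMPLE))
```

With the figures from the message, the recomputed remaining time lands within a minute of the reported finish estimate, so the rebuild was expected to take roughly 7.5 hours, longer than the 2-6 hours after which the machine froze.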
* Re: random freezes B2000 running debian hppa lenny
  2009-05-03 11:25 ` random freezes B2000 running debian hppa lenny Grant Grundler
@ 2009-05-03 23:07   ` Dirk Van Hertem
  2009-05-15 22:40   ` Dirk Van Hertem
  1 sibling, 0 replies; 6+ messages in thread
From: Dirk Van Hertem @ 2009-05-03 23:07 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc

Hello Grant,

Thanks for the reply.

Grant Grundler wrote:

[Dirk's problems with HP and promise card removed]

>> So, my questions:
>> * Is this sata promise card the fault?
>
> Likely, yes.
>
>> * on the web, the errors on the display seemed to indicate hardware
>> problems, but any insight on that?
>
> HW caught the error. Unless this happens w/o Promise card present,
> I'm not inclined to believe this is a HW problem.

I am now running it without the promise card; I'll keep you informed
whether it freezes or not. (I'll run a small program on it to give it
some cpu load, as the mdadm raid stuff also gave it quite some cpu load
and it still may be a HW fault.)

>
>> * Best ways to solve this?
>
> Capture "ser pim" output (aka PIM dump).

If that doesn't kill the machine in a day or so, I'll make sure I get
some output from the serial console with the promise card attached. My
VT220 seems to have died recently (do you happen to know what a black
screen and blinking "hold screen" and "lock" lights mean on a real
VT220?). In case I don't get the VT220 working, I'll connect using
minicom (if I can just find that nullmodem cable and the adapter of
that old laptop with a serial port :P).

Thanks for the help!

Dirk

ps: I could put my sata disks in an old i386 of course, but I like the
hp parisc better...

--
Dirk Van Hertem Dirk.VanHertem@esat.kuleuven.be
Electrical Engineering Department http://www.esat.kuleuven.be/electa
K.U. Leuven, ESAT-ELECTA tel: +32-16-32.18.95
10, Kasteelpark Arenberg, B-3001 Heverlee fax: +32-16-32.19.85

^ permalink raw reply	[flat|nested] 6+ messages in thread
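
Editorial aside: the "small program to give it some cpu load" mentioned above is not shown in the thread. One possible shape for it, purely as an illustration and assuming Python 3 on the test machine, is a busy loop that prints a timestamped heartbeat; the function name `burn_cpu` and the slice lengths are hypothetical choices. If the heartbeat is logged over ssh or the serial console, the last line printed gives a rough time of the freeze.

```python
import time


def burn_cpu(seconds):
    """Keep one CPU busy with integer arithmetic for about `seconds`.

    Returns the number of loop iterations performed.
    """
    deadline = time.monotonic() + seconds
    iterations = 0
    x = 1
    while time.monotonic() < deadline:
        # Cheap linear-congruential step; the value itself is irrelevant,
        # it only exists to keep the CPU occupied.
        x = (x * 1103515245 + 12345) % 2147483648
        iterations += 1
    return iterations


if __name__ == "__main__":
    # Hypothetical schedule: 60 slices of 60 s each (about one hour),
    # printing a heartbeat after every slice.
    for _ in range(60):
        count = burn_cpu(60)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), "iterations:", count, flush=True)
```

A single-threaded loop like this loads only one CPU, which is enough for the uniprocessor B2000 described in the dmesg output. It does not generate the PCI or disk traffic that the RAID rebuild did, so a clean run would point away from a CPU or memory fault without fully clearing the hardware.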
* Re: random freezes B2000 running debian hppa lenny
  2009-05-03 11:25 ` random freezes B2000 running debian hppa lenny Grant Grundler
  2009-05-03 23:07   ` Dirk Van Hertem
@ 2009-05-15 22:40   ` Dirk Van Hertem
  2009-05-18 3:04     ` Grant Grundler
  1 sibling, 1 reply; 6+ messages in thread
From: Dirk Van Hertem @ 2009-05-15 22:40 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-parisc

[-- Attachment #1: Type: text/plain, Size: 26590 bytes --]

Dear Grant,
Dear linux-parisc enthusiasts,

Sorry for the late reply: in the last week, my vt220 terminal died and
the power supply of my old (i386) server died as well, so I was busy
with other things.

I attached the "ser pim" output to this email; I hope it helps.

If you need any other information, please ask. I hope I'll be more
responsive next time...

Dirk

Grant Grundler wrote:
> [ moved debian-hppa to BCC and added linux-parisc to CC ]
>
> On Fri, May 01, 2009 at 05:08:59PM +0200, Dirk Van Hertem wrote:
>> hello,
>>
>> My hppa box (B2000) experiences some problems: it freezes after a few
>> (2-6) hours.
>>
>> On the led display I get the following error codes:
>>
>> FLT CBFC: SYS BD
>> bus timeout
>> OS HPMC bz err
>> Bad OS HPMC len
>> HPMC initiated
>
> Hi Dirk,
> Given the PCI listing you gave below, I agree the HPMC is likely caused by
> the Promise SATA card.
>
>> I don't have screen nor keyboard attached to it so debugging is a bit
>> difficult.
>
> AFAIK, the only way to debug this is to capture the HPMC dump.
> The HPMC dump can only be captured via serial console.
:( > (ie run "ser pim" at PDC prompt from a terminal emulator like minicom) > >> System: >> Debian lenny (stable), rather clean install >> >> $ uname -a >> Linux coulomb 2.6.26-2-parisc #1 Fri Mar 27 03:29:17 UTC 2009 parisc >> GNU/Linux >> >> >> The machines has the following lspci: >> >> 00:0c.0 Ethernet controller: Digital Equipment Corporation DECchip >> 21142/43 (rev 41) >> 00:0d.0 Multimedia audio controller: Analog Devices AD1889 sound chip >> 00:0e.0 IDE interface: National Semiconductor Corporation 87415/87560 >> IDE (rev 03) >> 00:0e.1 Bridge: National Semiconductor Corporation 87560 Legacy I/O (rev 01) >> 00:0e.2 USB Controller: National Semiconductor Corporation USB >> Controller (rev 02) >> 00:0f.0 SCSI storage controller: LSI Logic / Symbios Logic 53c895a (rev 01) >> 01:00.0 3D controller: Hewlett-Packard Company Visualize FXe (rev 03) >> 01:04.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA >> 300 TX4) (rev 02) >> >> Of which the last entry might well be the problem. >> >> This is a promise card for my 3* 1TB sata disks. They seem to be >> initialized correctly, I made software raid with mdadm, but not I get >> the following: >> >> $ cat /proc/mdstat >> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] >> [raid4] [raid10] >> md0 : active raid5 sdc1[0] sde1[3] sdd1[1] >> 1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_] >> [=>...................] recovery = 5.8% (56795292/976759936) >> finish=451.9min speed=33920K/sec >> >> unused devices: <none> >> >> This keeps on running for a few hours, until the machine gets unresponsive. >> >> I don't think I get anything strange in syslog, kernel.log, messages,... >> >> So, my questions: >> * Is this sata promise card the fault? > > Likely, yes. > >> * on the web, the errors on the display seemed to indicate hardware >> problems, but any insight on that? > > HW caught the error. 
Unless this happens w/o Promise care present, > I'm not inclined to believe this is a HW problem. > >> * Best ways to solve this? > > Capture "ser pim" output (aka PIM dump). > >> * Did I forget something? >> >> Dirk >> >> PS: Next thing I'll try is to remove the promise card to see if that was >> the problem >> PPS: dmesg in attach > > thanks! > grant > >> -- >> Dirk Van Hertem Dirk.VanHertem@esat.kuleuven.be >> Electrical Engineering Department http://www.esat.kuleuven.be/electa >> K.U. Leuven, ESAT-ELECTA tel: +32-16-32.18.95 >> 10, Kasteelpark Arenberg, B-3001 Heverlee fax: +32-16-32.19.85 > >> [ 0.000000] Initializing cgroup subsys cpu >> [ 0.000000] Linux version 2.6.26-2-parisc (Debian 2.6.26-15) (dannf@debian.org) (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 Fri Mar 27 03:29:17 UTC 2009 >> [ 0.000000] FP[0] enabled: Rev 1 Model 16 >> [ 0.000000] The 32-bit Kernel has started... >> [ 0.000000] console [ttyB0] enabled >> [ 0.000000] Initialized PDC Console for debugging. >> [ 0.000000] Determining PDC firmware type: System Map. >> [ 0.000000] model 00005d00 00000481 00000000 00000002 782d3480 100000f0 00000008 000000b2 000000b2 >> [ 0.000000] vers 00000301 >> [ 0.000000] CPUID vers 17 rev 11 (0x0000022b) >> [ 0.000000] capabilities 0x3 >> [ 0.000000] model 9000/785/B2000 >> [ 0.000000] Total Memory: 1024 MB >> [ 0.000000] initrd: 4f8ce000-4ffedfb5 >> [ 0.000000] initrd: reserving 3f8ce000-3ffedfb5 (mem_max 40000000) >> [ 0.000000] On node 0 totalpages: 262144 >> [ 0.000000] Normal zone: 2048 pages used for memmap >> [ 0.000000] Normal zone: 0 pages reserved >> [ 0.000000] Normal zone: 260096 pages, LIFO batch:31 >> [ 0.000000] Movable zone: 0 pages used for memmap >> [ 0.000000] LCD display at f05d0008,f05d0000 registered >> [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. 
Total pages: 260096 >> [ 0.000000] Kernel command line: root=/dev/sdb5 HOME=/ console=ttyS0 TERM=vt102 palo_kernel=2/vmlinux >> [ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes) >> [17179569.184000] Console: colour dummy device 160x64 >> [17179569.248000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) >> [17179569.348000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) >> [17179569.532000] Memory: 1026560k/1048576k available (1961k kernel code, 21792k reserved, 882k data, 224k init) >> [17179569.660000] virtual kernel memory layout: >> [17179569.660000] vmalloc : 0x00008000 - 0x0f000000 ( 239 MB) >> [17179569.660000] memory : 0x10000000 - 0x50000000 (1024 MB) >> [17179569.660000] .init : 0x10410000 - 0x10448000 ( 224 kB) >> [17179569.660000] .data : 0x102ea6b4 - 0x103c7000 ( 882 kB) >> [17179569.660000] .text : 0x10100000 - 0x102ea6b4 (1961 kB) >> [17179570.112000] Calibrating delay loop... 798.72 BogoMIPS (lpj=1597440) >> [17179570.204000] Security Framework initialized >> [17179570.260000] SELinux: Disabled at boot. >> [17179570.312000] Capability LSM initialized >> [17179570.368000] Mount-cache hash table entries: 512 >> [17179570.428000] Initializing cgroup subsys ns >> [17179570.484000] Initializing cgroup subsys cpuacct >> [17179570.548000] Initializing cgroup subsys devices >> [17179570.612000] net_namespace: 648 bytes >> [17179570.660000] NET: Registered protocol family 16 >> [17179570.724000] EISA bus registered >> [17179570.768000] Searching for devices... >> [17179571.020000] Found devices: >> [17179571.060000] 1. Astro BC Runway Port at 0xfed00000 [10] { 12, 0x0, 0x582, 0x0000b } >> [17179571.160000] 2. Elroy PCI Bridge at 0xfed30000 [10/0] { 13, 0x0, 0x782, 0x0000a } >> [17179571.264000] 3. Elroy PCI Bridge at 0xfed32000 [10/1] { 13, 0x0, 0x782, 0x0000a } >> [17179571.364000] 4. Kazoo W+ at 0xfffa0000 [32] { 0, 0x0, 0x5d0, 0x00004 } >> [17179571.452000] 5. 
Memory at 0xfed10200 [49] { 1, 0x0, 0x09d, 0x00009 } >> [17179571.536000] Enabling regular chassis codes support v0.05 >> [17179571.736000] CPU(s): 1 x PA8600 (PCX-W+) at 400.000000 MHz >> [17179571.812000] Whole cache flush 115727 cycles, flushing 3440640 bytes 467519 cycles >> [17179571.812000] Setting cache flush threshold to 1980 (1 CPUs online) >> [17179571.920000] SBA found Astro 2.1 at 0xfed00000 >> [17179571.984000] Elroy version TR4.0 (0x5) found at 0xfed30000 >> [17179572.060000] PCI: Enabled native mode for NS87415 (pif=0x8f) >> [17179572.140000] Elroy version TR4.0 (0x5) found at 0xfed32000 >> [17179572.232000] powersw: Soft power switch at 0xf0400804 enabled. >> [17179572.324000] NET: Registered protocol family 2 >> [17179572.424000] IP route cache hash table entries: 32768 (order: 5, 131072 bytes) >> [17179572.520000] TCP established hash table entries: 131072 (order: 8, 1048576 bytes) >> [17179572.624000] TCP bind hash table entries: 65536 (order: 6, 262144 bytes) >> [17179572.716000] TCP: Hash tables configured (established 131072 bind 65536) >> [17179572.808000] TCP reno registered >> [17179572.864000] NET: Registered protocol family 1 >> [17179572.924000] checking if image is initramfs... it is >> [17179575.908000] Freeing initrd memory: 7295k freed >> [17179575.976000] Enabling PDC chassis warnings support v0.05 >> [17179576.048000] unwind_init: start = 0x1035de10, end = 0x103850f0, entries = 10030 >> [17179576.144000] WARNING: Out of order unwind entry! 1035f810 and 1035f820 >> [17179576.232000] WARNING: Out of order unwind entry! 
1035f820 and 1035f830 >> [17179576.324000] audit: initializing netlink socket (disabled) >> [17179576.396000] type=2000 audit(1241188257.212:1): initialized >> [17179576.472000] VFS: Disk quotas dquot_6.5.1 >> [17179576.528000] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) >> [17179576.620000] msgmni has been set to 2019 >> [17179576.676000] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254) >> [17179576.776000] io scheduler noop registered >> [17179576.832000] io scheduler anticipatory registered >> [17179576.896000] io scheduler deadline registered >> [17179576.956000] io scheduler cfq registered (default) >> [17179577.020000] SuperIO: Found NS87560 Legacy I/O device at 0000:00:0e.1 (IRQ 67) >> [17179577.120000] SuperIO: Serial port 1 at 0x3f8 >> [17179577.176000] SuperIO: Serial port 2 at 0x2f8 >> [17179577.236000] SuperIO: Parallel port at 0x378 >> [17179577.292000] SuperIO: Floppy controller at 0x3f0 >> [17179577.356000] SuperIO: ACPI at 0x7e0 >> [17179577.404000] SuperIO: USB regulator enabled >> [17179577.464000] PDC Stable Storage facility v0.30 >> [17179577.836000] STI GSC/PCI core graphics driver Version 0.9a >> [17179577.912000] sti 0000:01:00.0: enabling SERR and PARITY (0046 -> 0146) >> [17179578.000000] STI PCI graphic ROM found at f4840000 (128 kB), fb at fb000000 (16 MB) >> [17179578.228000] id 35acda16-9a02587, conforms to spec rev. 8.0c >> [17179578.312000] graphics card name: HPA4982A >> [17179578.368000] sticon: Initializing STI text console. >> [17179578.436000] Console: switching to colour STI console 160x64 >> [17179578.788000] stifb: 'HPA4982A' (id: 0x35acda16) not supported. 
>> [17179578.876000] Generic RTC Driver v1.07 >> [17179578.928000] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled >> [17179579.036000] serial8250: ttyS0 at I/O 0x3f8 (irq = 3) is a 16550A >> [17179579.116000] console handover: boot [ttyB0] -> real [ttyS0] >> [17179579.192000] serial8250: ttyS1 at I/O 0x2f8 (irq = 4) is a 16550A >> [17179579.276000] brd: module loaded >> [17179579.316000] mice: PS/2 mouse device common for all mice >> [17179579.384000] TCP cubic registered >> [17179579.424000] NET: Registered protocol family 17 >> [17179579.484000] registered taskstats version 1 >> [17179579.536000] Freeing unused kernel memory: 224k freed >> [17179580.348000] SCSI subsystem initialized >> [17179581.996000] Linux Tulip driver version 1.1.15-NAPI (Feb 27, 2007) >> [17179582.080000] tulip0: no phy info, aborting mtable build >> [17179582.144000] tulip0: MII transceiver #1 config 1000 status 782d advertising 01e1. >> [17179582.248000] eth0: Digital DS21142/43 Tulip rev 65 at MMIO 0xf4005000, 00:30:6e:08:0a:7f, IRQ 65. >> [17179582.544000] usbcore: registered new interface driver usbfs >> [17179582.612000] usbcore: registered new interface driver hub >> [17179582.704000] sym0: <895a> rev 0x1 at pci 0000:00:0f.0 irq 68 >> [17179582.804000] libata version 3.00 loaded. >> [17179582.824000] sym0: PA-RISC Firmware, ID 7, Fast-40, LVD, parity checking >> [17179582.904000] sym0: SCSI BUS has been reset. 
>> [17179582.964000] scsi0 : sym-2.2.3 >> [17179583.008000] usbcore: registered new device driver usb >> [17179583.096000] ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver >> [17179583.096000] ohci_hcd: block sizes: ed 64 td 64 >> [17179583.100000] ohci_hcd 0000:00:0e.2: OHCI Host Controller >> [17179583.168000] ohci_hcd 0000:00:0e.2: new USB bus registered, assigned bus number 1 >> [17179583.260000] ohci_hcd 0000:00:0e.2: Using NSC SuperIO setup >> [17179583.260000] ohci_hcd 0000:00:0e.2: created debug files >> [17179583.260000] ohci_hcd 0000:00:0e.2: irq 1, io mem 0xf4004000 >> [17179583.384000] ohci_hcd 0000:00:0e.2: OHCI controller state >> [17179583.384000] ohci_hcd 0000:00:0e.2: OHCI 1.0, NO legacy support registers >> [17179583.384000] ohci_hcd 0000:00:0e.2: control 0x083 HCFS=operational CBSR=3 >> [17179583.384000] ohci_hcd 0000:00:0e.2: cmdstatus 0x00000 SOC=0 >> [17179583.384000] ohci_hcd 0000:00:0e.2: intrstatus 0x00000000 >> [17179583.384000] ohci_hcd 0000:00:0e.2: intrenable 0x8000001a MIE UE RD WDH >> [17179583.384000] ohci_hcd 0000:00:0e.2: hcca frame #0000 >> [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.a 00001003 POTPGT=0 NOCP NDP=3(3) >> [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.b 000e0000 PPCM=000e DR=0000 >> [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.status 00008000 DRWE >> [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.portstatus [0] 0x00000100 PPS >> [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.portstatus [1] 0x00000100 PPS >> [17179583.384000] ohci_hcd 0000:00:0e.2: roothub.portstatus [2] 0x00000100 PPS >> [17179583.384000] usb usb1: default language 0x0409 >> [17179583.384000] usb usb1: uevent >> [17179583.384000] usb usb1: usb_probe_device >> [17179583.384000] usb usb1: configuration #1 chosen from 1 choice >> [17179583.452000] usb usb1: adding 1-0:1.0 (config #1, interface 0) >> [17179583.452000] usb 1-0:1.0: uevent >> [17179583.452000] hub 1-0:1.0: usb_probe_interface >> [17179583.452000] hub 
1-0:1.0: usb_probe_interface - got id >> [17179583.452000] hub 1-0:1.0: USB hub found >> [17179583.500000] hub 1-0:1.0: 3 ports detected >> [17179583.552000] hub 1-0:1.0: standalone hub >> [17179583.552000] hub 1-0:1.0: ganged power switching >> [17179583.552000] hub 1-0:1.0: no over-current protection >> [17179583.552000] hub 1-0:1.0: power on to power good time: 0ms >> [17179583.552000] hub 1-0:1.0: local power source is good >> [17179583.552000] hub 1-0:1.0: enabling power on all ports >> [17179583.656000] usb usb1: New USB device found, idVendor=1d6b, idProduct=0001 >> [17179583.740000] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 >> [17179583.828000] usb usb1: Product: OHCI Host Controller >> [17179583.892000] usb usb1: Manufacturer: Linux 2.6.26-2-parisc ohci_hcd >> [17179583.968000] usb usb1: SerialNumber: 0000:00:0e.2 >> [17179584.032000] sata_promise 0000:01:04.0: version 2.12 >> [17179584.032000] scsi1 : sata_promise >> [17179584.112000] Uniform Multi-Platform E-IDE driver >> [17179584.168000] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx >> [17179584.268000] scsi2 : sata_promise >> [17179584.312000] scsi3 : sata_promise >> [17179584.352000] scsi4 : sata_promise >> [17179584.396000] ata1: SATA max UDMA/133 mmio m4096@0xf4820000 ata 0xf4820380 irq 70 >> [17179584.484000] ata2: SATA max UDMA/133 mmio m4096@0xf4820000 ata 0xf4820280 irq 70 >> [17179584.576000] ata3: SATA max UDMA/133 mmio m4096@0xf4820000 ata 0xf4820200 irq 70 >> [17179584.668000] ata4: SATA max UDMA/133 mmio m4096@0xf4820000 ata 0xf4820300 irq 70 >> [17179584.760000] hub 1-0:1.0: state 7 ports 3 chg 0000 evt 0000 >> [17179584.792000] NS87415: IDE controller (0x100b:0x0002 rev 0x03) at PCI slot 0000:00:0e.0 >> [17179584.892000] NS87415: 100% native mode on irq 7 >> [17179584.948000] ide0: BM-DMA at 0x0900-0x0907 >> [17179585.008000] ide1: BM-DMA at 0x0908-0x090f >> [17179585.060000] Probing IDE interface ide0... 
>> [17179585.184000] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) >> [17179585.284000] ata1.00: ATA-8: Hitachi HDT721010SLA360, ST6OA31B, max UDMA/133 >> [17179585.368000] ata1.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 0/32) >> [17179585.468000] ata1.00: configured for UDMA/133 >> [17179585.692000] hda: FX4830T, ATAPI CD/DVD-ROM drive >> [17179585.856000] ata2: SATA link down (SStatus 0 SControl 300) >> [17179586.240000] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) >> [17179586.340000] ata3.00: ATA-8: Hitachi HDT721010SLA360, ST6OA31B, max UDMA/133 >> [17179586.424000] ata3.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 0/32) >> [17179586.524000] ata3.00: configured for UDMA/133 >> [17179586.692000] Probing IDE interface ide1... >> [17179586.896000] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) >> [17179586.996000] ata4.00: ATA-8: Hitachi HDT721010SLA360, ST6OA31B, max UDMA/133 >> [17179587.080000] ata4.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 0/32) >> [17179587.180000] ata4.00: configured for UDMA/133 >> [17179587.232000] scsi: waiting for bus probes to complete ... >> [17179587.524000] ide0 at 0xe00-0xe07,0xd02 on irq 7 >> [17179587.580000] ide1 at 0xb00-0xb07,0xa02 on irq 7 >> [17179587.636000] scsi 0:0:5:0: Direct-Access IBM IC35L073UCDY10-0 S27T PQ: 0 ANSI: 3 >> [17179587.736000] target0:0:5: tagged command queuing enabled, command queue depth 16. >> [17179587.828000] target0:0:5: Beginning Domain Validation >> [17179587.892000] target0:0:5: asynchronous >> [17179587.944000] target0:0:5: wide asynchronous >> [17179587.996000] target0:0:5: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 31) >> [17179588.084000] target0:0:5: Domain Validation skipping write tests >> [17179588.160000] target0:0:5: Ending Domain Validation >> [17179588.220000] scsi 0:0:6:0: Direct-Access QUANTUM ATLAS5-9LVD HP04 PQ: 0 ANSI: 3 >> [17179588.320000] target0:0:6: tagged command queuing enabled, command queue depth 16. 
>> [17179588.412000] target0:0:6: Beginning Domain Validation >> [17179588.476000] target0:0:6: asynchronous >> [17179588.528000] target0:0:6: wide asynchronous >> [17179588.580000] target0:0:6: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 31) >> [17179588.668000] target0:0:6: Domain Validation skipping write tests >> [17179588.744000] target0:0:6: Ending Domain Validation >> [17179591.132000] scsi 1:0:0:0: Direct-Access ATA Hitachi HDT72101 ST6O PQ: 0 ANSI: 5 >> [17179591.304000] scsi 3:0:0:0: Direct-Access ATA Hitachi HDT72101 ST6O PQ: 0 ANSI: 5 >> [17179591.440000] Driver 'sd' needs updating - please use bus_type methods >> [17179591.608000] scsi 4:0:0:0: Direct-Access ATA Hitachi HDT72101 ST6O PQ: 0 ANSI: 5 >> [17179591.732000] sd 0:0:5:0: [sda] 143374650 512-byte hardware sectors (73408 MB) >> [17179592.108000] sd 0:0:5:0: [sda] Write Protect is off >> [17179592.168000] sd 0:0:5:0: [sda] Mode Sense: cb 00 00 08 >> [17179592.296000] hda: ATAPI 48X CD-ROM drive, 128kB Cache >> [17179592.356000] Uniform CD-ROM driver Revision: 3.20 >> [17179592.524000] sd 0:0:5:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA >> [17179592.640000] sd 0:0:5:0: [sda] 143374650 512-byte hardware sectors (73408 MB) >> [17179592.728000] sd 0:0:5:0: [sda] Write Protect is off >> [17179592.788000] sd 0:0:5:0: [sda] Mode Sense: cb 00 00 08 >> [17179592.788000] sd 0:0:5:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA >> [17179592.904000] sda: sda1 >> [17179592.964000] sd 0:0:5:0: [sda] Attached SCSI disk >> [17179593.044000] sd 0:0:6:0: [sdb] 17773524 512-byte hardware sectors (9100 MB) >> [17179593.140000] sd 0:0:6:0: [sdb] Write Protect is off >> [17179593.204000] sd 0:0:6:0: [sdb] Mode Sense: e3 00 10 08 >> [17179593.232000] sd 0:0:6:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA >> [17179593.348000] sd 0:0:6:0: [sdb] 17773524 512-byte hardware sectors (9100 MB) >> [17179593.432000] sd 0:0:6:0: 
[sdb] Write Protect is off >> [17179593.496000] sd 0:0:6:0: [sdb] Mode Sense: e3 00 10 08 >> [17179593.496000] sd 0:0:6:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA >> [17179593.600000] sdb: sdb1 sdb2 sdb3 < sdb5 sdb6 sdb7 sdb8 > sdb4 >> [17179593.724000] sd 0:0:6:0: [sdb] Attached SCSI disk >> [17179593.792000] sd 1:0:0:0: [sdc] 1953525168 512-byte hardware sectors (1000205 MB) >> [17179593.884000] sd 1:0:0:0: [sdc] Write Protect is off >> [17179593.944000] sd 1:0:0:0: [sdc] Mode Sense: 00 3a 00 00 >> [17179593.944000] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA >> [17179594.056000] sd 1:0:0:0: [sdc] 1953525168 512-byte hardware sectors (1000205 MB) >> [17179594.148000] sd 1:0:0:0: [sdc] Write Protect is off >> [17179594.208000] sd 1:0:0:0: [sdc] Mode Sense: 00 3a 00 00 >> [17179594.208000] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA >> [17179594.320000] sdc: sdc1 >> [17179594.364000] sd 1:0:0:0: [sdc] Attached SCSI disk >> [17179594.432000] sd 3:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB) >> [17179594.520000] sd 3:0:0:0: [sdd] Write Protect is off >> [17179594.584000] sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 >> [17179594.584000] sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA >> [17179594.696000] sd 3:0:0:0: [sdd] 1953525168 512-byte hardware sectors (1000205 MB) >> [17179594.784000] sd 3:0:0:0: [sdd] Write Protect is off >> [17179594.844000] sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 >> [17179594.844000] sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA >> [17179594.956000] sdd: sdd1 >> [17179595.004000] sd 3:0:0:0: [sdd] Attached SCSI disk >> [17179595.068000] sd 4:0:0:0: [sde] 1953525168 512-byte hardware sectors (1000205 MB) >> [17179595.160000] sd 4:0:0:0: [sde] Write Protect is off >> [17179595.220000] sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00 >> 
[17179595.220000] sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA >> [17179595.332000] sd 4:0:0:0: [sde] 1953525168 512-byte hardware sectors (1000205 MB) >> [17179595.424000] sd 4:0:0:0: [sde] Write Protect is off >> [17179595.484000] sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00 >> [17179595.484000] sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA >> [17179595.596000] sde: sde1 >> [17179595.640000] sd 4:0:0:0: [sde] Attached SCSI disk >> [17179596.980000] md: linear personality registered for level -1 >> [17179597.088000] md: multipath personality registered for level -4 >> [17179597.200000] md: raid0 personality registered for level 0 >> [17179597.312000] md: raid1 personality registered for level 1 >> [17179597.416000] xor: measuring software checksum speed >> [17179597.496000] 8regs : 943.000 MB/sec >> [17179597.568000] 8regs_prefetch: 933.000 MB/sec >> [17179597.644000] 32regs : 960.000 MB/sec >> [17179597.716000] 32regs_prefetch: 945.000 MB/sec >> [17179597.772000] xor: using function: 32regs (960.000 MB/sec) >> [17179597.848000] async_tx: api initialized (sync-only) >> [17179598.012000] raid6: int32x1 183 MB/s >> [17179598.128000] raid6: int32x2 230 MB/s >> [17179598.244000] raid6: int32x4 272 MB/s >> [17179598.360000] raid6: int32x8 213 MB/s >> [17179598.408000] raid6: using algorithm int32x4 (272 MB/s) >> [17179598.468000] md: raid6 personality registered for level 6 >> [17179598.536000] md: raid5 personality registered for level 5 >> [17179598.604000] md: raid4 personality registered for level 4 >> [17179598.896000] md: raid10 personality registered for level 10 >> [17179599.136000] md: bind<sdd1> >> [17179599.172000] md: bind<sde1> >> [17179599.208000] md: bind<sdc1> >> [17179599.324000] raid5: device sdc1 operational as raid disk 0 >> [17179599.392000] raid5: device sdd1 operational as raid disk 1 >> [17179599.460000] raid5: allocated 3176kB for md0 >> [17179599.512000] raid5: 
raid level 5 set md0 active with 2 out of 3 devices, algorithm 2 >> [17179599.608000] RAID5 conf printout: >> [17179599.648000] --- rd:3 wd:2 >> [17179599.684000] disk 0, o:1, dev:sdc1 >> [17179599.728000] disk 1, o:1, dev:sdd1 >> [17179600.672000] EXT3-fs: INFO: recovery required on readonly filesystem. >> [17179600.752000] EXT3-fs: write access will be enabled during recovery. >> [17179601.128000] kjournald starting. Commit interval 5 seconds >> [17179601.196000] EXT3-fs: recovery complete. >> [17179601.244000] EXT3-fs: mounted filesystem with ordered data mode. >> [17179603.920000] udevd version 125 started >> [17179604.124000] usb usb1: uevent >> [17179604.124000] usb 1-0:1.0: uevent >> [17179609.280000] Adding 489940k swap on /dev/sdb7. Priority:-1 extents:1 across:489940k >> [17179609.764000] EXT3 FS on sdb5, internal journal >> [17179610.860000] LASI 82596 driver - Revision: 1.30 >> [17179610.980000] loop: module loaded >> [17179616.900000] RAID5 conf printout: >> [17179616.944000] --- rd:3 wd:2 >> [17179616.980000] disk 0, o:1, dev:sdc1 >> [17179617.024000] disk 1, o:1, dev:sdd1 >> [17179617.072000] disk 2, o:1, dev:sde1 >> [17179617.116000] md: recovery of RAID array md0 >> [17179617.168000] md: minimum _guaranteed_ speed: 1000 KB/sec/disk. >> [17179617.240000] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery. >> [17179617.360000] md: using 128k window, over a total of 976759936 blocks. >> [17179632.056000] kjournald starting. Commit interval 5 seconds >> [17179632.140000] EXT3 FS on sdb8, internal journal >> [17179632.196000] EXT3-fs: mounted filesystem with ordered data mode. >> [17179632.312000] kjournald starting. Commit interval 5 seconds >> [17179632.392000] EXT3 FS on sdb4, internal journal >> [17179632.448000] EXT3-fs: mounted filesystem with ordered data mode. >> [17179632.556000] kjournald starting. 
Commit interval 5 seconds >> [17179632.632000] EXT3 FS on sdb6, internal journal >> [17179632.688000] EXT3-fs: mounted filesystem with ordered data mode. >> [17179632.816000] kjournald starting. Commit interval 5 seconds >> [17179632.920000] EXT3 FS on sda1, internal journal >> [17179632.976000] EXT3-fs: mounted filesystem with ordered data mode. >> [17179633.408000] kjournald starting. Commit interval 5 seconds >> [17179633.512000] EXT3 FS on md0, internal journal >> [17179633.564000] EXT3-fs: mounted filesystem with ordered data mode. >> [17179639.712000] eth0: Setting full-duplex based on MII#1 link partner capability of 45e1. > > -- Dirk Van Hertem Dirk.VanHertem@esat.kuleuven.be Electrical Engineering Department http://www.esat.kuleuven.be/electa K.U. Leuven, ESAT-ELECTA tel: +32-16-32.18.95 10, Kasteelpark Arenberg, B-3001 Heverlee fax: +32-16-32.19.85 [-- Attachment #2: ser_pim.txt --] [-- Type: text/plain, Size: 7717 bytes --] Main Menu: Enter command > ser pim PROCESSOR PIM INFORMATION ----------------- Processor 0 HPMC Information ------------------ Timestamp = Fri May 15 20:19:22 GMT 2009 (20:09:05:15:20:19:22) HPMC Chassis Codes = 2cbf0 2500b 2cbf2 2cbfc General Registers 0 - 31 00-03 0000000000000000 000000001021d000 00000000000fca38 000000004df4c074 04-07 0000000000000000 000000004df4d380 000000000008073c 000000004df4c000 08-11 0000000000102598 0000000000013000 0000000000000000 0000000000013000 12-15 0000000000000000 0000000000000005 00000000001ca06c 00000000f0400004 16-19 000000004e1f8540 00000000f000017c 00000000f0000174 000000004df4c000 20-23 0000000000000001 0000000000066004 0000000000066000 000000004f497b30 24-27 0000000000000001 ffffffff80000000 000000004df4c074 00000000103850f0 28-31 0000000001000000 0000000000066380 000000004e1f8bc0 0000000000000004 <Press any key to continue (q to quit)> Control Registers 0 - 31 00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000 04-07 0000000000000000 0000000000000000 0000000000000000 
0000000000000000 08-11 000000000000003a 0000000000000000 00000000000000c0 0000000000000001 12-15 0000000000000000 0000000000000000 000000000010d000 00000000fe000000 16-19 000002aeeacf3b1a 0000000000000000 0000000000070464 000000000ff6009c 20-23 00000000a627ffd2 0000000008066004 000000ff0004ff0e 0000000080000000 24-27 00000000003c7000 000000003e402000 0000000000044021 00000000f0412000 28-31 0000000055555555 0000000055555555 000000004e1f8000 0000000011111111 Space Registers 0 - 7 00-03 00000000 00000000 00000000 0000001d 04-07 00000000 00000000 00000000 00000000 <Press any key to continue (q to quit)> IIA Space = 0x0000000000000000 IIA Offset = 0x0000000000070468 Check Type = 0x20000000 CPU State = 0x9e000004 Cache Check = 0x00000000 TLB Check = 0x00000000 Bus Check = 0x0030103b Assists Check = 0x00000000 Assist State = 0x00000000 Path Info = 0x00000000 System Responder Address = 0x000000fff4820004 System Requestor Address = 0xfffffffffffa0000 Floating-Point Registers 0 - 31 00-03 0000001f00000000 0000000000000000 0000000000000000 0000000000000000 04-07 7ff7ffffffffffff 41d25c49fb800000 000000058c000000 7ff7ffffffffffff 08-11 000000000000fe9c 000000024f415bc0 4f415bc800000000 1056c58000000003 12-15 5555555555555555 5555555555555555 5555555555555555 5555555555555555 16-19 5555555555555555 5555555555555555 5555555555555555 5555555555555555 20-23 5555555555555555 5555555555555555 0000008099999e4f 003a2c6a00000000 24-27 0000000000000000 000000001d163500 00001c3e103d2b48 1022b1c810263260 28-31 ffffffff00001c3e 103990b01018c6ec 103990b000000190 4f42428010115ed0 <Press any key to continue (q to quit)> '9000/785 B,C,J Workstation Unarchitected (per-CPU)', rev 1, 140 bytes: Check Summary = 0xcb81041008000000 Available Memory = 0x0000000040000000 CPU Diagnose Register 2 = 0x0301000000000004 CPU Status Register 0 = 0x2420c20000000000 CPU Status Register 1 = 0x8002000000000000 SADD LOG = 0xc100f0fff4820004 Read Short LOG = 0xc1a0f0fff4820004 ERROR_STATUS = 0x0000000000100010 
MEM_ADDR = 0x000001ff3fffffff MEM_SYND = 0x0000000000000000 MEM_ADDR_CORR = 0x000001ff3fffffff MEM_SYND_CORR = 0x0000000000000000 RUN_DATA_HIGH = 0xc1bff0fffed08040 RUN_DATA_LOW = 0xc1bff0fffed08040 RUN_CTRL = 0x0000021c00001418 RUN_ADDR = 0xc1bff0fffed08040 System Responder Path = 0x00ffffff0a010400 HPMC PIM Analysis Information: Timestamp = Fri May 15 20:19:22 GMT 2009 (20:09:05:15:20:19:22) '9000/785 B,C,J Workstation HPMC PIM Analysis (per-CPU)', rev 0, 1304 bytes: A Data I/O Fetch Timeout occurred while CPU 0 was requesting information from a device at the path 10/1/4/0 (PCI slot 4). Memory/IO Controller Error Analysis Information: The Memory/IO Controller only observed the Broadcast Error. It did not log any additional information about the HPMC. <Press any key to continue (q to quit)> ----------------- Processor 0 LPMC Information ------------------ Check Type = 0x00000000 I/D Cache Parity Info = 0x00000000 Cache Check = 0x00000000 TLB Check = 0x00000000 Bus Check = 0x00000000 Assists Check = 0x00000000 Assist State = 0x00000000 Path Info = 0x00000000 System Responder Address = 0x0000000000000000 System Requestor Address = 0x0000000000000000 ----------------- Processor 0 TOC Information ------------------- General Registers 0 - 31 00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000 04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 08-11 0000000000000000 0000000000000000 0000000000000000 0000000000000000 12-15 0000000000000000 0000000000000000 0000000000000000 0000000000000000 16-19 0000000000000000 0000000000000000 0000000000000000 0000000000000000 20-23 0000000000000000 0000000000000000 0000000000000000 0000000000000000 24-27 0000000000000000 0000000000000000 0000000000000000 0000000000000000 28-31 0000000000000000 0000000000000000 0000000000000000 0000000000000000 <Press any key to continue (q to quit)> <Press any key to continue (q to quit)> Control Registers 0 - 31 00-03 0000000000000000 0000000000000000 
0000000000000000 0000000000000000 04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 08-11 0000000000000000 0000000000000000 0000000000000000 0000000000000000 12-15 0000000000000000 0000000000000000 0000000000000000 0000000000000000 16-19 0000000000000000 0000000000000000 0000000000000000 0000000000000000 20-23 0000000000000000 0000000000000000 0000000000000000 0000000000000000 24-27 0000000000000000 0000000000000000 0000000000000000 0000000000000000 28-31 0000000000000000 0000000000000000 0000000000000000 0000000000000000 Space Registers 0 - 7 00-03 00000000 00000000 00000000 00000000 04-07 00000000 00000000 00000000 00000000 IIA Space = 0x0000000000000000 IIA Offset = 0x0000000000000000 CPU State = 0x00000000 <Press any key to continue (q to quit)> Memory Error Log Information: Timestamp = Fri May 15 20:19:22 GMT 2009 (20:09:05:15:20:19:22) '9000/785 B,C,J Workstation Memory Error Log', rev 0, 64 bytes: No memory errors logged I/O Module Error Log Information: Timestamp = Fri May 15 20:19:22 GMT 2009 (20:09:05:15:20:19:22) '9000/785 B,C,J Workstation IO Error Log', rev 0, 228 bytes: Rope Word1 Word2 Word3 ------ ------------ ------------ 0 0x00000000 0x0e0cc009 0x00000000fed30048 1 ---------- 0x1e0cc2a9 ------------------ 2 ---------- 0x2e0cc009 ------------------ 3 ---------- 0x3e0cc009 ------------------ 4 ---------- 0x4e0cc009 ------------------ 5 ---------- 0x5e0cc009 ------------------ 6 ---------- 0x6e0cc009 ------------------ 7 ---------- 0x7e0cc009 ------------------ Main Menu: Enter command > ^ permalink raw reply [flat|nested] 6+ messages in thread
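The address fields in the PIM dump above can be lined up with the base addresses printed in the dmesg output earlier in the same message. What follows is only a minimal sketch of that bookkeeping, not a firmware-documented decoding. It assumes that physical addresses are 40 bits wide, that the leading 0xff byte of an I/O address is just the extension of the 32-bit address Linux prints, and that "nearest listed base at or below the address" is a fair first guess at the owner, since neither log gives region sizes. The names low_bits, regions and nearest_base are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Keep only the low `bits` bits of a value logged in the PIM dump. */
static uint64_t low_bits(uint64_t value, unsigned bits)
{
    return bits >= 64 ? value : value & ((UINT64_C(1) << bits) - 1);
}

/* Base addresses copied from the dmesg output in this thread.
 * Region sizes are not logged, so only the bases are recorded. */
struct region {
    const char *name;
    uint32_t base;
};

static const struct region regions[] = {
    { "Astro BC Runway Port",       0xfed00000u },
    { "Memory",                     0xfed10200u },
    { "Elroy PCI Bridge [10/0]",    0xfed30000u },
    { "Elroy PCI Bridge [10/1]",    0xfed32000u },
    { "Kazoo W+ (CPU)",             0xfffa0000u },
    { "ohci_hcd io mem",            0xf4004000u },
    { "tulip eth0 MMIO",            0xf4005000u },
    { "sata_promise mmio (m4096)",  0xf4820000u },
};

/* Return the listed region with the highest base that is still <= addr,
 * or NULL if every listed base is above addr. This is a heuristic:
 * without sizes it cannot prove that addr really belongs to the region. */
static const struct region *nearest_base(uint32_t addr)
{
    const struct region *best = NULL;
    size_t i;

    for (i = 0; i < sizeof regions / sizeof regions[0]; i++) {
        if (regions[i].base <= addr &&
            (best == NULL || regions[i].base > best->base))
            best = &regions[i];
    }
    return best;
}
```

Under those assumptions, RUN_ADDR 0xc1bff0fffed08040 reduces to 0xfffed08040, whose low 32 bits (0xfed08040) sit 0x8040 above the Astro base. System Responder Address 0x000000fff4820004 lands 4 bytes into the sata_promise MMIO window at 0xf4820000, and System Requestor Address 0xfffffffffffa0000 matches the Kazoo W+ CPU at 0xfffa0000.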
* Re: random freezes B2000 running debian hppa lenny
  2009-05-15 22:40 ` Dirk Van Hertem
@ 2009-05-18 3:04   ` Grant Grundler
  2009-05-18 9:34     ` Dirk Van Hertem
  0 siblings, 1 reply; 6+ messages in thread
From: Grant Grundler @ 2009-05-18 3:04 UTC (permalink / raw)
  To: Dirk Van Hertem; +Cc: Grant Grundler, linux-parisc

On Sat, May 16, 2009 at 12:40:31AM +0200, Dirk Van Hertem wrote:
> Dear Grant,
> Dear linux-parisc enthusiasts,
>
> Sorry for the late reply: in the last week, my vt220 terminal died and
> the power supply of my old (i386) server died as well, so I was busy
> with other things.

No problem.

> I attached the "ser pim" output to this email, I hope it helps. If you
> need any other information, please ask, I hope I'll be more responsive
> next time...

HPMC Chassis Codes = 2cbf0 2500b 2cbf2 2cbfc

Looking at:
    ftp://ftp.parisc-linux.org/docs/platforms/A2375-90004.pdf

CBF0 HPMC handling initiated.
CBF2 Invalid length for OS HPMC handler
CBFC Branch to OS HPMC failed

This just means the Linux HPMC handler didn't get called. Hrm. This worked
once upon a time and I thought it got fixed 6-8 months ago.

Next thing I look at is:
    RUN_ADDR = 0xc1bff0fffed08040

So whatever is at 0xfffed08040 (physical addresses are 40 bits)
was either the victim or the culprit. Often this is an MMIO BAR
plus some offset (probably 0x40). I suggest looking in the
controller driver for that offset and where it's used in the
initialization.

    System Responder Path = 0x00ffffff0a010400

This is supposed to match the HPA (Host Phys Address) of one of the
devices that is listed at the beginning of the parisc-linux boot.
I'm not sure it's accurate though.

And then the last part of the PIM that's interesting basically confirms
what we have been guessing:

'9000/785 B,C,J Workstation HPMC PIM Analysis (per-CPU)', rev 0, 1304 bytes:

    A Data I/O Fetch Timeout occurred while CPU 0 was
    requesting information from a device at the path 10/1/4/0 (PCI slot 4).
I forgot how to check if the "I/O Fetch Timeout" occurred because
the IOMMU already went "fatal" (DMA was attempted to an unmapped address).

FYI, I also found the C3000 service manual here:
    http://sysdoc.doors.ch/HP/lpv38336.pdf

and uploaded a copy to:
    ftp://ftp.parisc-linux.org/docs/platforms/c3000-service.pdf

TODO: add an entry to http://www.parisc-linux.org/documentation/

hth,
grant

^ permalink raw reply [flat|nested] 6+ messages in thread
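The chassis-code reading in the message above amounts to a small table lookup. The sketch below assumes, as that reading does when it matches 2cbf0 to CBF0, that only the low 16 bits of each logged value form the code. The table holds just the three meanings quoted in this thread from A2375-90004.pdf, and the helper name chassis_meaning is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct chassis_code {
    uint16_t code;
    const char *meaning;
};

/* Only the three codes explained in this thread; the handbook lists more. */
static const struct chassis_code known_codes[] = {
    { 0xCBF0, "HPMC handling initiated" },
    { 0xCBF2, "Invalid length for OS HPMC handler" },
    { 0xCBFC, "Branch to OS HPMC failed" },
};

/* Map one logged chassis code (hex text such as "2cbf0") to its meaning.
 * Assumption: the leading digit is a prefix and the code is the low 16 bits. */
static const char *chassis_meaning(const char *logged)
{
    unsigned long value = strtoul(logged, NULL, 16);
    uint16_t code = (uint16_t)(value & 0xFFFFu);
    size_t i;

    for (i = 0; i < sizeof known_codes / sizeof known_codes[0]; i++) {
        if (known_codes[i].code == code)
            return known_codes[i].meaning;
    }
    return "not explained in this thread";
}
```

Calling chassis_meaning() on each of the four logged values resolves 2cbf0, 2cbf2 and 2cbfc to the three meanings above and leaves 2500b unresolved, because the thread does not explain that code.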
* Re: random freezes B2000 running debian hppa lenny 2009-05-18 3:04 ` Grant Grundler @ 2009-05-18 9:34 ` Dirk Van Hertem 2009-05-18 16:35 ` Grant Grundler 0 siblings, 1 reply; 6+ messages in thread From: Dirk Van Hertem @ 2009-05-18 9:34 UTC (permalink / raw) To: Grant Grundler; +Cc: linux-parisc Hello Grant, Thank you for the response. I am sorry to say, but I more or less understand your email, yet I have no idea what to do with it... How do I proceed to get this fixed? I am willing to learn something about debugging, but I would need someone to hold my hand (I do not know C, I have only a basic understanding on how the kernel works,...). I have the impression that the problem is not gigantic, but might be something simple to solve, maybe even just patching the sata_promise.c file? Yet, I do not have an idea where and how to start looking... I can give you access to the machine if that would help (note that this would last only one hour or so, than it will hang automatically and I would need to reboot it ;). So my questions are: * Is this something that can be solved? (in a reasonable time frame, I want to use the hard disks for storage ;-)) * by me? (If so, how?) * Must I forward this to the maintainers of this promise card within the kernel, or is this a parisc thing? >> I attached the "ser pim" output to this email, I hope it helps. If you >> need any other information, please ask, I hope I'll be more responsive >> next time... > > HPMC Chassis Codes = 2cbf0 2500b 2cbf2 2cbfc > > Looking at: > ftp://ftp.parisc-linux.org/docs/platforms/A2375-90004.pdf > > CBF0 HPMC handling initiated. > CBF2 Invalid length for OS HPMC handler > CBFC Branch to OS HPMC failed > > Just means the linux HPMC handler didn't get called. Hrm. This worked once > upon a time and I thought got fixed 6-8 months ago. > > Next thing I look at is: > RUN_ADDR = 0xc1bff0fffed08040 > > So whatever is at 0xfffed08040 (40 bit addresses physically) > was the either the victim or the culprit. 
> Often this is a MMIO BAR
> plus some offset (probably 0x40). I suggest looking in the
> Controller driver for that offset and where it's used in the
> initialization
>

In sata_promise.c, there is the following code:

    /* per-port ATA register offsets (from ap->ioaddr.cmd_addr) */
    PDC_PKT_SUBMIT = 0x40, /* Command packet pointer addr */

This PDC_PKT_SUBMIT is then used again here:

static void pdc_packet_start(struct ata_queued_cmd *qc)
{
    struct ata_port *ap = qc->ap;
    struct pdc_port_priv *pp = ap->private_data;
    void __iomem *host_mmio = ap->host->iomap[PDC_MMIO_BAR];
    void __iomem *ata_mmio = ap->ioaddr.cmd_addr;
    unsigned int port_no = ap->port_no;
    u8 seq = (u8) (port_no + 1);

    VPRINTK("ENTER, ap %p\n", ap);

    writel(0x00000001, host_mmio + (seq * 4));
    readl(host_mmio + (seq * 4)); /* flush */

    pp->pkt[2] = seq;
    wmb(); /* flush PRD, pkt writes */
    writel(pp->pkt_dma, ata_mmio + PDC_PKT_SUBMIT);
    readl(ata_mmio + PDC_PKT_SUBMIT); /* flush */
}

This function is then used in case an ATA_PROT_DMA is called.
It seems that this might be the spot where the problem might be (as
you indicate further down). I will test (just for the sake of it) if it
will stop crashing if I turn DMA down (if that is possible with a raid
device)

> System Responder Path = 0x00ffffff0a010400
>
> This is supposed to match the HPA (Host Phys Address) of one of the
> devices that is listed at the beginning of the parisc-linux boot.
> I'm not sure it's accurate though.

I will try to check that this evening (I hope this will be something
that will appear in my minicom screen?)

>
> And then the last part of the PIM that's interesting basically confirms
> what we have been guessing:
>
> '9000/785 B,C,J Workstation HPMC PIM Analysis (per-CPU)', rev 0, 1304 bytes:
>
> A Data I/O Fetch Timeout occurred while CPU 0 was
> requesting information from a device at the path 10/1/4/0 (PCI slot 4).
>
> I forgot how to check if the "I/O Fetch Timeout" occurred because
> the IOMMU already went "fatal" (DMA was attempted to an unmapped address).
>
>
> FYI, I also found the C3000 service manual here:
>     http://sysdoc.doors.ch/HP/lpv38336.pdf
>
> and uploaded a copy to:
>     ftp://ftp.parisc-linux.org/docs/platforms/c3000-service.pdf
>
> TODO: add an entry to http://www.parisc-linux.org/documentation/
>
> hth,
> grant

Thanks again,

Dirk

-- 
Dirk Van Hertem                            Dirk.VanHertem@esat.kuleuven.be
Electrical Engineering Department          http://www.esat.kuleuven.be/electa
K.U. Leuven, ESAT-ELECTA                   tel: +32-16-32.18.95
10, Kasteelpark Arenberg, B-3001 Heverlee  fax: +32-16-32.19.85

^ permalink raw reply [flat|nested] 6+ messages in thread
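The RUN_ADDR arithmetic quoted in the message above can be sketched in a few
lines of user-space C. This is only an illustration and not code from the
thread: the function names are hypothetical, the 40-bit physical address width
is taken from the quoted analysis, and the 4 KiB alignment used to split the
address into a register-block base and an offset is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* From the quoted analysis: physical addresses are 40 bits wide. */
#define PHYS_ADDR_BITS 40

/* Assumption for illustration only: the MMIO block is 4 KiB aligned. */
#define MMIO_ALIGN 0x1000ULL

/* Keep only the low 40 bits of the 64-bit RUN_ADDR from the PIM dump. */
static uint64_t run_addr_to_phys(uint64_t run_addr)
{
    return run_addr & ((1ULL << PHYS_ADDR_BITS) - 1);
}

/* Guess the start of the register block by rounding down to the alignment. */
static uint64_t mmio_base_guess(uint64_t phys)
{
    return phys & ~(MMIO_ALIGN - 1);
}

/* Guess the register offset as the remainder within the aligned block. */
static uint64_t mmio_offset_guess(uint64_t phys)
{
    return phys & (MMIO_ALIGN - 1);
}
```

Feeding in RUN_ADDR = 0xc1bff0fffed08040 gives the physical address
0xfffed08040 and, under the alignment assumption, an offset of 0x40, which is
the value to search for among the register offsets in the driver.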
* Re: random freezes B2000 running debian hppa lenny
  2009-05-18  9:34 ` Dirk Van Hertem
@ 2009-05-18 16:35 ` Grant Grundler
  0 siblings, 0 replies; 6+ messages in thread
From: Grant Grundler @ 2009-05-18 16:35 UTC (permalink / raw)
  To: Dirk Van Hertem; +Cc: Grant Grundler, linux-parisc

On Mon, May 18, 2009 at 11:34:27AM +0200, Dirk Van Hertem wrote:
> Hello Grant,
>
> Thank you for the response.
>
> I am sorry to say, but I more or less understand your email, yet I have
> no idea what to do with it...
>
> How do I proceed to get this fixed?

1) Locate the use of 0x40 offset in the Promise SATA controller driver.
2) Narrow down which uses are likely to have been the "victim"
3) Look for dma map/unmap "leaks" - use of an address for DMA *after*
   it's been unmapped OR before it's been mapped.

> I am willing to learn something
> about debugging, but I would need someone to hold my hand (I do not know
> C, I have only a basic understanding on how the kernel works,...). I
> have the impression that the problem is not gigantic, but might be
> something simple to solve, maybe even just patching the sata_promise.c
> file? Yet, I do not have an idea where and how to start looking...

Yes, I think you can read the sata_promise.c. But after first glance,
I'm afraid this is not a trivial problem...but you can do some code review
to look for unmatched or missing dma_map_sg() and dma_unmap_sg() calls.

Here's a start of the steps above:
1) Locate the use of 0x40 offset in the Promise SATA controller driver.

    /* host register offsets (from host->iomap[PDC_MMIO_BAR]) */
    PDC_INT_SEQMASK = 0x40, /* Mask of asserted SEQ INTs */
    PDC_FLASH_CTL = 0x44, /* Flash control register */
...
static irqreturn_t pdc_interrupt(int irq, void *dev_instance)
{
...
    /* reading should also clear interrupts */
    mask = readl(host_mmio + PDC_INT_SEQMASK);
... [ does some bit frobbing ]
    writel(mask, host_mmio + PDC_INT_SEQMASK);

So the "victim" seems to be a normal read from a register.
Unlikely to be the problem. Likely *before* the interrupt was delivered,
the device had attempted to do DMA to an invalid DMA address. Since the
IOMMU lookup fails, the IOMMU goes "fatal" and stops forwarding MMIO
traffic to the PCI busses (including the Promise card in slot 4).

> I can give you access to the machine if that would help (note that this
> would last only one hour or so, then it will hang automatically and I
> would need to reboot it ;).

It won't help since the "ideal" way to debug this would be to attach
a PCI analyzer, collect a trace of the failure, then examine all the
DMA transactions preceding the failure.

The less ideal way is to stare at the code, a Promise SATA Programmers
Guide, and figure out how the device is supposed to work.

Also, I'd be looking extra carefully at the error handling paths. These
are notorious for not cleaning up correctly. In this case, "canceling" an
IO that is still in flight. Driver has to guarantee the SATA controller
will NEVER DMA to a chunk of memory that is not mapped for DMA.

> So my questions are:
> * Is this something that can be solved? (in a reasonable time frame, I
> want to use the hard disks for storage ;-))
> * by me? (If so, how?)
> * Must I forward this to the maintainers of this promise card within the
> kernel, or is this a parisc thing?

parisc exposes the bug. I'm pretty sure this is a sata_promise driver bug.
Forwarding to the promise maintainer and CC'ing linux-ide@vger.kernel.org
would probably be the best thing to start with.

You can still take a look through the code.

> >> I attached the "ser pim" output to this email, I hope it helps. If you
> >> need any other information, please ask, I hope I'll be more responsive
> >> next time...
> >
> > HPMC Chassis Codes = 2cbf0 2500b 2cbf2 2cbfc
> >
> > Looking at:
> >     ftp://ftp.parisc-linux.org/docs/platforms/A2375-90004.pdf
> >
> >     CBF0 HPMC handling initiated.
> >     CBF2 Invalid length for OS HPMC handler
> >     CBFC Branch to OS HPMC failed
> >
> > Just means the linux HPMC handler didn't get called. Hrm. This worked once
> > upon a time and I thought got fixed 6-8 months ago.
> >
> > Next thing I look at is:
> >     RUN_ADDR = 0xc1bff0fffed08040
> >
> > So whatever is at 0xfffed08040 (40 bit addresses physically)
> > was either the victim or the culprit. Often this is a MMIO BAR
> > plus some offset (probably 0x40). I suggest looking in the
> > Controller driver for that offset and where it's used in the
> > initialization
> >
>
> In sata_promise.c, there is the following code:
>
>     /* per-port ATA register offsets (from ap->ioaddr.cmd_addr) */
>     PDC_PKT_SUBMIT = 0x40, /* Command packet pointer addr */

Good! I stopped looking for 0x40 once I found PDC_INT_SEQMASK.
You could be right that this use of 0x40 is the victim. It's quite possible.
But the scenario I describe is still the same (DMA to invalid address and
then MMIO fails).

> This PDC_PKT_SUBMIT is then used again here:
>
> static void pdc_packet_start(struct ata_queued_cmd *qc)
> {
>     struct ata_port *ap = qc->ap;
>     struct pdc_port_priv *pp = ap->private_data;
>     void __iomem *host_mmio = ap->host->iomap[PDC_MMIO_BAR];
>     void __iomem *ata_mmio = ap->ioaddr.cmd_addr;
>     unsigned int port_no = ap->port_no;
>     u8 seq = (u8) (port_no + 1);
>
>     VPRINTK("ENTER, ap %p\n", ap);
>
>     writel(0x00000001, host_mmio + (seq * 4));
>     readl(host_mmio + (seq * 4)); /* flush */
>
>     pp->pkt[2] = seq;
>     wmb(); /* flush PRD, pkt writes */
>     writel(pp->pkt_dma, ata_mmio + PDC_PKT_SUBMIT);
>     readl(ata_mmio + PDC_PKT_SUBMIT); /* flush */
> }
>
> This function is then used in case an ATA_PROT_DMA is called.
> It seems that this might be the spot where the problem might be (as
> you indicate further down).
> I will test (just for the sake of it) if it
> will stop crashing if I turn DMA down (if that is possible with a raid
> device)

Things that can be tried:
o try to limit which buffers get used,
o leave more stale DMA mappings open longer (risks memory corruption)
o dump additional info (e.g. last 5 dma_map/dma_unmap parameters)
  in the HPMC handler (which currently isn't working in the kernel
  you used).

I don't know if these are beyond your ability. But "DMA mapping code"
in this case refers to drivers/parisc/sba_iommu.c . Take a look at that
so you have an idea of what is involved with DMA map/unmap code.

> > System Responder Path = 0x00ffffff0a010400
> >
> > This is supposed to match the HPA (Host Phys Address) of one of the
> > devices that is listed at the beginning of the parisc-linux boot.
> > I'm not sure it's accurate though.
>
> I will try to check that this evening (I hope this will be something
> that will appear in my minicom screen?)

Yes, it should be in the console output someplace.

> >
> > And then the last part of the PIM that's interesting basically confirms
> > what we have been guessing:
> >
> > '9000/785 B,C,J Workstation HPMC PIM Analysis (per-CPU)', rev 0, 1304 bytes:
> >
> > A Data I/O Fetch Timeout occurred while CPU 0 was
> > requesting information from a device at the path 10/1/4/0 (PCI slot 4).

I forgot to mention the "I/O Module Error Log" means something too:

Rope    Word1         Word2         Word3
------  ------------  ------------  ------------------
0       0x00000000    0x0e0cc009    0x00000000fed30048

It would be worth finding out what "Word3" (hint: search parisc-linux
mail archives) means again.

cheers,
grant

> >
> > I forgot how to check if the "I/O Fetch Timeout" occurred because
> > the IOMMU already went "fatal" (DMA was attempted to an unmapped address).
> >
> >
> > FYI, I also found the C3000 service manual here:
> >     http://sysdoc.doors.ch/HP/lpv38336.pdf
> >
> > and uploaded a copy to:
> >     ftp://ftp.parisc-linux.org/docs/platforms/c3000-service.pdf
> >
> > TODO: add an entry to http://www.parisc-linux.org/documentation/
> >
> > hth,
> > grant
>
> Thanks again,
>
> Dirk
>
> -- 
> Dirk Van Hertem                            Dirk.VanHertem@esat.kuleuven.be
> Electrical Engineering Department          http://www.esat.kuleuven.be/electa
> K.U. Leuven, ESAT-ELECTA                   tel: +32-16-32.18.95
> 10, Kasteelpark Arenberg, B-3001 Heverlee  fax: +32-16-32.19.85

^ permalink raw reply [flat|nested] 6+ messages in thread
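The suggestion above to dump the last 5 dma_map/dma_unmap parameters can be
pictured as a small ring buffer. What follows is a user-space sketch under
stated assumptions, not kernel code and not anything from sba_iommu.c or
sata_promise.c: every type and function name here is hypothetical, and a real
version would have to be hooked into the actual map/unmap paths and printed
from the HPMC handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_LOG_DEPTH 5 /* keep the last 5 map/unmap events */

enum dma_op { DMA_OP_MAP, DMA_OP_UNMAP };

struct dma_event {
    enum dma_op op;
    uint64_t addr; /* bus address handed to the device */
    size_t len;    /* length of the mapping in bytes */
};

struct dma_log {
    struct dma_event ev[DMA_LOG_DEPTH];
    unsigned int next;   /* slot the next event is written to */
    unsigned int filled; /* valid slots, saturates at DMA_LOG_DEPTH */
};

/* Record one event, overwriting the oldest entry once the log is full. */
static void dma_log_record(struct dma_log *log, enum dma_op op,
                           uint64_t addr, size_t len)
{
    assert(log != NULL);
    log->ev[log->next] = (struct dma_event){ op, addr, len };
    log->next = (log->next + 1) % DMA_LOG_DEPTH;
    if (log->filled < DMA_LOG_DEPTH)
        log->filled++;
}

/* Return the i-th most recent event (0 = newest), or NULL if none. */
static const struct dma_event *dma_log_recent(const struct dma_log *log,
                                              unsigned int i)
{
    if (i >= log->filled)
        return NULL;
    return &log->ev[(log->next + DMA_LOG_DEPTH - 1 - i) % DMA_LOG_DEPTH];
}

/*
 * Judged from the retained events only: return 1 if the newest event
 * covering addr is an unmap, i.e. a later DMA to addr would be a "leak".
 */
static int dma_log_last_was_unmap(const struct dma_log *log, uint64_t addr)
{
    for (unsigned int i = 0; i < log->filled; i++) {
        const struct dma_event *e = dma_log_recent(log, i);
        if (addr >= e->addr && addr - e->addr < e->len)
            return e->op == DMA_OP_UNMAP;
    }
    return 0; /* address not seen in the retained history */
}
```

A caller would start from a zero-initialized struct dma_log, call
dma_log_record() on every map and unmap, and walk dma_log_recent() from 0
upward when dumping. With only five entries the history is short, so an
address that is absent from the log says nothing about whether it is mapped.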
end of thread, other threads:[~2009-05-18 16:35 UTC | newest]
Thread overview: 6+ messages
[not found] <49FB108B.9030803@ieee.org>
2009-05-03 11:25 ` random freezes B2000 running debian hppa lenny Grant Grundler
2009-05-03 23:07 ` Dirk Van Hertem
2009-05-15 22:40 ` Dirk Van Hertem
2009-05-18 3:04 ` Grant Grundler
2009-05-18 9:34 ` Dirk Van Hertem
2009-05-18 16:35 ` Grant Grundler