* PROBLEM: 2.6.6 grinds to a halt with moderate I/O
@ 2004-06-15 15:47 Micah Anderson
2004-06-15 16:00 ` William Lee Irwin III
2004-06-16 7:43 ` Philippe Gramoullé
0 siblings, 2 replies; 5+ messages in thread
From: Micah Anderson @ 2004-06-15 15:47 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 13080 bytes --]
Following the format from REPORTING-BUGS, please see the information below.
I unfortunately cannot subscribe to the list, but will follow the thread. I
have searched high and low, read a number of threads somewhat tangential to
this problem, and asked a few times in #kernelnewbies before I got to my
wit's end, so now I will try here. I really appreciate any insight anyone has,
and will be happy to provide more information or run additional tests.
1. When doing moderate I/O on a 2.6.6 system the machine becomes unusable.
2. I found that with HIGHMEM support compiled into the kernel, when I did a
cp -vr /var /usr/tmp it would work fine until it got about halfway through the
large ldap.log file (approximately 500 megs) when the system would no longer
be able to fork new processes. Your existing shell would function, but
if you tried to run top, free, etc. it would hang. vmstat 1 would print
the first line, but never continue. I ran a million different kernel configs
to try and isolate things, and I thought I had it nailed down with passing
apic=off to the kernel at boot because the large logfile copy test would
pass, but when rsyncing maildirs tonight the same problem appeared. Early
in my tests I thought the problem was dm-crypt, but the problem existed
even when no encrypted filesystems were involved, and existed when I
removed dm-crypt support from the kernel. Disabling HIGHMEM support seems
to make the problem go away.
The machine requires a power cycle to get it back. Memory was memtested for over
24 hours. Machine is an HP netserver lh1000r with megaraid controller, no IDE.
3. kernel, i/o
4. Linux version 2.6.6 (root@willow) (gcc version 3.3.3 (Debian 20040422)) #9 SMP Fri Jun 11 17:43:06 PDT 2004
5. No oops available
6. See above for a reproducible test
7. Environment
7.1 Linux willow 2.6.6 #9 SMP Fri Jun 11 17:43:06 PDT 2004 i686 GNU/Linux
Gnu C 3.3.3
Gnu make 3.80
binutils 2.14.90.0.7
util-linux 2.12
mount 2.12
module-init-tools 3.0-pre10
e2fsprogs 1.35
PPP 2.4.2
Linux C Library 2.3.2
Dynamic linker (ldd) 2.3.2
Procps 3.2.1
Net-tools 1.60
Console-tools 0.2.3
Sh-utils 5.0.91
7.2 cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 6
cpu MHz : 933.936
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse
bogomips : 1843.20
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 10
cpu MHz : 933.936
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse
bogomips : 1863.68
7.3 No module support in kernel
7.4 cat /proc/ioports
0000-001f : dma1
0020-0021 : pic1
0040-005f : timer
0060-006f : keyboard
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
01f0-01f7 : ide0
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial
0cf8-0cff : PCI conf1
1400-14ff : 0000:00:07.0
1800-183f : 0000:00:02.0
1800-183f : e100
1840-187f : 0000:00:08.0
1840-187f : e100
1880-188f : 0000:00:0f.1
2000-20ff : 0000:01:05.0
2400-24ff : 0000:01:05.1
3000-3fff : PCI Bus #02
3000-30ff : 0000:02:01.0
cat /proc/iomem
00000000-0009f7ff : System RAM
0009f800-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000ca7ff : Video ROM
000ca800-000cbfff : Adapter ROM
000cc000-000cd7ff : Adapter ROM
000cd800-000cdfff : Adapter ROM
000ce000-000cfdff : Adapter ROM
000f0000-000fffff : System ROM
00100000-7fffffff : System RAM
00100000-002f88b1 : Kernel code
002f88b2-0038725f : Kernel data
e8001000-e8001fff : 0000:00:02.0
e8001000-e8001fff : e100
e8002000-e8002fff : 0000:00:07.0
e8003000-e8003fff : 0000:00:08.0
e8003000-e8003fff : e100
e8004000-e8004fff : 0000:00:0f.2
e8100000-e81fffff : 0000:00:02.0
e8100000-e81fffff : e100
e8200000-e82fffff : 0000:00:08.0
e8200000-e82fffff : e100
e9000000-e9ffffff : 0000:00:07.0
ea000000-ea001fff : 0000:01:05.0
ea002000-ea003fff : 0000:01:05.1
ea004000-ea0043ff : 0000:01:05.0
ea004400-ea0047ff : 0000:01:05.1
ea100000-ea1fffff : PCI Bus #02
ea100000-ea100fff : 0000:02:01.0
f0000000-f7ffffff : PCI Bus #02
f0000000-f7ffffff : PCI Bus #03
f0000000-f7ffffff : 0000:03:00.0
f0000000-f000007f : megaraid
fec00000-fec0ffff : reserved
fee00000-fee00fff : reserved
fff80000-ffffffff : reserved
7.5 lspci -vvv
0000:00:00.0 Host bridge: ServerWorks CNB20LE Host Bridge (rev 05)
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
Latency: 48, Cache Line Size: 0x08 (32 bytes)
0000:00:00.1 Host bridge: ServerWorks CNB20LE Host Bridge (rev 05)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32, Cache Line Size: 0x08 (32 bytes)
0000:00:02.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
Subsystem: Hewlett-Packard Company NetServer 10/100TX
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64 (2000ns min, 14000ns max), Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 22
Region 0: Memory at e8001000 (32-bit, non-prefetchable)
Region 1: I/O ports at 1800 [size=64]
Region 2: Memory at e8100000 (32-bit, non-prefetchable) [size=1M]
Capabilities: <available only to root>
0000:00:07.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 65) (prog-if 00 [VGA])
Subsystem: Hewlett-Packard Company: Unknown device 10e1
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 66 (2000ns min), Cache Line Size: 0x08 (32 bytes)
Region 0: Memory at e9000000 (32-bit, non-prefetchable)
Region 1: I/O ports at 1400 [size=256]
Region 2: Memory at e8002000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <available only to root>
0000:00:08.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
Subsystem: Hewlett-Packard Company NetServer 10/100TX
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64 (2000ns min, 14000ns max), Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 23
Region 0: Memory at e8003000 (32-bit, non-prefetchable)
Region 1: I/O ports at 1840 [size=64]
Region 2: Memory at e8200000 (32-bit, non-prefetchable) [size=1M]
Capabilities: <available only to root>
0000:00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 4f)
Subsystem: ServerWorks OSB4 South Bridge
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR+ <PERR-
Latency: 0
0000:00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller (prog-if 8a [Master SecP PriP])
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32
Region 4: I/O ports at 1880 [size=16]
0000:00:0f.2 USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 04) (prog-if 10 [OHCI])
Subsystem: ServerWorks OSB4/CSB5 OHCI USB Controller
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Interrupt: pin A routed to IRQ 33
Region 0: Memory at e8004000 (32-bit, non-prefetchable)
0000:01:02.0 PCI bridge: Intel Corp. 21154 PCI-to-PCI Bridge (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64
Bus: primary=01, secondary=02, subordinate=03, sec-latency=36
I/O behind bridge: 00003000-00003fff
Memory behind bridge: ea100000-ea1fffff
Prefetchable memory behind bridge: 00000000f0000000-00000000f7f00000
Expansion ROM at 00003000 [disabled] [size=4K]
BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
Capabilities: <available only to root>
0000:01:05.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 Ultra3 SCSI Adapter (rev 01)
Subsystem: Hewlett-Packard Company: Unknown device 60b0
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 72 (4250ns min, 4500ns max), Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 24
Region 0: I/O ports at 2000
Region 1: Memory at ea004000 (64-bit, non-prefetchable) [size=1K]
Region 3: Memory at ea000000 (64-bit, non-prefetchable) [size=8K]
Capabilities: <available only to root>
0000:01:05.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 Ultra3 SCSI Adapter (rev 01)
Subsystem: Hewlett-Packard Company: Unknown device 60b0
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 72 (4250ns min, 4500ns max), Cache Line Size: 0x08 (32 bytes)
Interrupt: pin B routed to IRQ 25
Region 0: I/O ports at 2400
Region 1: Memory at ea004400 (64-bit, non-prefetchable) [size=1K]
Region 3: Memory at ea002000 (64-bit, non-prefetchable) [size=8K]
Capabilities: <available only to root>
0000:02:00.0 PCI bridge: Intel Corp. 21154 PCI-to-PCI Bridge (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64
Bus: primary=02, secondary=03, subordinate=03, sec-latency=36
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: fff00000-000fffff
Prefetchable memory behind bridge: 00000000f0000000-00000000f7f00000
BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
Capabilities: <available only to root>
0000:02:01.0 SCSI storage controller: QLogic Corp. ISP12160 Dual Channel Ultra3 SCSI Processor (rev 06)
Subsystem: American Megatrends Inc. QLA12160 on AMI MegaRAID
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64 (16000ns min), Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 29
Region 0: I/O ports at 3000
Region 1: Memory at ea100000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <available only to root>
0000:03:00.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 25)
Subsystem: Hewlett-Packard Company: Unknown device 60e8
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64, Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 16
Region 0: Memory at f0000000 (32-bit, prefetchable)
Capabilities: <available only to root>
7.6 cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: MegaRAID Model: LD 0 RAID5 140G Rev: K
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 05 Id: 11 Lun: 00
Vendor: SDR Model: GEM318 Rev: 0
Type: Processor ANSI SCSI revision: 02
7.7 Machine is an HP netserver lh1000r with megaraid controller, no IDE.
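The symptom in item 2 (an existing shell keeps working, but newly started programs hang) can be timed with a small probe run alongside the copy. What follows is a minimal Python sketch and not part of the original report: `can_spawn` and `probe` are hypothetical helper names, and it assumes a Python 3 interpreter is installed on the box. If process creation blocks inside the kernel, the timeout may never fire, in which case the last timestamp printed marks the onset of the hang.

```python
import subprocess
import sys
import time


def can_spawn(timeout_s=5.0):
    """Start and reap a trivial child process.

    Returns False if the child cannot be started, exits non-zero,
    or does not finish within timeout_s seconds.
    """
    try:
        subprocess.run([sys.executable, "-c", "pass"],
                       timeout=timeout_s, check=True)
        return True
    except (subprocess.TimeoutExpired,
            subprocess.CalledProcessError, OSError):
        return False


def probe(samples, interval_s=1.0, timeout_s=5.0):
    """Attempt a spawn `samples` times and log each result with a timestamp."""
    results = []
    for _ in range(samples):
        started = time.time()
        ok = can_spawn(timeout_s)
        results.append((started, ok))
        stamp = time.strftime("%H:%M:%S", time.localtime(started))
        # flush so the last line is visible even if the box wedges afterwards
        print(f"{stamp} spawn_ok={ok}", flush=True)
        time.sleep(interval_s)
    return results
```

One possible use: start `probe(600)` on a serial console or in a second shell, then launch the `cp -vr /var /usr/tmp` test and note the time at which `spawn_ok` turns False or the output stops.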
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: PROBLEM: 2.6.6 grinds to a halt with moderate I/O
2004-06-15 15:47 PROBLEM: 2.6.6 grinds to a halt with moderate I/O Micah Anderson
@ 2004-06-15 16:00 ` William Lee Irwin III
2004-06-15 18:19 ` Micah Anderson
2004-06-16 7:43 ` Philippe Gramoullé
1 sibling, 1 reply; 5+ messages in thread
From: William Lee Irwin III @ 2004-06-15 16:00 UTC (permalink / raw)
To: Micah Anderson; +Cc: linux-kernel
On Tue, Jun 15, 2004 at 10:47:45AM -0500, Micah Anderson wrote:
> Following the format from REPORTING-BUGS please see the below information.
> I unfortunately cannot subscribe to the list, but will follow the thread. I
> have searched high and low, read a number of threads somewhat tangential to
> this problem, and asked a few times in #kernelnewbies before I got to my
> wits end and now will try here. I really appreciate any insight anyone has,
> and will be happy to provide more information or additional tests
> 1. When doing moderate I/O on a 2.6.6 system the machine becomes unusable.
> 2. I found that with HIGHMEM support compiled into the kernel, when I
> did a cp -vr /var /usr/tmp it would work fine until it got about
> halfway through the large ldap.log file (approximately 500 megs) when
> the system would no longer be able to fork new processes. Your
> existing shell would function, but if you tried to run top, free, etc.
> it would hang. vmstat 1 would print the first line, but never
> continue. I ran a million different kernel configs to try and isolate
> things, and I thought I had it nailed down with passing apic=off to
> the kernel at boot because the large logfile copy test would
> pass, but when rsyncing maildirs tonight the same problem appeared. Early
> in my tests I thought the problem was dm-crypt, but the problem existed
> even when no encrypted filesystems were involved, and existed when I
> removed dm-crypt support from the kernel. Disabling HIGHMEM support seems
> to make the problem go away.
Thanks for the bugreport. I'm going to file this in the Debian BTS
after I get the FPU fixes out. Could you send along a dmesg
(/var/log/dmesg on Debian) and /proc/meminfo and /proc/cpuinfo at some
point when you can log into the box? I'll also try to reproduce this.
Thanks.
-- wli
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: PROBLEM: 2.6.6 grinds to a halt with moderate I/O
2004-06-15 16:00 ` William Lee Irwin III
@ 2004-06-15 18:19 ` Micah Anderson
2004-06-15 18:30 ` William Lee Irwin III
0 siblings, 1 reply; 5+ messages in thread
From: Micah Anderson @ 2004-06-15 18:19 UTC (permalink / raw)
To: William Lee Irwin III, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 15475 bytes --]
>Thanks for the bugreport. I'm going to file this in the Debian BTS
>after I get the FPU fixes out. Could you send along a dmesg
>(/var/log/dmesg on Debian) and /proc/meminfo and /proc/cpuinfo at some
>point when you can log into the box? I'll also try to reproduce this.
I am not sure why this would be filed in the Debian BTS, yes the
underlying OS is Debian, but this is not a Debian Kernel, it is a
vanilla 2.6.6 kernel that I compiled by hand.
Please find attached the dmesg and the /proc/meminfo, the
/proc/cpuinfo was already included in the original email.
1. dmesg
Linux version 2.6.6 (root@willow) (gcc version 3.3.3 (Debian 20040422)) #10 SMP Tue Jun 15 09:25:44 PDT 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e9400 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000080000000 (usable)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
1152MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000f7580
On node 0 totalpages: 524288
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 225280 pages, LIFO batch:16
HighMem zone: 294912 pages, LIFO batch:16
DMI 2.3 present.
ACPI: Unable to locate RSDP
Intel MultiProcessor Specification v1.4
Virtual Wire compatibility mode.
OEM ID: HP Product ID: LP 1Kr/2Kr APIC at: 0xFEE00000
Processor #0 6:8 APIC version 17
Processor #3 6:8 APIC version 17
I/O APIC #1 Version 17 at 0xFEC00000.
I/O APIC #2 Version 17 at 0xFEC01000.
Enabling APIC mode: Flat. Using 2 I/O APICs
Processors: 2
Built 1 zonelists
Kernel command line: apic=off root=/dev/sda1 ro
Initializing CPU#0
CPU 0 irqstacks, hard=c03e4000 soft=c03e2000
PID hash table entries: 4096 (order 12: 32768 bytes)
Detected 934.115 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Memory: 2076156k/2097152k available (1991k kernel code, 19820k reserved, 565k data, 368k init, 1179648k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 1843.20 BogoMIPS
Dentry cache hash table entries: 262144 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 131072 (order: 7, 524288 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 0387fbff 00000000 00000000 00000000
CPU: After vendor identify, caps: 0387fbff 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
CPU serial number disabled.
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
CPU0: Intel Pentium III (Coppermine) stepping 06
per-CPU timeslice cutoff: 731.34 usecs.
task migration cache decay timeout: 1 msecs.
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Booting processor 1/3 eip 2000
CPU 1 irqstacks, hard=c03e5000 soft=c03e3000
Initializing CPU#1
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 1863.68 BogoMIPS
CPU: After generic identify, caps: 0387fbff 00000000 00000000 00000000
CPU: After vendor identify, caps: 0387fbff 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
CPU serial number disabled.
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel Pentium III (Coppermine) stepping 0a
Total of 2 processors activated (3706.88 BogoMIPS).
ENABLING IO-APIC IRQs
Setting 1 in the phys_id_present_map
...changing IO-APIC physical APIC ID to 1 ... ok.
Setting 2 in the phys_id_present_map
...changing IO-APIC physical APIC ID to 2 ... ok.
init IO_APIC IRQs
IO-APIC (apicid-pin) 1-0, 1-5, 1-9, 1-10, 1-11, 1-15, 2-1, 2-2, 2-3, 2-4, 2-5, 2-10, 2-11, 2-12, 2-14, 2-15 not connected.
..TIMER: vector=0x31 pin1=-1 pin2=0
...trying to set up timer (IRQ0) through the 8259A ...
..... (found pin 0) ...works.
number of MP IRQ sources: 18.
number of IO-APIC #1 registers: 16.
number of IO-APIC #2 registers: 16.
testing the IO APIC.......................
IO APIC #1......
.... register #00: 01000000
....... : physical APIC id: 01
....... : Delivery Type: 0
....... : LTS : 0
.... register #01: 000F0011
....... : max redirection entries: 000F
....... : PRQ implemented: 0
....... : IO APIC version: 0011
.... register #02: 00000000
....... : arbitration: 00
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 001 01 0 0 0 0 0 1 1 31
01 001 01 0 0 0 0 0 1 1 39
02 000 00 1 0 0 0 0 0 0 00
03 001 01 0 0 0 0 0 1 1 41
04 001 01 0 0 0 0 0 1 1 49
05 000 00 1 0 0 0 0 0 0 00
06 001 01 0 0 0 0 0 1 1 51
07 001 01 0 0 0 0 0 1 1 59
08 001 01 0 0 0 0 0 1 1 61
09 000 00 1 0 0 0 0 0 0 00
0a 000 00 1 0 0 0 0 0 0 00
0b 000 00 1 0 0 0 0 0 0 00
0c 001 01 0 0 0 0 0 1 1 69
0d 001 01 0 0 0 0 0 1 1 71
0e 001 01 0 0 0 0 0 1 1 79
0f 000 00 1 0 0 0 0 0 0 00
IO APIC #2......
.... register #00: 02000000
....... : physical APIC id: 02
....... : Delivery Type: 0
....... : LTS : 0
.... register #01: 000F0011
....... : max redirection entries: 000F
....... : PRQ implemented: 0
....... : IO APIC version: 0011
.... register #02: 0D000000
....... : arbitration: 0D
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 001 01 1 1 0 1 0 1 1 81
01 000 00 1 0 0 0 0 0 0 00
02 000 00 1 0 0 0 0 0 0 00
03 000 00 1 0 0 0 0 0 0 00
04 000 00 1 0 0 0 0 0 0 00
05 000 00 1 0 0 0 0 0 0 00
06 001 01 1 1 0 1 0 1 1 89
07 001 01 1 1 0 1 0 1 1 91
08 001 01 1 1 0 1 0 1 1 99
09 001 01 1 1 0 1 0 1 1 A1
0a 000 00 1 0 0 0 0 0 0 00
0b 000 00 1 0 0 0 0 0 0 00
0c 000 00 1 0 0 0 0 0 0 00
0d 001 01 1 1 0 1 0 1 1 A9
0e 000 00 1 0 0 0 0 0 0 00
0f 000 00 1 0 0 0 0 0 0 00
IRQ to pin mappings:
IRQ0 -> 0:0
IRQ1 -> 0:1
IRQ2 -> 0:2
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ16 -> 1:0
IRQ22 -> 1:6
IRQ23 -> 1:7
IRQ24 -> 1:8
IRQ25 -> 1:9
IRQ29 -> 1:13
.................................... done.
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 933.0294 MHz.
..... host bus clock speed is 133.0327 MHz.
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfda11, last bus=3
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
Linux Plug and Play Support v0.97 (c) Adam Belay
SCSI subsystem initialized
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Discovered peer bus 01
PCI->APIC IRQ transform: (B0,I2,P0) -> 22
PCI->APIC IRQ transform: (B0,I8,P0) -> 23
PCI->APIC IRQ transform: (B0,I15,P0) -> 33
PCI->APIC IRQ transform: (B1,I5,P0) -> 24
PCI->APIC IRQ transform: (B1,I5,P1) -> 25
PCI->APIC IRQ transform: (B2,I1,P0) -> 29
PCI->APIC IRQ transform: (B3,I0,P0) -> 16
Machine check exception polling timer started.
Starting balanced_irq
highmem bounce pool size: 64 pages
Initializing Cryptographic API
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
Using anticipatory io scheduler
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
e100: Intel(R) PRO/100 Network Driver, 3.0.17
e100: Copyright(c) 1999-2004 Intel Corporation
e100: eth0: e100_probe: addr 0xe8001000, irq 22, MAC addr 00:30:6E:05:E9:D0
e100: eth1: e100_probe: addr 0xe8003000, irq 23, MAC addr 00:30:6E:05:E9:D1
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
hda: CD-224E, ATAPI CD/DVD-ROM drive
hdc: IRQ probe failed (0xffffffba)
hdc: IRQ probe failed (0xffffffba)
hdd: IRQ probe failed (0xffffffba)
hdd: IRQ probe failed (0xffffffba)
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 24X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.20
megaraid: found 0x101e:0x1960:bus 3:slot 0:func 0
scsi0:Found MegaRAID controller at 0xf8804000, IRQ:16
megaraid: [K01.04:J01.01] detected 1 logical drives.
megaraid: supports extended CDBs.
megaraid: channel[0] is raid.
megaraid: channel[1] is raid.
scsi0 : LSI Logic MegaRAID K01.04 254 commands 16 targs 5 chans 7 luns
scsi0: scanning scsi channel 0 for logical drives.
Vendor: MegaRAID Model: LD 0 RAID5 140G Rev: K
Type: Direct-Access ANSI SCSI revision: 02
scsi0: scanning scsi channel 1 for logical drives.
scsi0: scanning scsi channel 2 for logical drives.
scsi0: scanning scsi channel 4 [P0] for physical devices.
scsi0: scanning scsi channel 5 [P1] for physical devices.
Vendor: SDR Model: GEM318 Rev: 0
Type: Processor ANSI SCSI revision: 02
SCSI device sda: 286744576 512-byte hdwr sectors (146813 MB)
sda: asking for cache data failed
sda: assuming drive cache: write through
sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 >
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0, type 0
Attached scsi generic sg1 at scsi0, channel 5, id 11, lun 0, type 3
mice: PS/2 mouse device common for all mice
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
input: AT Translated Set 2 keyboard on isa0060/serio0
md: linear personality registered as nr 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
8regs : 1704.000 MB/sec
8regs_prefetch: 1364.000 MB/sec
32regs : 900.000 MB/sec
32regs_prefetch: 796.000 MB/sec
pIII_sse : 1900.000 MB/sec
pII_mmx : 2340.000 MB/sec
p5_mmx : 2504.000 MB/sec
raid5: using function: pIII_sse (1900.000 MB/sec)
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
NET: Registered protocol family 2
IP: routing cache hash table of 16384 buckets, 128Kbytes
TCP: Hash tables configured (established 524288 bind 65536)
ip_conntrack version 2.1 (8192 buckets, 65536 max) - 300 bytes per conntrack
ip_tables: (C) 2000-2002 Netfilter core team
ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>. http://snowman.net/projects/ipt_recent/
arp_tables: (C) 2002 David S. Miller
NET: Registered protocol family 1
NET: Registered protocol family 17
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 368k freed
Adding 979920k swap on /dev/sda6. Priority:-1 extents:1
EXT3 FS on sda1, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda7, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda8, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex
2. /proc/meminfo
MemTotal: 2077312 kB
MemFree: 1583540 kB
Buffers: 27804 kB
Cached: 374892 kB
SwapCached: 0 kB
Active: 65300 kB
Inactive: 375884 kB
HighTotal: 1179648 kB
HighFree: 765056 kB
LowTotal: 897664 kB
LowFree: 818484 kB
SwapTotal: 979920 kB
SwapFree: 979920 kB
Dirty: 64 kB
Writeback: 0 kB
Mapped: 51084 kB
Slab: 41584 kB
Committed_AS: 184572 kB
PageTables: 1032 kB
VmallocTotal: 114680 kB
VmallocUsed: 788 kB
VmallocChunk: 113892 kB
On Tue, 15 Jun 2004, William Lee Irwin III wrote:
> On Tue, Jun 15, 2004 at 10:47:45AM -0500, Micah Anderson wrote:
> > Following the format from REPORTING-BUGS please see the below information.
> > I unfortunately cannot subscribe to the list, but will follow the thread. I
> > have searched high and low, read a number of threads somewhat tangential to
> > this problem, and asked a few times in #kernelnewbies before I got to my
> > wits end and now will try here. I really appreciate any insight anyone has,
> > and will be happy to provide more information or additional tests
> > 1. When doing moderate I/O on a 2.6.6 system the machine becomes unusable.
> > 2. I found that with HIGHMEM support compiled into the kernel, when I
> > did a cp -vr /var /usr/tmp it would work fine until it got about
> > halfway through the large ldap.log file (approximately 500 megs) when
> > the system would no longer be able to fork new processes. Your
> > existing shell would function, but if you tried to run top, free, etc.
> > it would hang. vmstat 1 would print the first line, but never
> > continue. I ran a million different kernel configs to try and isolate
> > things, and I thought I had it nailed down with passing apic=off to
> > the kernel at boot because the large logfile copy test would
> > pass, but when rsyncing maildirs tonight the same problem appeared. Early
> > in my tests I thought the problem was dm-crypt, but the problem existed
> > even when no encrypted filesystems were involved, and existed when I
> > removed dm-crypt support from the kernel. Disabling HIGHMEM support seems
> > to make the problem go away.
>
> Thanks for the bugreport. I'm going to file this in the Debian BTS
> after I get the FPU fixes out. Could you send along a dmesg
> (/var/log/dmesg on Debian) and /proc/meminfo and /proc/cpuinfo at some
> point when you can log into the box? I'll also try to reproduce this.
>
> Thanks.
>
>
> -- wli
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
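The zone sizes in the dmesg and the totals in /proc/meminfo posted in the message above can be cross-checked against each other. The following is a minimal Python sketch, not something from the thread: the helper names are hypothetical, the numbers are copied from the posted output, and the 4 kB page size is an assumption (the i386 default).

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, the i386 default

# Zone sizes in pages, copied from the dmesg in the message above.
ZONE_PAGES = {"DMA": 4096, "Normal": 225280, "HighMem": 294912}

# A subset of the /proc/meminfo lines from the message above.
MEMINFO_SAMPLE = """\
MemTotal: 2077312 kB
HighTotal: 1179648 kB
HighFree: 765056 kB
LowTotal: 897664 kB
LowFree: 818484 kB
"""


def parse_meminfo(text):
    """Return {field: value_in_kB} for every 'Name: <n> kB' line."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"(\w+):\s+(\d+) kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def pages_to_mb(pages):
    """Convert a page count to whole megabytes."""
    return pages * PAGE_KB // 1024


def free_fraction(info, zone):
    """Fraction of the 'Low' or 'High' region that is free."""
    return info[zone + "Free"] / info[zone + "Total"]


if __name__ == "__main__":
    info = parse_meminfo(MEMINFO_SAMPLE)
    print("lowmem MB: ", pages_to_mb(ZONE_PAGES["DMA"] + ZONE_PAGES["Normal"]))
    print("highmem MB:", pages_to_mb(ZONE_PAGES["HighMem"]))
    print("low free:  %.1f%%" % (100 * free_fraction(info, "Low")))
    print("high free: %.1f%%" % (100 * free_fraction(info, "High")))
```

Under these assumptions the DMA and Normal zones add up to the 896MB LOWMEM figure and the HighMem zone to the 1152MB HIGHMEM figure that the kernel prints at boot. Note that this meminfo snapshot was taken on a freshly booted, idle box; sampling LowFree while the copy is running would be the more telling measurement.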
* Re: PROBLEM: 2.6.6 grinds to a halt with moderate I/O
2004-06-15 18:19 ` Micah Anderson
@ 2004-06-15 18:30 ` William Lee Irwin III
0 siblings, 0 replies; 5+ messages in thread
From: William Lee Irwin III @ 2004-06-15 18:30 UTC (permalink / raw)
To: Micah Anderson; +Cc: linux-kernel
At some point in the past, I wrote:
>> Thanks for the bugreport. I'm going to file this in the Debian BTS
>> after I get the FPU fixes out. Could you send along a dmesg
>> (/var/log/dmesg on Debian) and /proc/meminfo and /proc/cpuinfo at some
>> point when you can log into the box? I'll also try to reproduce this.
On Tue, Jun 15, 2004 at 01:19:08PM -0500, Micah Anderson wrote:
> I am not sure why this would be filed in the Debian BTS, yes the
> underlying OS is Debian, but this is not a Debian Kernel, it is a
> vanilla 2.6.6 kernel that I compiled by hand.
The debian kernel team is migrating Debian's 2.6 as close to mainline
as is possible within policy guidelines, so it'll be applicable to it
hopefully in the next 24 hours.
On Tue, Jun 15, 2004 at 01:19:08PM -0500, Micah Anderson wrote:
> Please find attached the dmesg and the /proc/meminfo, the
> /proc/cpuinfo was already included in the original email.
Okay, thanks. I'll do some testing of copying large files shortly.
-- wli
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: PROBLEM: 2.6.6 grinds to a halt with moderate I/O
2004-06-15 15:47 PROBLEM: 2.6.6 grinds to a halt with moderate I/O Micah Anderson
2004-06-15 16:00 ` William Lee Irwin III
@ 2004-06-16 7:43 ` Philippe Gramoullé
1 sibling, 0 replies; 5+ messages in thread
From: Philippe Gramoullé @ 2004-06-16 7:43 UTC (permalink / raw)
To: Micah Anderson; +Cc: linux-kernel
Hello Micah,
Could you give more information on the Megaraid part of your setup?
Are the i/o made on a disk/partition controlled by the megaraid?
If not, does the same behavior occur on a plain scsi disk?
If yes, what kind of RAID level do you use? How many disks?
Could you give megaraid firmware information as well as logical volume settings
regarding read, write and cache policy.
Also, is it a regression over previous kernels, like 2.6.5 or even earlier kernels?
I've been using 2.6.3-mm3 for weeks now with DELL hardware and a megaraid controller
doing intensive i/o around the clock without any problems. The box has been just rock solid.
Thanks,
Philippe
On Tue, 15 Jun 2004 10:47:45 -0500 Micah Anderson <micah@riseup.net> wrote:
|
| Following the format from REPORTING-BUGS please see the below information.
| I unfortunately cannot subscribe to the list, but will follow the thread. I
| have searched high and low, read a number of threads somewhat tangential to
| this problem, and asked a few times in #kernelnewbies before I got to my
| wits end and now will try here. I really appreciate any insight anyone has,
| and will be happy to provide more information or additional tests
| [snip]
^ permalink raw reply [flat|nested] 5+ messages in thread