From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262319AbVCVEU3 (ORCPT ); Mon, 21 Mar 2005 23:20:29 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S262361AbVCVEP3 (ORCPT ); Mon, 21 Mar 2005 23:15:29 -0500
Received: from smtp1.dejazzd.com ([66.109.229.7]:11440 "EHLO smtp1.dejazzd.com") by vger.kernel.org with ESMTP id S262384AbVCVD6h (ORCPT ); Mon, 21 Mar 2005 22:58:37 -0500
Message-ID: <423F5152.2010303@ser1.net>
Date: Mon, 21 Mar 2005 17:57:22 -0500
From: Sean Russell
User-Agent: Mozilla Thunderbird 1.0 (X11/20050226)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: linux-kernel@vger.kernel.org
Subject: 2.6.1[01] freeze on x86_64
Content-Type: multipart/mixed; boundary="------------000200040306070606030108"
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

This is a multi-part message in MIME format.
--------------000200040306070606030108
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hello,

One liner: I'm getting mysterious (to me), almost random hard freezes of the kernel running 2.6.10 and 2.6.11.

Kernel version: Linux version 2.6.11-gentoo-r3 (root@ender) (gcc version 3.4.2 (Gentoo Linux 3.4.2-r2, ssp-3.4.1-1, pie-8.7.6.5))

Mark Nipper posted a message on March 5 regarding some mysterious kernel lockups which he didn't get a response to (I've contacted him about it). Since I'm having what I think is the same problem, I thought I'd post a message so he's not just a single lonely voice in the dark.

Mark and I have similar set-ups. We're both running x86_64 kernels, and ReiserFS3. He's running Debian, I'm running Gentoo. We haven't compared kernel config files yet; it might mean something to him, but to be honest, I barely know enough to compile my own kernels and wouldn't know where to begin to look for the problem.

Mark has only encountered this on 2.6.11, but I don't think he's tried any other kernel versions on x86_64; I get this problem on both 2.6.10 and 2.6.11. I didn't see the problem on 2.6.9.

In both of our cases, the kernel locks up and requires a power cycle to get it back. We're not able to SSH into our machines, and we get no response from any of the input devices. Furthermore, even with full debugging turned on, there are no messages in the log file that appear to be related to the lockup.

In my logs, the last message before the crash is always (that I've noticed) an ACPI error:

    acpi_thermal-0400 [23] acpi_thermal_get_trip_: Invalid active threshold [0]

but this message appears a lot in my logs, so I think it is a coincidence. For Mark, the last message was some ReiserFS message.

Mark feels like the error is ReiserFS related, and I was pretty sure it was swap related, until I turned off all swap partitions and the problem still occurred. I *may* try converting all of my filesystems to something else if somebody knowledgeable thinks it could be the problem, but I'm guessing it is something deeper in; I've never seen a filesystem-related problem that caused a lock-up like this.

I still feel that this may be memory related. When I turn off swap, or when I drastically reduce my memory use, my laptop can run for hours, or even days with little use. On the other hand, it can freeze up after five minutes, even before KDE has finished loading completely, with the swap on. However, I haven't found a situation where it won't, eventually, lock up. But I can't really pin it down, so I don't know where the problem is.

I haven't noticed the lockups without X, but I haven't run for any great length of time without X. I'm running the ATI proprietary drivers, but even when I revert to the XOrg ATI drivers (non-proprietary), I still get the lockups.

I'm really sorry that I can't provide more information; I'm usually not totally incompetent at narrowing down problems in software, but I have no idea where to even start looking for the problem here. If there are any things I should try that might provide more information, please let me know.

I'm attaching my kernel config, plus all of the info from /proc that is suggested by the FAQ to be included. I'll be happy to recompile my kernel with other options, if I can get some hints at starting points; I doubt my changing flags at random will help much.

Thanks, in advance.

Sean Russell

--------------000200040306070606030108
Content-Type: text/plain; name="cpuinfo.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="cpuinfo.txt"

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 4
model name      : AMD Athlon(tm) 64 Processor 3400+
stepping        : 10
cpu MHz         : 801.849
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm 3dnowext 3dnow
bogomips        : 1572.86
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

--------------000200040306070606030108
Content-Type: text/plain; name="iomem.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="iomem.txt"

00000000-0009f7ff : System RAM
0009f800-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000cefff : Video ROM
000cf000-000cffff : Adapter ROM
000f0000-000fffff : System ROM
00100000-3feeffff : System RAM
  00100000-003ae374 : Kernel code
  003ae375-004e0d27 : Kernel data
3fef0000-3fef9fff : ACPI Tables
3fefa000-3fefffff : ACPI Non-volatile Storage
3ff00000-3fffffff : reserved
40000000-40000fff : 0000:00:05.0
  40000000-40000fff : ipw2200
40001000-40001fff : 0000:00:0c.0
d0000000-d0003fff : 0000:00:06.0
d0004000-d0004fff : 0000:00:0e.0
d0005000-d0005fff : 0000:00:0e.0
d0006000-d0006fff : 0000:00:0e.1
d0007000-d0007fff : 0000:00:0e.1
d0008000-d00087ff : 0000:00:06.0
  d0008000-d00087ff : ohci1394
d0008800-d00088ff : 0000:00:08.0
  d0008800-d00088ff : r8169
d0008c00-d0008cff : 0000:00:10.3
  d0008c00-d0008cff : ehci_hcd
d0100000-d01fffff : PCI Bus #01
  d0100000-d010ffff : 0000:01:00.0
    d0100000-d010ffff : radeonfb
d8000000-dfffffff : PCI Bus #01
  d8000000-dfffffff : 0000:01:00.0
    d8000000-dfffffff : radeonfb
e0000000-efffffff : 0000:00:00.0
  e0000000-efffffff : aperture
fffe0000-ffffffff : reserved

--------------000200040306070606030108
Content-Type: text/plain; name="ioports.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="ioports.txt"

0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
0376-0376 : ide1
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial
0cf8-0cff : PCI conf1
1000-10ff : 0000:00:08.0
  1000-10ff : r8169
1400-14ff : 0000:00:11.5
  1400-14ff : VIA8233
1800-18ff : 0000:00:11.6
1c00-1c1f : 0000:00:10.0
  1c00-1c1f : uhci_hcd
1c20-1c3f : 0000:00:10.1
  1c20-1c3f : uhci_hcd
1c40-1c5f : 0000:00:10.2
  1c40-1c5f : uhci_hcd
1c60-1c6f : 0000:00:11.1
  1c60-1c67 : ide0
  1c68-1c6f : ide1
2000-2fff : PCI Bus #01
  2000-20ff : 0000:01:00.0
    2000-20ff : radeonfb
4000-407f : motherboard
  4000-4003 : PM1a_EVT_BLK
  4008-400b : PM_TMR
  4010-4015 : ACPI CPU throttle
  4020-4023 : GPE0_BLK
8100-811f : motherboard
  8100-8107 : viapro-smbus
e680-e6ed : motherboard
e6f2-e6f7 : motherboard
fe00-fe00 : motherboard
fe10-fe11 : motherboard
  fe10-fe11 : PM1a_CNT_BLK

--------------000200040306070606030108
Content-Type: text/plain; name="lspci.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="lspci.txt"

0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8385 [K8T800 AGP] Host Bridge (rev 01)
        Subsystem: VIA Technologies, Inc. VT8385 [K8T800 AGP] Host Bridge
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- Reset- FastB2B-
        Capabilities: [80] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

0000:00:05.0 Network controller: Intel Corporation PRO/Wireless 2200BG (rev 05)
        Subsystem: Intel Corporation: Unknown device 2701
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- Reset+ 16bInt- PostWrite+
        16-bit legacy interface ports at 0001

0000:00:0e.0 Unknown mass storage controller: Winbond Electronics Corp: Unknown device 8481 (rev 01)
        Subsystem: Winbond Electronics Corp: Unknown device 1050
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR-