From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932248AbWA0FH5 (ORCPT );
	Fri, 27 Jan 2006 00:07:57 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S932407AbWA0FH5 (ORCPT );
	Fri, 27 Jan 2006 00:07:57 -0500
Received: from main.gmane.org ([80.91.229.2]:36808 "EHLO ciao.gmane.org")
	by vger.kernel.org with ESMTP id S932248AbWA0FH4 (ORCPT );
	Fri, 27 Jan 2006 00:07:56 -0500
X-Injected-Via-Gmane: http://gmane.org/
To: linux-kernel@vger.kernel.org
From: Kalin KOZHUHAROV
Subject: libata errors in 2.6.15.1 ICH6 AHCI (SATA drive WD740GD)
Date: Fri, 27 Jan 2006 14:07:41 +0900
Message-ID:
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@sea.gmane.org
X-Gmane-NNTP-Posting-Host: s175249.ppp.asahi-net.or.jp
User-Agent: Mail/News 1.5 (X11/20060115)
X-Enigmail-Version: 0.94.0.0
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi there.

I am reiterating this while still trying to diagnose the problem.

It is a DIY box with an Asus P5GDC-V Deluxe motherboard with Marvell 88E8053
Gb Ethernet (for info see [1]) and a WD740GD (10k RPM) hard disk.

The NIC was not found by the in-kernel driver, so I used a patch to the
sk98lin binary driver and later tried sky2; both worked only intermittently.
Now I have an r8169 NIC, have disabled the onboard one in the BIOS, and have
installed a new vanilla linux-2.6.15.1.

After some time (30 minutes to 3 days) the machine dies: first the disk
fails, then some partitions are remounted read-only by the kernel, and
finally everything is dead (no response to ping or keyboard).

What I get in the dmesg is this:

...
[ 23.464209] hub 5-0:1.0: USB hub found
[ 23.464221] hub 5-0:1.0: 8 ports detected
[ 25.819331] r8169: eth0: link up
[13091.397797] ata1: handling error/timeout
[13091.397805] ata1: port reset, p_is 0 is 0 pis 0 cmd 4017 tf d0 ss 113 se 0
[13091.397823] ata1: status=0x50 { DriveReady SeekComplete }
[13091.397828] sda: Current: sense key=0x0
[13091.397831] ASC=0x0 ASCQ=0x0
[13091.481534] ata1: port reset, p_is 40000001 is 1 pis 0 cmd 4017 tf 471 ss 113 se 0
[13091.481542] ata1: translated ATA stat/err 0x71/04 to SCSI SK/ASC/ASCQ 0xb/00/00
[13091.481544] ata1: status=0x71 { DriveReady DeviceFault SeekComplete Error }
[13091.481549] ata1: error=0x04 { DriveStatusError }
...

The full dmesg can be found under [1] as 2.6.15.1-K01_P4_server.3.dmesg

I checked the drive (on the same machine) both with smartctl and with the
boot floppy I downloaded from the WD support site (Data Lifeguard Tools).
Neither reported anything bad (yes, I looked at the status after the test).
The filesystem (reiserfs) is checked on every boot, but so far no corruption
has occurred as far as I can see.

As always, the usual questions are:
What is the cause of this? A bug?
What can I do to better diagnose it?
Is any additional info helpful (see [1])?

Dmesg and other hardware info can be found here:
[1]: http://linux.tar.bz/reports/oopses/char/

Kalin.

-- 
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|