From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751539AbXAUUNb (ORCPT ); Sun, 21 Jan 2007 15:13:31 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751577AbXAUUNa (ORCPT ); Sun, 21 Jan 2007 15:13:30 -0500 Received: from fmmailgate03.web.de ([217.72.192.234]:57817 "EHLO fmmailgate03.web.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751539AbXAUUNa convert rfc822-to-8bit (ORCPT ); Sun, 21 Jan 2007 15:13:30 -0500 From: Chr To: =?iso-8859-1?q?Bj=F6rn_Steinbrink?= Subject: Re: SATA exceptions with 2.6.20-rc5 Date: Sun, 21 Jan 2007 21:13:21 +0100 User-Agent: KMail/1.9.5 Cc: Robert Hancock , Jeff Garzik , Alistair John Strachan , linux-kernel@vger.kernel.org, htejun@gmail.com, jens.axboe@oracle.com, lwalton@real.com, chunkeey@web.de References: <200701211834.41306.chunkeey@web.de> <20070121180113.GC2441@atjola.homenet> In-Reply-To: <20070121180113.GC2441@atjola.homenet> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8BIT Content-Disposition: inline Message-Id: <200701212113.22965.chunkeey@web.de> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Sunday, 21. January 2007 19:01, Björn Steinbrink wrote: > On 2007.01.21 18:34:40 +0100, Chr wrote: > > I run those two in parallel: > while /bin/true; do ls -lR / > /dev/null 2>&1; done > while /bin/true; do echo 255 > /proc/sys/vm/drop_caches; sleep 1; done > > Not sure if running them in parallel is necessary, but I don't want to > change the test setup ;) Takes between 1 and 40 minutes to trigger it. > Most of the time it's around 15 minutes now, doing more random stuff in > addition to that seems to trigger it even easier (like reading mail, > rebuilding the kernel etc.). > > I'm down to 2 commits after 2.6.19 now, only bad kernels, so I tend to > say that 2.6.19 with 2.6.20-rc5's sata_nv.c will also fail for me, but I > thought I might finish bisection just to be sure. > > > But, this time it looks slightly different: > > ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen > > ata3.00: tag 0 cmd 0xec Emask 0x4 stat 0x40 err 0x0 (timeout) > > > > [Rest of the error message + SMART error snipped] > > I get the same exception every time, doesn't change for me. And neither > do I get any SMART errors or something. > > Thanks, > Björn Ok, you won't believe this... I opened my case and rewired my drives... And guess what, my second (aka the "good") HDD is now failing! I guess, my mainboard has a (but maybe two, or three :( ) "bad" sata-port(s)! But, one small question remains: when I opened my case, I saw that my drivers are pluged in SATA jack 1 and 2... The BIOS also says they're on 1 and 2. Now, Linux says they're on port 3 & 4! it's always ata3.00! "ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata3.00: tag 0 cmd 0xea Emask 0x4 stat 0x40 err 0x0 (timeout) ata3: soft resetting port ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata3.00: configured for UDMA/133 ata3: EH complete SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB) sda: Write Protect is off sda: Mode Sense: 00 3a 00 00 SCSI device sda: drive cache: write back" Thanks, Chr.