From mboxrd@z Thu Jan 1 00:00:00 1970
From: Raman Gupta
Subject: Marvell exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
 (was Re: [PATCH 0/3] AHCI updates: Marvell AHCI PATA works; pata_marvell fate?)
Date: Sat, 02 Jan 2010 03:44:50 -0500
Message-ID: <4B3F0782.4040207@fastmail.fm>
References: <20090417023949.GA14469@havoc.gtf.org> <4B37D713.4070407@gmail.com>
 <4B37FB36.1040603@fastmail.fm> <4B3B8B0B.2080805@seoss.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7BIT
Return-path:
Received: from out1.smtp.messagingengine.com ([66.111.4.25]:34653 "EHLO
 out1.smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1752125Ab0ABIow (ORCPT ); Sat, 2 Jan 2010 03:44:52 -0500
In-Reply-To: <4B3B8B0B.2080805@seoss.co.uk>
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: Tim Small
Cc: Robert Hancock , linux-ide@vger.kernel.org

On 12/30/2009 12:16 PM, Tim Small wrote:
> Raman Gupta wrote:
>> However, note that I can make the "exception Emask 0x0 SAct 0x0 SErr
>> 0x0 action 0x6 frozen" happen even with the RAID array stopped and no
>> filesystems mounted. All I have to do is run the smartctl -a /dev/sdd
>> command (sdd is attached to the Marvell controller) repeatedly until
>> this exception occurs:
>>
>> Dec 27 18:59:30 x kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr
>> 0x0 action 0x6 frozen
>> Dec 27 18:59:30 x kernel: ata6.00: cmd
>> ec/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in
>> Dec 27 18:59:30 x kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/40
>> Emask 0x4 (timeout)
>> Dec 27 18:59:30 x kernel: ata6.00: status: { DRDY }
>> Dec 27 18:59:30 x kernel: ata6: hard resetting link
>> Dec 27 18:59:30 x kernel: ata6: SATA link up 3.0 Gbps (SStatus 123
>> SControl 300)
>> Dec 27 18:59:30 x kernel: ata6.00: configured for UDMA/133
>> Dec 27 18:59:30 x kernel: ata6: EH complete
>>
>> Usually 10-15 executions is sufficient to replicate the issue.
>
> Hmm. I wonder what running this script from this bug:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=14831
>
> against drives attached to other controllers would do? It doesn't do
> anything particularly special - just runs smartctl in a loop while also
> writing to the same drive (via fs using dd).

Against a Seagate ST3500418AS on the Marvell controller, the script
produced the first "smartctl failed" error in 55 seconds. Within about
8 minutes, everything went to pot and all drives on that controller
were completely inaccessible (all filesystem writes failed and the
kernel could not IDENTIFY the drives). As far as I can tell with my
multimeter, voltages were stable.

> Out of interest have you tried drives from other manufacturers?

Unfortunately, at the moment I don't have any non-Seagate drives
available.

> Would also be interested to see what happens if you run the script
> against the same drive, but attached to the ICH7?

The problem occurs against any of the three Seagate ST3500418AS drives
I have attached to the Marvell. Against the same model of drive
attached to my ICH7 controller, I canceled the script after it ran for
1.5 hours without any problems.

So the problem appears to be exclusive to the Marvell -- either the
hardware or the driver.

Furthermore, over the last few days, I've had smartd and hddtemp turned
off for the Marvell drives, and they have been stable and error-free.

Cheers,
Raman
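
For reference, a minimal sketch of the "run smartctl -a repeatedly until it
fails" reproduction quoted above. This is not the script from the bugzilla
entry, and it does not do the concurrent dd writes; it only loops a command
and reports how long the first failure took. The function name
run_until_failure and the 0x07 mask are assumptions made for illustration:
the mask reflects a reading of the smartctl man page, where exit-status
bits 0-2 indicate that the command itself failed and the higher bits report
disk health.

```python
import subprocess
import sys
import time

# Assumption (from the smartctl man page): bits 0-2 of the exit status mean
# the command itself failed (bad command line, device open failure, failed
# SMART/ATA command); higher bits only report disk health.
COMMAND_FAILURE_MASK = 0x07


def run_until_failure(cmd, max_runs=100, delay=0.0,
                      failure_mask=COMMAND_FAILURE_MASK):
    """Run cmd repeatedly.

    Returns (run_number, elapsed_seconds) for the first failing run,
    or None if all max_runs runs succeeded.
    """
    start = time.monotonic()
    for run in range(1, max_runs + 1):
        result = subprocess.run(cmd,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        # A negative return code means the child was killed by a signal.
        if result.returncode < 0 or result.returncode & failure_mask:
            return run, time.monotonic() - start
        time.sleep(delay)
    return None
```

Usage would be something like run_until_failure(["smartctl", "-a",
"/dev/sdd"], max_runs=50), run as root, with dmesg checked afterwards for
the "exception Emask" lines.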