From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Rabbitson
Subject: Help to decipher kernel io error log
Date: Thu, 28 Aug 2008 12:03:07 +0200
Message-ID: <48B677DB.4010306@rabbit.us>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Return-path:
Sender: linux-raid-owner@vger.kernel.org
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Greetings,

This is not strictly a RAID question, but this is the best list I know
of for this type of question.

Two days ago my server ground to a halt for no apparent reason. There
were tons of processes in D state, with no sign of any significant work
being done. I attributed it to resource starvation (the server is
pretty loaded), rebooted and went on with my life. Yesterday I received
the log messages included at the bottom of this email.

Since I am running a --level=10 --raid-devices=4 --layout=f3 array, I am
not that worried about losing data, and decided to investigate. I
removed (mdadm -r) the devices in question from the arrays, power cycled
the server, and executed a full badblocks -svw /dev/sda run. It passed
with flying colors.

So here is my question - what does the log below signify (there are no
omissions, this is all I got)? Is my controller dying? Or is there
indeed a well-masked hard drive failure? Should I change the drive, the
controller, or both?

Thank you for your thoughts!

Peter

====================
=== Hardware setup

Intel SE7210 TP1-E board
(http://www.intel.com/support/motherboards/server/se7210tp1-e/index.htm)

4 identical 250GB Maxtor 7Y250M0 hard drives - two of them attached to
the on-board SATA controller:

00:1f.2 IDE interface: Intel Corporation 6300ESB SATA Storage Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO])
        Subsystem: Intel Corporation Device 342f
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx-
        Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR-