From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Darrick J. Wong"
Subject: Re: 'Device not ready' issue on mpt2sas since 3.1.10
Date: Mon, 9 Jul 2012 16:45:13 -0400
Message-ID: <20120709204513.GD25664@kernel.stglabs.ibm.com>
References: <4FE454CA.6080007@matthiasprager.de> <4FFAED4F.3080100@matthiasprager.de> <4FFB32E5.1050109@farcaster.org>
Reply-To: djwong@us.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <4FFB32E5.1050109@farcaster.org>
Sender: linux-raid-owner@vger.kernel.org
To: Robert Trace
Cc: Matthias Prager, linux-scsi@vger.kernel.org, linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On Mon, Jul 09, 2012 at 03:37:09PM -0400, Robert Trace wrote:
> > I did some further research regarding my problem.
> > It appears to me the fault does not lie with the mpt2sas driver (not
> > that I can definitely exclude it), but with the md implementation.
>
> I'm actually discovering some of the same issues (LSI 9211-8i w/ SATA
> disks), but I've come to a slightly different conclusion.
>
> I noticed that when my SATA disks are on a SATA controller and they spin
> down (or are spun down via hdparm -y), then they respond to TUR (TEST
> UNIT READY) commands with an OK. Any I/O sent to these disks simply
> waits while the disks spin up and then completes as usual.
>
> However, my SATA disks on the SAS controller respond to TUR with the
> sense error "Not Ready/Initializing command required". Any I/O sent to
> these disks immediately fails. You saw this in your logging:
>
> > [ 604.838640] sd 2:0:0:0: [sda] Device not ready
> > [ 604.838645] sd 2:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > [ 604.838655] sd 2:0:0:0: [sda] Sense Key : Not Ready [current]
> > [ 604.838663] sd 2:0:0:0: [sda] Add. Sense: Logical unit not ready, initializing command required
> > [ 604.838668] sd 2:0:0:0: [sda] CDB: Read(10): 28 00 00 00 08 00 00 00 20 00
> > [ 604.838680] end_request: I/O error, dev sda, sector 2048
> > [ 604.838688] Buffer I/O error on device md127, logical block 0
> > [ 604.838695] Buffer I/O error on device md127, logical block 1
> > [ 604.838699] Buffer I/O error on device md127, logical block 2
> > [ 604.838702] Buffer I/O error on device md127, logical block 3
>
> Sending an explicit START UNIT command to these sleeping disks will wake
> them up and then they behave normally. (BTW, you can issue TURs and
> START UNITs via the sg_turs and sg_start commands).
>
> I've reproduced this behavior on the raw disks themselves, no MD layer
> involved (although the freak-out by my MD layer is what alerted me to
> this issue too... Having your entire array punted the first time you
> access it is a little scary :-). I'm also on raw hardware and I've seen
> this behavior on kernels 3.0.33 through 3.4.4.
>
> So, SATA disks respond differently depending on the controller they're
> on. I don't know if this is a SCSI thing, a SAS thing or a
> firmware/driver thing for the 9211.

I suspect that /sys/devices//manage_start_stop = 0 for the SATA devices
hanging off the SAS controller. Setting that sysfs attribute to 1 is
supposed to enable the SCSI layer to send TUR when it sees "LU not ready",
as well as spin down the drives at suspend/poweroff time.

--D

> Now, whether or not the MD layer should be assembling arrays from
> "failed" disks is, I think, a separate issue.
>
> -- Rob
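
For anyone who wants to check the manage_start_stop suggestion above on their own box, here is a rough Python sketch that lists the attribute for each disk and flips any 0 to 1. It is only an illustration under stated assumptions: the /sys/class/scsi_disk location and the helper names are mine, not something confirmed in this thread, and writing the attribute on a real system needs root.

```python
from pathlib import Path

# Assumption: the sd driver exposes the attribute under this directory, with
# one subdirectory per disk named host:channel:target:lun. Adjust as needed.
SCSI_DISK_ROOT = Path("/sys/class/scsi_disk")


def read_manage_start_stop(root=SCSI_DISK_ROOT):
    """Return {disk address: attribute text} for every disk found under root."""
    root = Path(root)
    result = {}
    if not root.is_dir():
        return result
    for disk in sorted(root.iterdir()):
        attr = disk / "manage_start_stop"
        if attr.is_file():
            result[disk.name] = attr.read_text().strip()
    return result


def enable_manage_start_stop(root=SCSI_DISK_ROOT):
    """Write 1 to each attribute that currently reads 0; return the disks changed."""
    changed = []
    for addr, value in read_manage_start_stop(root).items():
        if value == "0":
            (Path(root) / addr / "manage_start_stop").write_text("1\n")
            changed.append(addr)
    return changed
```

Calling read_manage_start_stop() first shows the current values without changing anything; enable_manage_start_stop() only touches disks that report 0. Both take the directory as a parameter so the logic can be tried against a scratch directory before pointing it at sysfs.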