From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sander
Subject: Re: [PATCH] 2.6.xx: sata_mv: another critical fix
Date: Wed, 22 Mar 2006 10:00:06 +0100
Message-ID: <20060322090006.GA8462@favonius>
References: <20060321121354.GB24977@favonius> <442004E4.7010002@rtr.ca>
 <20060321153708.GA11703@favonius> <20060321191547.GC20426@favonius>
 <20060321204435.GE25066@favonius> <44206B81.1030309@garzik.org>
Reply-To: sander@humilis.net
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from ookhoi.xs4all.nl ([213.84.114.66]:22700 "EHLO
 favonius.humilis.net") by vger.kernel.org with ESMTP
 id S1751128AbWCVJAJ (ORCPT ); Wed, 22 Mar 2006 04:00:09 -0500
Content-Disposition: inline
In-Reply-To:
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: Linus Torvalds
Cc: Jeff Garzik , Sander , Mark Lord , Mark Lord , Andrew Morton ,
 "linux-ide@vger.kernel.org" , Linux Kernel

Linus Torvalds wrote (ao):
> On Tue, 21 Mar 2006, Jeff Garzik wrote:
> > In any case, one could be lazy, and simply bisect the main tree
> > (and/or simply verify that the problem is gone in
> > 2.6.16-git).
>
> Yes, just testing the current git tree (and if you're not a git user,
> just waiting for the next nightly snapshot) sounds like the
> appropriate thing to do.

The 2.6.16-git3 snapshot is stable for me like -rc6-mm1 and -rc6-mm2
are :-)

To recap:

Running eight Maxtor disks connected to a MV88SX6081 8-port SATA II
PCI-X Controller (rev 09)

mdadm -C -l5 -n8 /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 \
  /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
mke2fs -j -m1 /dev/md0
mount -o data=writeback,nobh /dev/md0 /mnt

for i in `seq 4`
do
  dd if=/dev/zero of=bigfile.$i bs=1024k count=10000
  dd if=bigfile.$i of=/dev/null bs=1024k count=10000
done
time md5sum bigfile.*
time rm bigfile.*

or

for i in `seq 4`
do
  ( dd if=/dev/zero of=bigfile.$i bs=1024k count=10000 ; \
    time md5sum bigfile.$i ) &
done

Kernel 2.6.16-rc6 and 2.6.16 always crash during the md5sum (and leave
no output).

2.6.16-rc6-mm1, 2.6.16-rc6-mm2 and 2.6.16-git3 are stable without a
crash or data corruption.

I'm aware that the test is very simple and bugs might still hide. I'll
go and find some more serious stress tests.

Should I do more testing/bisecting/etc?

Btw, I do still get these (any kernel), but with no visible effect:

[ 2306.952183] ata6: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
[ 2306.952246] ata6: status=0xd0 { Busy }
[ 2891.892225] ata5: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
[ 2891.892277] ata5: status=0xd0 { Busy }
[ 4550.013582] ata6: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
[ 4550.013637] ata6: status=0xd0 { Busy }
[ 4864.850340] ata9: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
[ 4864.850393] ata9: status=0xd0 { Busy }
[ 4968.681651] ata9: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
[ 4968.681711] ata9: status=0xd0 { Busy }

Thanks!

	Sander

-- 
Humilis IT Services and Solutions
http://www.humilis.net