From mboxrd@z Thu Jan 1 00:00:00 1970
From: =?ISO-8859-1?Q?Markus_M=FCller?=
Subject: Re: sata_sil: write corruption on parallel access of two or more drives on same controller
Date: Thu, 20 Apr 2006 09:07:13 +0200
Message-ID: <44473321.5010705@priv.de>
References: <4446BE3C.7050902@priv.de> <20060420022527.GA3697@htj.dyndns.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail.priv.de ([80.237.225.190]:31138 "EHLO mail.priv.de") by vger.kernel.org with ESMTP id S1750715AbWDTHHQ (ORCPT ); Thu, 20 Apr 2006 03:07:16 -0400
In-Reply-To: <20060420022527.GA3697@htj.dyndns.org>
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: Tejun Heo
Cc: linux-ide@vger.kernel.org

Hi Tejun Heo,

> On Thu, Apr 20, 2006 at 12:48:28AM +0200, Markus Müller wrote:
>
>> Hi sil_sata.c-Developers,
>>
>> I've a problem accessing discs on my SIL 3114 controller: If I write
>> to it and if during this any other access (= read or write) to a disc
>> on same controller occures, there are write errors.
>>
>> The kernel doesn't realise this at all, there is no message about
>> that in dmesg or syslog.
>>
>>
> [--snip--]
>
>> This problem doesn't occure with this sil controllers and sata
>> hdds on a Neo2 Board with AMD64 from MSI so...
>>
>> -> Maybe the SIL-Driver isn't useable with the NForce2 Chipset?!
>>
>
> This sounds like something is going wrong on the host bus.
>
>
>> Please inculde me in answers as CC, cause I am currently
>> not on the kernel mailing list.
>>
>
> I used to do the same but you don't have to ask for cc'ing. It's the
> way things are done here. People are not supposed to trim cc-list
> unless there are specific reasons.
>
> Can you try the following patch? Be careful, I've only compile-tested
> it.
>
> [--snip--]

The problem still occurs in the same way with the following patch
installed. There are still no messages in dmesg. What else can I do?
Thanks for any help! I have no problem installing further test patches.
All my data on the RAID is backed up, so it doesn't matter what happens
to this system; it is unusable anyway as long as this problem persists.

stacker:/usr/src# diff -u10 linux-2.6.16.9/drivers/scsi/libata-core.c linux-2.6.16.9.new/drivers/scsi/libata-core.c
--- linux-2.6.16.9/drivers/scsi/libata-core.c	2006-04-19 08:10:14.000000000 +0200
+++ linux-2.6.16.9.new/drivers/scsi/libata-core.c	2002-01-22 08:47:57.000000000 +0100
@@ -4051,20 +4051,27 @@
 		host_stat = ap->ops->bmdma_status(ap);
 		VPRINTK("ata%u: host_stat 0x%X\n", ap->id, host_stat);
 
 		/* if it's not our irq... */
 		if (!(host_stat & ATA_DMA_INTR))
 			goto idle_irq;
 
 		/* before we do anything else, clear DMA-Start bit */
 		ap->ops->bmdma_stop(qc);
 
+		/* check host bus error */
+		if (host_stat & ATA_DMA_ERR) {
+			printk(KERN_ERR "ata%u: BMDMA host bus error\n",
+			       ap->id);
+			qc->err_mask |= AC_ERR_HOST_BUS;
+		}
+
 		/* fall through */
 
 	case ATA_PROT_ATAPI_NODATA:
 	case ATA_PROT_NODATA:
 		/* check altstatus */
 		status = ata_altstatus(ap);
 		if (status & ATA_BUSY)
 			goto idle_irq;
 
 		/* check main status, clearing INTRQ */
stacker:/usr/src#

My test was:

stacker:/var/log# badblocks /dev/sda &
[1] 1249
stacker:/var/log# badblocks -n /dev/sdb
123
382
576
616
1217
1255
2645
3664

Interrupt caught, cleaning up
stacker:/var/log# dmesg|tail
ReiserFS: loop0: replayed 15 transactions in 0 seconds
ReiserFS: loop0: Using r5 hash to sort names
eth0: Promiscuous mode enabled.
device eth0 entered promiscuous mode
eth0: Promiscuous mode enabled.
eth0: Promiscuous mode enabled.
eth0: Promiscuous mode enabled.
br0: port 1(eth0) entering learning state
br0: topology change detected, propagating
br0: port 1(eth0) entering forwarding state
stacker:/var/log#

gReeTings,
Markus Mueller