sata_promise SATA300TX4 "intermittent problems"

linux-ide.vger.kernel.org archive mirror

* sata_promise SATA300TX4 "intermittent problems"
@ 2007-03-07 14:32 Peter Favrholdt
  2007-03-07 20:12 ` Mikael Pettersson
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Favrholdt @ 2007-03-07 14:32 UTC (permalink / raw)
  To: linux-ide

Hi,

I've seen "intermittent problems" with Promise SATA300 TX4 controllers
and Linux kernel 2.6.19 (through 2.6.20-rc2 with some additional
patches).

Sometimes the TX4 will lose a port - a reboot brings the drive back up 
again. I'm quite sure the hard drives are not at fault.

I have experienced this using "plain vanilla" Linux 2.6.19.2 and 
2.6.20.1. Today I have tested using Linux 2.6.21-rc2 with Mikael 
Pettersson's patches (more on that further down).

Yesterday (using 2.6.20.1) I could fail two out of four drives by doing:
dd if=/dev/sda of=/dev/null bs=1M &
dd if=/dev/sdb of=/dev/null bs=1M &
dd if=/dev/sdc of=/dev/null bs=1M &
dd if=/dev/sdd of=/dev/null bs=1M &

sdd would fail first, then after a while sdc. Here is the dmesg output 
from when sdd failed:

[14895.092650] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 action 0x2 frozen
[14895.092664] ata4.00: cmd 25/00:00:00:3e:1a/00:02:05:00:00/e0 tag 0 cdb 0x0 data 262144 in
[14895.092666]          res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[14895.404597] ata4: soft resetting port
[14895.560511] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[14925.555206] ata4.00: qc timeout (cmd 0xec)
[14925.555437] ata4.00: failed to IDENTIFY (I/O error, err_mask=0x104)
[14925.555441] ata4.00: revalidation failed (errno=-5)
[14925.555452] ata4: failed to recover some devices, retrying in 5 secs
[14930.556912] ata4: hard resetting port
[14930.876763] ata4: COMRESET failed (device not ready)
[14930.876772] ata4: hardreset failed, retrying in 5 secs
[14935.878525] ata4: hard resetting port
[14936.198407] ata4: COMRESET failed (device not ready)
[14936.198416] ata4: hardreset failed, retrying in 5 secs
[14941.200169] ata4: hard resetting port
[14941.520051] ata4: COMRESET failed (device not ready)
[14941.520060] ata4: reset failed, giving up
[14941.520063] ata4.00: disabled
[14941.520075] ata4: EH complete
[14941.520567] sd 4:0:0:0: SCSI error: return code = 0x00040000
[14941.520572] end_request: I/O error, dev sdd, sector 85605888
[14941.520577] Buffer I/O error on device sdd, logical block 10700736
[14941.520582] Buffer I/O error on device sdd, logical block 10700737
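
The "cmd" line in the log above can be cross-checked against the failing sector reported further down. The following is a minimal sketch, assuming the taskfile is printed in the order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device; the function name decode_taskfile is made up for illustration and is not part of any kernel or tool.

```python
import re

# Assumed field order of the taskfile in the error message:
# cmd/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
TF_PATTERN = re.compile(
    r"cmd\s+([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})"
)

def decode_taskfile(line, sector_size=512):
    """Extract command, 48-bit LBA and transfer size from a 'cmd' log line.

    Hypothetical helper. It does not handle the special case where a
    sector count of 0 means the maximum transfer length.
    """
    m = TF_PATTERN.search(line)
    if m is None:
        raise ValueError("no taskfile found in line")
    low = [int(x, 16) for x in m.group(2).split(":")]   # feature, nsect, lbal, lbam, lbah
    high = [int(x, 16) for x in m.group(3).split(":")]  # the "hob" (high order byte) copies
    # Low registers hold LBA bits 0..23, hob registers hold bits 24..47.
    lba = (high[4] << 40) | (high[3] << 32) | (high[2] << 24) \
        | (low[4] << 16) | (low[3] << 8) | low[2]
    sectors = (high[1] << 8) | low[1]
    return {
        "command": int(m.group(1), 16),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * sector_size,
    }

tf = decode_taskfile("ata4.00: cmd 25/00:00:00:3e:1a/00:02:05:00:00/e0 tag 0")
print(hex(tf["command"]), tf["lba"], tf["sectors"], tf["bytes"])
```

For the line above this gives command 0x25 (READ DMA EXT, as in the SMART log), LBA 85605888 and 512 sectors, i.e. 262144 bytes, which agrees with "data 262144 in" and with "sector 85605888" in the end_request message. Dividing that sector by 8 gives 10700736, the reported logical block, which is consistent with 4096-byte buffer blocks.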

After a reboot the drives operate again, but with a new entry in the 
SMART error log, e.g.:

Error 6 occurred at disk power-on lifetime: 353 hours (14 days + 17 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   84 51 ef 11 3e 1a e0  Error: ICRC, ABRT 239 sectors at LBA = 0x001a3e11 = 1719825

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   25 00 00 00 3e 1a e0 00      04:08:17.774  READ DMA EXT
   25 00 00 00 3c 1a e0 00      04:08:17.764  READ DMA EXT
   25 00 00 00 3a 1a e0 00      04:08:17.753  READ DMA EXT
   25 00 00 00 38 1a e0 00      04:08:17.743  READ DMA EXT
   25 00 00 00 36 1a e0 00      04:08:17.734  READ DMA EXT


Today I have tested using Linux 2.6.21-rc2 with Mikael Pettersson's
patches. In order to make it build I had to disable local-apic. So far
it seems to work better, but doing

dd if=/dev/sda of=/dev/null bs=1M &
dd if=/dev/sdb of=/dev/null bs=1M &
dd if=/dev/sdc of=/dev/null bs=1M &
dd if=/dev/sdd of=/dev/null bs=1M &

and then a couple of times:

for each in /dev/sd[abcd]; do smartctl -d ata -a $each | awk '/194/{print $10}'; done

will trigger the error again:

[52849.930755] pdc_error_intr: port_status 0x00001000 serror 0x00000000
[52849.930880] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[52849.930883] ata2.00: (port_status 0x00001000)
[52849.930892] ata2.00: cmd 25/00:00:00:f7:1e/00:02:1b:00:00/e0 tag 0 cdb 0x0 data 262144 in
[52849.930894]          res 50/00:00:ff:f8:1e/00:00:ff:59:c8/e0 Emask 0x4 (timeout)
[52850.241962] ata2: soft resetting port
[52850.397984] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[52850.424344] pdc_error_intr: port_status 0x00001000 serror 0x00000000
[52850.424639] ata2.00: failed to set xfermode (err_mask=0x104)
[52850.424643] ata2: failed to recover some devices, retrying in 5 secs
[52855.423576] ata2: hard resetting port
[52855.899453] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[52855.933438] ata2.00: configured for UDMA/133
[52855.933456] ata2: EH complete
[52855.973979] SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
[52856.022739] sdb: Write Protect is off
[52856.022747] sdb: Mode Sense: 00 3a 00 00
[52856.085241] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[52856.089287] SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
[52856.092552] sdb: Write Protect is off
[52856.092560] sdb: Mode Sense: 00 3a 00 00
[52856.099067] SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

This time, however, the hard reset works: the port comes back up and
reading continues. This is of course much better, because a RAID
device would not fail. But I still think the reset should not be
necessary?

I wonder if the earlier problems I've seen have been due to my own poking 
around with smartctl during heavy load. I'll try to test this some more.

I would be very happy to help debug this issue. Any suggestions on what 
I should try next?

Some background info:

I have three systems with SATA300TX4s:

System 1 (can be used for testing):
Linux 2.6.21-rc2+Mikael_Petterson
AMD Athlon(tm) XP 2500+ on a Nvidia nForce2 motherboard.
4 harddrives all connected to the TX4 in a normal PCI slot 133MHz
Seagate ST3500630NS (Barracuda 500GB ES) Firmware 3.AEE

System 2 (production system)
Dell PowerEdge 2800
Linux 2.6.19.5
Identical harddrives all connected to TX4 in a PCI-X slot 266MHz.

System 3 (production backup):
Linux 2.6.15
Identical to System 2 except only two disks. These are Barracuda 500GB
(non ES version).

Best regards,

Peter


* Re: sata_promise SATA300TX4 "intermittent problems"
  2007-03-07 14:32 sata_promise SATA300TX4 "intermittent problems" Peter Favrholdt
@ 2007-03-07 20:12 ` Mikael Pettersson
  2007-03-08 16:26   ` Peter Favrholdt
  0 siblings, 1 reply; 7+ messages in thread
From: Mikael Pettersson @ 2007-03-07 20:12 UTC (permalink / raw)
  To: Peter Favrholdt; +Cc: linux-ide

Peter Favrholdt writes:
 > Hi,
 > 
 > I've seen "intermittent problems" with Promise SATA300 TX4 controllers
 > and Linux kernel 2.6.19 (through 2.6.20-rc2 with some additional
 > patches).
 > 
 > Sometimes the TX4 will lose a port - a reboot brings the drive back up 
 > again. I'm quite sure the hard drives are not at fault.
 > 
 > I have experienced this using "plain vanilla" Linux 2.6.19.2 and 
 > 2.6.20.1. Today I have tested using Linux 2.6.21-rc2 with Mikael 
 > Pettersson's patches (more on that further down).
 > 
 > Yesterday (using 2.6.20.1) I could fail two out of four drives by doing:
 > dd if=/dev/sda of=/dev/null bs=1M &
 > dd if=/dev/sdb of=/dev/null bs=1M &
 > dd if=/dev/sdc of=/dev/null bs=1M &
 > dd if=/dev/sdd of=/dev/null bs=1M &
 > 
 > sdd would fail first then after a while sdc, here is the dmesg output 
 > when sdd failed:
 > 
 > [14895.092650] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 
 > action 0x2 frozen

SErr 0x01380000 would indicate:
transport state transition error (bit 24)
CRC error (bit 21)
disparity error (bit 20) [whatever that is]
10b_to_8b decoding error (bit 19)

I.e., serious transmission issues.
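
The bit breakdown above can be reproduced mechanically. This is a small sketch rather than anything taken from the driver: it covers only the diagnostic half of SError (bits 16 to 26), with names as I understand them from the Serial ATA specification, and the function name decode_serror is made up for illustration.

```python
# Diagnostic bits of the SATA SError register (bits 16..26).
# The low half of the register (the ERR bits) is deliberately left out
# of this sketch; any other set bit is reported by number.
SERROR_DIAG_BITS = {
    16: "PhyRdy change (N)",
    17: "PHY internal error (I)",
    18: "Comm wake (W)",
    19: "10b to 8b decode error (B)",
    20: "Disparity error (D)",
    21: "CRC error (C)",
    22: "Handshake error (H)",
    23: "Link sequence error (S)",
    24: "Transport state transition error (T)",
    25: "Unknown FIS type (F)",
    26: "Exchanged (X)",
}

def decode_serror(value):
    """Return the names of all bits set in an SError value, lowest bit first."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(SERROR_DIAG_BITS.get(bit, f"bit {bit}"))
    return names

for name in decode_serror(0x01380000):
    print(name)
```

For 0x01380000 this lists bits 19, 20, 21 and 24, matching the four errors named above. An SError of 0, as in the second log, yields an empty list.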

 > [52849.930755] pdc_error_intr: port_status 0x00001000 serror 0x00000000
 > [52849.930880] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 
 > frozen
 > [52849.930883] ata2.00: (port_status 0x00001000)

"host bus timeout error" (bit 12).
I wonder why SError was clear now.

 > I would be very happy to help debug this issue. Any suggestions on what 
 > I should try next?

Well, at the moment I have only one possible cure: to forcibly
limit 3Gbps drives to 1.5Gbps operation, as the patch below does.
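
To make the effect of that SControl write concrete, here is a short worked example. It assumes the standard SATA layout of SStatus and SControl (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and that the "SStatus 123 SControl 300" values in the logs are hexadecimal; the helper names scr_fields and limit_to_gen1 are hypothetical.

```python
def scr_fields(value):
    """Split an SStatus or SControl value into its DET, SPD and IPM nibbles."""
    return {
        "det": value & 0xF,         # device detection / initialization
        "spd": (value >> 4) & 0xF,  # negotiated speed (SStatus) or speed limit (SControl)
        "ipm": (value >> 8) & 0xF,  # interface power management
    }

def limit_to_gen1(scontrol):
    """Apply the same mask as the patch: keep the upper 24 bits, set the low byte to 0x11."""
    return (scontrol & 0xFFFFFF00) | 0x00000011

print(scr_fields(0x123))                        # SStatus from the log
print(hex(limit_to_gen1(0x300)))                # SControl after the patch
print(scr_fields(limit_to_gen1(0x300)))
```

SStatus 0x123 decodes to DET=3 (device present, link established), SPD=2 (second-generation speed, i.e. 3.0 Gbps) and IPM=1 (active). Starting from SControl 0x300, the mask yields 0x311: IPM=3 is left untouched, SPD=1 restricts negotiation to first-generation speed (1.5 Gbps), and DET=1 requests a link re-initialization so that the new limit takes effect.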

On one of my test machines (an old UltraSPARC), a SATA300 TX2plus
with a Seagate 3Gbps drive (don't have the model number handy),
will quickly experience "DMA S/G overrun" errors during an fsck
of a large but clean ext3 partition. With the patch below things
work solidly on that particular machine. OTOH, on another test
machine (a 440BX chipset Intel PIII), the same card/cable/disk
combination works flawlessly at 3Gbps. Mysterious.

/Mikael

--- linux-2.6.21-rc2/drivers/ata/sata_promise.c.~1~	2007-03-06 22:17:21.000000000 +0100
+++ linux-2.6.21-rc2/drivers/ata/sata_promise.c	2007-03-06 23:21:36.000000000 +0100
@@ -378,6 +378,18 @@ static int pdc_port_start(struct ata_por
 		writel(tmp, mmio + 0x014);
 	}
 
+	/* hack SControl to limit speed to 1.5Gbps */
+	if ((hp->flags & PDC_FLAG_GEN_II) && sata_scr_valid(ap)) {
+		void __iomem *mmio = (void __iomem *) ap->ioaddr.scr_addr;
+		unsigned int tmp1, tmp2;
+
+		tmp1 = readl(mmio + 0x008);
+		tmp2 = (tmp1 & 0xffffff00) | 0x00000011;
+		writel(tmp2, mmio + 0x008);
+		readl(mmio + 0x008); /* flush */
+		printk("%s(port %u): adjusted SControl from 0x%08x to 0x%08x\n", __FUNCTION__, ap->port_no, tmp1, tmp2);
+	}
+
 	return 0;
 }
 


* Re: sata_promise SATA300TX4 "intermittent problems"
  2007-03-07 20:12 ` Mikael Pettersson
@ 2007-03-08 16:26   ` Peter Favrholdt
  2007-03-09  6:27     ` Peter Favrholdt
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Favrholdt @ 2007-03-08 16:26 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-ide

Hi Mikael,

Thanks for the reply, I've commented below:

Mikael Pettersson wrote:
> SErr 0x01380000 would indicate:
> transport state transition error (bit 24)
> CRC error (bit 21)
> disparity error (bit 20) [whatever that is]
> 10b_to_8b decoding error (bit 19)
> 
> I.e., serious transmission issues.

:-)

> > [52849.930755] pdc_error_intr: port_status 0x00001000 serror 0x00000000
> > [52849.930880] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 
> > frozen
> > [52849.930883] ata2.00: (port_status 0x00001000)
> 
> "host bus timeout error" (bit 12).
> I wonder why SError was clear now.

I can't say - this whole ata thing is much too complex for me ;-)

> > I would be very happy to help debug this issue. Any suggestions on what 
> > I should try next?
> 
> Well, at the moment I have only one possible cure: to forcibly
> limit 3Gbps drives to 1.5Gbps operation, as the patch below does.

I haven't tried your 1.5Gbps patch (yet). But I have been running more 
tests on my experiment system with the kernels I have handy. My 
procedure is as follows:

1. power cycle
2. boot selected kernel
3. start dd if=/dev/sdx of=/dev/null bs=1M for x=a,b,c,d
4. wait until one fails
5. record dmesg output

So far here are my results:

2.6.18.1 fails (in 25 minutes)
2.6.19   fails (in 4 minutes)
2.6.19.2 fails (in 5 minutes)
2.6.20.1 fails (in 48 minutes)
2.6.21-rc2+p (with additional patches) doesn't fail

This is very consistent. 2.6.21-rc2+p has been tested for more than 10 
hours without a hiccup :-)

In the above tests it is always ata3 or ata4 (sdc or sdd) which fails.

Another strange thing which happens on 2.6.21-rc2+p but not the other 
kernels: using smartctl -a -d ata while dd is running gives errors (I 
also mentioned this in my first mail, but wasn't sure then):

[11046.005178] pdc_error_intr: port_status 0x00001000 serror 0x00000000
[11046.005286] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[11046.005374] ata4.00: (port_status 0x00001000)
[11046.005383] ata4.00: cmd 25/00:00:00:3b:a0/00:01:27:00:00/e0 tag 0 cdb 0x0 data 131072 in
[11046.005385]          res 50/00:00:ff:3b:a0/00:00:00:00:00/e0 Emask 0x4 (timeout)
[11046.313769] ata4: soft resetting port
[11046.469806] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11046.496254] pdc_error_intr: port_status 0x00001000 serror 0x00000000
[11046.496580] ata4.00: failed to set xfermode (err_mask=0x104)
[11046.496585] ata4: failed to recover some devices, retrying in 5 secs
[11051.495393] ata4: hard resetting port
[11051.971276] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11052.005267] ata4.00: configured for UDMA/133
[11052.005285] ata4: EH complete
[11052.042615] SCSI device sdd: 976773168 512-byte hdwr sectors (500108 MB)
[11052.051769] sdd: Write Protect is off
[11052.051778] sdd: Mode Sense: 00 3a 00 00
[11052.059455] SCSI device sdd: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[11052.066354] SCSI device sdd: 976773168 512-byte hdwr sectors (500108 MB)
[11052.070822] sdd: Write Protect is off
[11052.070830] sdd: Mode Sense: 00 3a 00 00
[11052.073297] SCSI device sdd: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Then it recovers and dd continues :-)

Note that using smartctl this way on the other kernels does not show 
this problem!

> On one of my test machines (an old UltraSPARC), a SATA300 TX2plus
> with a Seagate 3Gbps drive (don't have the model number handy),
> will quickly experience "DMA S/G overrun" errors during an fsck
> of a large but clean ext3 partition. With the patch below things
> work solidly on that particular machine. OTOH, on another test
> machine (a 440BX chipset Intel PIII), the same card/cable/disk
> combination works flawlessly at 3Gbps. Mysterious.

My feeling is this is not caused by 1.5Gbps or 3.0Gbps operation.

I was thinking about setting the speed-selection jumpers on the 
hard drives, but so far I'm not touching the system as I don't want 
hardware problems (e.g. a loose cable) disturbing the test results. I'll 
stick to replacing software.

My next test will be a plain 2.6.21-rc2. Then I'll apply the patches one 
by one.

One thought is this could be a bug/race condition which only shows under 
certain lucky circumstances - maybe the robustness of 2.6.21-rc2+p is 
due to local-apic not being enabled or some other subtle kernel build thing?

Any suggestions on what I could do to help track this down are much 
appreciated.

Best regards,

Peter Favrholdt



* Re: sata_promise SATA300TX4 "intermittent problems"
  2007-03-08 16:26   ` Peter Favrholdt
@ 2007-03-09  6:27     ` Peter Favrholdt
  2007-03-09  7:01       ` Tomi Orava
  2007-03-13  7:11       ` Tomi Orava
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Favrholdt @ 2007-03-09  6:27 UTC (permalink / raw)
  To: Mikael Pettersson; +Cc: linux-ide

Hi again,

Peter Favrholdt wrote:
> My feeling is this is not caused by 1.5Gbps or 3.0Gbps operation.
> <...snip>
> My next test will be a plain 2.6.21-rc2. Then I'll apply the patches one 
> by one.

I've tested 2.6.21-rc2 which fails (sdc down after 27 minutes & sdd down 
after 46 minutes).

Then I applied just a single patch to 2.6.21-rc2: Mikael Pettersson's 
patch to force 1.5Gbps operation and tested again - this time no 
problems at all!

(BTW: both kernels are running with IO-APIC disabled).

I've put results+dmesg output here: http://sata300tx4.gratiswiki.dk/

Best regards,

Peter


* Re: sata_promise SATA300TX4 "intermittent problems"
  2007-03-09  6:27     ` Peter Favrholdt
@ 2007-03-09  7:01       ` Tomi Orava
  2007-03-09  7:29         ` Peter Favrholdt
  2007-03-13  7:11       ` Tomi Orava
  1 sibling, 1 reply; 7+ messages in thread
From: Tomi Orava @ 2007-03-09  7:01 UTC (permalink / raw)
  To: linux-ide


Hi,

> Peter Favrholdt wrote:
>> My feeling is this is not caused by 1.5Gbps or 3.0Gbps operation.
>> <...snip>
>> My next test will be a plain 2.6.21-rc2. Then I'll apply the patches one
>> by one.
>
> I've tested 2.6.21-rc2 which fails (sdc down after 27 minutes & sdd down
> after 46 minutes).

I've now been running with 2.6.21-rc2-git1 + Mikael's original "patch
bundle" for 7 days without hangs! The machine has 2 Seagate 7200.7 disks
and 2 Seagate 7200.10 (in 3.0Gbps mode) with a Promise SATA300 TX4. The system
does spit out the following messages whenever there is load (no
user-noticeable hiccups, however):

Mar  4 03:51:13 alderan kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar  4 03:51:13 alderan kernel: ata4.00: cmd 25/00:00:3f:0e:a8/00:04:09:00:00/e0 tag 0 cdb 0x0 data 524288 in
Mar  4 03:51:13 alderan kernel:          res 50/00:00:3e:12:a8/00:00:09:00:00/e0 Emask 0x1 (device error)
Mar  4 03:51:13 alderan kernel: ata4.00: configured for UDMA/133
Mar  4 03:51:13 alderan kernel: ata4: EH complete
Mar  4 03:51:13 alderan kernel: SCSI device sdd: 976773168 512-byte hdwr sectors (500108 MB)
Mar  4 03:51:13 alderan kernel: sdd: Write Protect is off
Mar  4 03:51:13 alderan kernel: SCSI device sdd: write cache: enabled, read cache: enabled, doesn't support DPO or FUA


>
> Then I applied just a single patch to 2.6.21-rc2: Mikael Pettersson's
> patch to force 1.5Gbps operation and tested again - this time no
> problems at all!
>
> (BTW: both kernels are running with IO-APIC disabled).

I have IO-APIC enabled, on an Asus A7V880 (VIA KT880 chipset).
In the past, whenever one of the disks failed, it was _never_
the first disk on the Linux system (although I don't remember if it was
connected to the Promise card's first port, as the ports are still numbered in
some very peculiar way under Linux).

Regards,
Tomi Orava


* Re: sata_promise SATA300TX4 "intermittent problems"
  2007-03-09  7:01       ` Tomi Orava
@ 2007-03-09  7:29         ` Peter Favrholdt
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Favrholdt @ 2007-03-09  7:29 UTC (permalink / raw)
  To: Tomi Orava; +Cc: linux-ide

Hi Tomi,

My experiences are in accordance with yours:

1) it doesn't matter if IO-APIC is enabled or not.

2) using Mikael's "patch bundle" the channel may die but then recovers OK. 
(This can be triggered by using smartctl under load).

3) It is never the first port that dies. It seems to be always number 3 
or 4.

If you dare, then try using smartctl -d ata -a on your disks and observe 
dmesg output :-) Every now and then smartctl will "kill the channel" 
which will then recover ok.

However, smartctl does not show this behaviour using just the 1.5Gbps 
patch. Running at 1.5Gbps seems to fix the problem - no hiccups whatsoever.

Best regards,

Peter

Tomi Orava wrote:
> I've now been running with 2.6.21-rc2-git1 + Mikael's original "patch
> bundle" for 7 days without hangs! The machine has 2 Seagate 7200.7 disks
> and 2 Seagate 7200.10 (in 3.0Gbps mode) with a Promise SATA300 TX4. The system
> does spit out the following messages whenever there is load (no
> user-noticeable hiccups, however):
> 
> Mar  4 03:51:13 alderan kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr
> 0x0 action 0x0
> Mar  4 03:51:13 alderan kernel: ata4.00: cmd
> 25/00:00:3f:0e:a8/00:04:09:00:00/e0 tag 0 cdb 0x0 data 524288 in
> Mar  4 03:51:13 alderan kernel:          res
> 50/00:00:3e:12:a8/00:00:09:00:00/e0 Emask 0x1 (device error)
> Mar  4 03:51:13 alderan kernel: ata4.00: configured for UDMA/133
> Mar  4 03:51:13 alderan kernel: ata4: EH complete
> Mar  4 03:51:13 alderan kernel: SCSI device sdd: 976773168 512-byte hdwr
> sectors (500108 MB)
> Mar  4 03:51:13 alderan kernel: sdd: Write Protect is off
> Mar  4 03:51:13 alderan kernel: SCSI device sdd: write cache: enabled,
> read cache: enabled, doesn't support DPO or FUA
> 
> I have IO-APIC enabled, on an Asus A7V880 (VIA KT880 chipset).
> In the past, whenever one of the disks failed, it was _never_
> the first disk on the Linux system (although I don't remember if it was
> connected to the Promise card's first port, as the ports are still numbered in
> some very peculiar way under Linux).
> 
> Regards,
> Tomi Orava



* Re: sata_promise SATA300TX4 "intermittent problems"
  2007-03-09  6:27     ` Peter Favrholdt
  2007-03-09  7:01       ` Tomi Orava
@ 2007-03-13  7:11       ` Tomi Orava
  1 sibling, 0 replies; 7+ messages in thread
From: Tomi Orava @ 2007-03-13  7:11 UTC (permalink / raw)
  To: Peter Favrholdt; +Cc: Mikael Pettersson, linux-ide


Hello,

> Peter Favrholdt wrote:
>> My feeling is this is not caused by 1.5Gbps or 3.0Gbps operation.
>> <...snip>
>> My next test will be a plain 2.6.21-rc2. Then I'll apply the patches one
>> by one.
>
> I've tested 2.6.21-rc2 which fails (sdc down after 27 minutes & sdd down
> after 46 minutes).
>
> Then I applied just a single patch to 2.6.21-rc2: Mikael Pettersson's
> patch to force 1.5Gbps operation and tested again - this time no
> problems at all!
>
> (BTW: both kernels are running with IO-APIC disabled).
>
> I've put results+dmesg output here: http://sata300tx4.gratiswiki.dk/

I had the opportunity to test Mikael's 1.5Gbps patch yesterday evening and
although the system seems to run OK, I still get the following system
log messages:


Mar 13 06:10:22 alderan kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 13 06:10:22 alderan kernel: ata2.00: cmd c8/00:30:0f:e2:86/00:00:00:00:00/e7 tag 0 cdb 0x0 data 24576 in
Mar 13 06:10:22 alderan kernel:          res 50/00:00:3e:e2:86/00:00:00:00:00/e7 Emask 0x1 (device error)
Mar 13 06:10:22 alderan kernel: ata2.00: configured for UDMA/133
Mar 13 06:10:22 alderan kernel: ata2: EH complete
Mar 13 06:10:22 alderan kernel: SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
Mar 13 06:10:22 alderan kernel: sdb: Write Protect is off
Mar 13 06:10:22 alderan kernel: SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 13 06:11:23 alderan kernel: possible SYN flooding on port 52223. Sending cookies.
Mar 13 06:13:05 alderan kernel: possible SYN flooding on port 52223. Sending cookies.
Mar 13 06:13:23 alderan kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 13 06:13:23 alderan kernel: ata2.00: cmd 25/00:00:27:29:73/00:02:07:00:00/e0 tag 0 cdb 0x0 data 262144 in
Mar 13 06:13:23 alderan kernel:          res 50/00:00:26:2b:73/00:00:00:00:00/e0 Emask 0x1 (device error)
Mar 13 06:13:23 alderan kernel: ata2.00: configured for UDMA/133
Mar 13 06:13:23 alderan kernel: ata2: EH complete
Mar 13 06:13:23 alderan kernel: SCSI device sdb: 976773168 512-byte hdwr sectors (500108 MB)
Mar 13 06:13:23 alderan kernel: sdb: Write Protect is off
Mar 13 06:13:23 alderan kernel: SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

If Mikael has an updated patch with the more detailed error reporting
features, I'll try to run it within a few days of getting my hands
on it. I would be really interested to know why the Promise SATA300 TX4
doesn't play along with the newer 500GB Seagate 7200.10 disks while the older
models are OK (I've already tried with and without 1.5Gbps jumpers and
patches).

Regards,
Tomi Orava


