* Problems with >3 drives on an eSATA portmultiplier
@ 2009-04-01 13:54 Justin Fletcher
2009-04-01 16:45 ` Grant Grundler
0 siblings, 1 reply; 8+ messages in thread
From: Justin Fletcher @ 2009-04-01 13:54 UTC (permalink / raw)
To: linux-ide
Hiya,
I've been having problems recently with my external eSATA drives failing
to be recognised when there are more than 3 plugged in at one time.
Summary of problem:
When one drive is connected in the external box, everything is fine.
When two are connected, everything is fine.
When three are connected, it can sometimes take a while for them all to
be detected and mounted.
When four are connected, it almost never detects them properly or mounts
them. Occasionally I get all 4 mounted, and rarely I get just 1 or 2 of
the drives mounted.
When five are connected, it's not mounting the drives.
More details:
The kernel I'm using now is 2.6.29 with no patches applied.
The system I'm using is a MSI motherboard, with a SiI eSATA controller
(a 3132, specifically this
one: http://www.span.com/catalog/product_info.php?products_id=15995 )
connected through the only PCI express card on the MB.
The bridgeboard in my external box is a NA910C, with a SiI3726 onboard
(specifically this
one: http://www.span.com/catalog/product_info.php?products_id=15709 ).
The method of disconnecting the drives is to remove the SATA cable from
the bridge board.
The eSATA cable has been replaced with another one (both 1M long) and
this has had no effect.
All the drives in the external box are Western Digital. 3 are 500G
drives, 2 are 1T 'Green Power' drives.
Once detected, the drives are mounted (and subsequently unmounted) by
udev rules.
History:
The full 5 drives were working and being mounted correctly in the past.
However, due to many upgrades and confusing hardware problems at the
same time, trying to identify when that was has become a problem for me
- I can't say when it was working. When it was working I had a JMB362
PCIexpress card (specifically this one:
http://www.span.com/catalog/product_info.php?products_id=16361 ). This
has been replaced by the SiI card in order to determine if the card is a
problem; the problems persist and have the same symptoms. (should it be
necessary for diagnosis, I can put the JMB362 card back). I can say for
certain that the failures I'm seeing have happened at least on kernels
2.6.28.3, 2.6.28.4 and 2.6.29.
During testing, combinations of drives have been changed, as have the
bridge board ports that they are plugged in to. This has not appeared to
make any difference - the deciding factor appears to be the number of
drives that are connected.
Typical failure:
A typical failure reads something like this (taken from kern.log, from
messages collected during initialisation):
Apr 1 11:43:23 buttercup kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.02: failed to write SCR 1 (Emask=0x1)
Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.03: hardreset failed (port not ready)
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state, performing PORT_RST
Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to write SCR 2 (Emask=0x1)
Apr 1 11:43:23 buttercup kernel: ata1.03: COMRESET failed (errno=-5)
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state, performing PORT_RST
Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 2 (Emask=0x1)
Apr 1 11:43:23 buttercup kernel: ata1.03: COMRESET failed (errno=-5)
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to recover link after 3 tries, disabling
Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state, performing PORT_RST
... and so on until it tries detaching the port multiplier ...
Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state, performing PORT_RST
Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.01: failed to read SCR 2 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.01: COMRESET failed (errno=-5)
Apr 1 11:43:23 buttercup kernel: ata1.01: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.01: reset failed, giving up
Apr 1 11:43:23 buttercup kernel: ata1.01: failed to recover link after 3 tries, disabling
Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.04: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.04: COMRESET failed (errno=-5)
Apr 1 11:43:23 buttercup kernel: ata1.04: failed to write SCR 1 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.04: failed to clear SError.N (errno=-5)
Apr 1 11:43:23 buttercup kernel: ata1: failed to recover PMP after 5 tries, giving up
Apr 1 11:43:23 buttercup kernel: ata1.15: Port Multiplier detaching
Apr 1 11:43:23 buttercup kernel: ata1.00: disabled
Apr 1 11:43:23 buttercup kernel: ata1: exception Emask 0x13 SAct 0x0 SErr 0x40d0000 action 0xe frozen t4
Apr 1 11:43:23 buttercup kernel: ata1: irq_stat 0x01100010, PHY RDY changed
Apr 1 11:43:23 buttercup kernel: ata1: SError: { PHYRdyChg CommWake 10B8B DevExch }
Apr 1 11:43:23 buttercup kernel: ata1: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.03: COMRESET failed (errno=-5)
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x1)
Apr 1 11:43:23 buttercup kernel: ata1.02: COMRESET failed (errno=-5)
Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.02: reset failed, giving up
Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x1)
Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 1 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.03: hardreset failed (port not ready)
Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state, performing PORT_RST
... and the sequence repeats until it gets fed up ...
Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state, performing PORT_RST
Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link down (SStatus 221 SControl 300)
Apr 1 11:43:23 buttercup kernel: ata1.05: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.00: ATA-8: WDC WD5000AAKS-00YGA0, 12.01C02, max UDMA/133
Apr 1 11:43:23 buttercup kernel: ata1.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 31/32)
Apr 1 11:43:23 buttercup kernel: ata1.00: configured for UDMA/100
Apr 1 11:43:23 buttercup kernel: ata1.04: PHY status changed but maxed out on retries, giving up
Apr 1 11:43:23 buttercup kernel: ata1.04: Manully issue scan to resume this link
Apr 1 11:43:23 buttercup kernel: ata1: PMP SError.N set for some ports, repeating recovery
Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Apr 1 11:43:23 buttercup kernel: ata1.00: configured for UDMA/100
Apr 1 11:43:23 buttercup kernel: ata1: EH pending after 5 tries, giving up
Apr 1 11:43:23 buttercup kernel: ata1: EH complete
As can be seen, it got as far as identifying one of the drives in this
configuration on the final attempt, but the other 3 were not detected
properly.
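For anyone reading the SStatus values in the log above, here is a
minimal Python sketch that decodes them. The field layout (DET in bits
0-3, SPD in bits 4-7, IPM in bits 8-11) follows the SATA SStatus
register definition; libata prints the value in hex without a 0x
prefix. The function and table names are illustrative only.

```python
# Decode the SStatus values printed by libata (hex, no "0x" prefix).
# Field layout per the SATA SStatus register: DET = bits 0-3,
# SPD = bits 4-7, IPM = bits 8-11. All names here are illustrative.

DET = {
    0x0: "no device detected, PHY offline",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline (disabled or loopback)",
}
SPD = {0x0: "no speed negotiated", 0x1: "1.5 Gbps", 0x2: "3.0 Gbps"}
IPM = {
    0x0: "not present / no communication",
    0x1: "active",
    0x2: "partial power state",
    0x6: "slumber power state",
}


def decode_sstatus(text):
    """Return (det, spd, ipm) descriptions for a hex SStatus string."""
    value = int(text, 16)
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return (
        DET.get(det, "reserved (%#x)" % det),
        SPD.get(spd, "reserved (%#x)" % spd),
        IPM.get(ipm, "reserved (%#x)" % ipm),
    )


if __name__ == "__main__":
    # The three distinct SStatus values that appear in the excerpt.
    for raw in ("123", "113", "221"):
        print(raw, decode_sstatus(raw))
```

Read this way, "SStatus 221" on ata1.01 means a device was sensed on
the port but the PHY never finished establishing communication.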
My gut feeling:
There's some timing problem involved here - either the drives are being
sent commands when they're not ready, or they're being timed out before
they have a chance to respond after a reset. As the problem gets worse
(to the point of always failing) with more drives, I'm thinking of some
overall timeout that's being triggered, with the individual drives
getting less and less time to respond within it (for example, drive 1
reset at 1s, drive 2 reset at 2s, drive 3 reset at 3s, etc, but an
overall timeout of 8s, so by the time that drive 5 has been reset, it
only has 3s to respond and its initialisation takes longer than that so
it never does). Not knowing what is involved here, this may be complete
rubbish and is purely guesswork on my part.
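To make that guess concrete, here is a minimal Python sketch of the
hypothetical model. The 1s stagger and 8s overall timeout are the
made-up numbers from the example, and the 4s initialisation time is a
further assumption for illustration. Nothing here reflects how libata
actually schedules resets.

```python
# Hypothetical model only: resets are staggered, but every drive must
# respond before one shared deadline. All numbers are made up.


def remaining_time(drive_count, reset_interval=1.0, overall_timeout=8.0):
    """Map drive number (1-based) to the seconds it has left to respond."""
    return {
        drive: overall_timeout - drive * reset_interval
        for drive in range(1, drive_count + 1)
    }


def drives_that_miss(drive_count, init_time, **model):
    """List drives whose assumed initialisation time exceeds their slot."""
    left = remaining_time(drive_count, **model)
    return [drive for drive, seconds in left.items() if seconds < init_time]


if __name__ == "__main__":
    for count in range(1, 6):
        print(count, "drives ->", drives_that_miss(count, init_time=4.0))
```

Under these assumed numbers only the fifth drive runs out of time. A
longer initialisation time or a shorter deadline would push the failure
point down to fewer drives.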
More details from kernel logs:
Because I'm not sure what's useful, and I wanted to capture some timings
for the sequences of events, I've captured kernel logs of a number
of drive combinations. In each case the PC was turned off, the box was
turned off, the SATA leads were connected as required for the test, then
the box turned on, a few seconds waited for the box to settle, then the
PC turned on. The system booted into 2.6.29 and then waited until it had
settled to a login prompt. At this point, the drive box was turned off.
The system then shut down whatever drives it had detected after
determining that the PMP had gone away. The drive box was then turned on
again. This second initialisation of the box should ensure that there
are timestamps present in the kernel logs which show how long it was
between events.
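In case it helps with reading those timings, here is a minimal Python
sketch that reports pauses between consecutive ata messages in a
kern.log. It assumes the syslog line layout shown in the excerpt
earlier and English month names. Syslog timestamps carry no year, so
the year is a parameter (2009 assumed). The names are illustrative.
Note that every line in the boot-time excerpt has the same timestamp,
so only the second initialisation will show real gaps.

```python
import re
from datetime import datetime

# Matches e.g. "Apr  1 11:43:23 buttercup kernel: ata1.03: reset failed"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"(ata\d+(?:\.\d+)?):\s+(.*)$"
)


def parse(line, year=2009):
    """Return (timestamp, link, message), or None for unrelated lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    stamp, link, message = match.groups()
    when = datetime.strptime("%d %s" % (year, stamp), "%Y %b %d %H:%M:%S")
    return when, link, message


def gaps(lines, min_gap=1):
    """Yield (seconds, previous_event, next_event) for each pause
    of at least min_gap seconds between consecutive ata messages."""
    previous = None
    for line in lines:
        event = parse(line)
        if event is None:
            continue
        if previous is not None:
            delta = (event[0] - previous[0]).total_seconds()
            if delta >= min_gap:
                yield delta, previous, event
        previous = event


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for seconds, before, after in gaps(handle):
                print("%5.0fs  %s: %s  ->  %s: %s"
                      % (seconds, before[1], before[2], after[1], after[2]))
```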
The numbering of the logs indicates which drives were connected - these
are drive numbers from 1-5, not the numbers used in the log messages
which are 0-4 (it just makes more sense for me to think of them as
drives 1-5 not 0-4).
Drives 1-3 are 500G, drives 4-5 are 1T.
In the logs it can also be seen that there are two ATA drives connected
to the MB, and two SATA drives connected to the MB. None of these
appear to exhibit any problems.
The logs can be found at:
http://usenet.gerph.org/SATA/
sata-15-kern.log:
2 drives connected.
All detected during initialisation.
All detected on restarting box.
sata-45-kern.log:
2 drives connected.
All detected during initialisation.
All detected on restarting box, although it reset the port 3 times.
sata-125-kern.log:
3 drives connected.
All detected during initialisation, but after doing so it then tried
to re-detect later (which was successful)
All detected on restarting box, although it reset the port 2 times
and had SCSI errors reported which it recovered from.
sata-345-kern.log:
3 drives connected.
1 detected during initialisation, only drive 4 was initialised
properly; during init 3 had been IDENTIFYd but the port was then
reset and more attempts made.
All detected on restarting box, although it reset the port 2 times
and had other errors reported which it recovered from.
sata-1235-kern.log:
4 drives connected.
1 detected during initialisation (drive 1), many attempts made.
None detected on restarting box, although it retried many times.
sata-12345-kern.log:
5 drives connected.
None detected during initialisation, many attempts made.
Ineffective - no output when the external box was turned off, nor
when it was turned on.
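The summaries above were compiled by reading the logs by hand. A
minimal Python sketch like the following could tally the outcomes per
PMP link instead. The message strings matched are the ones that appear
in the excerpt earlier; the labels and function names are illustrative
and the set of patterns is surely incomplete.

```python
import re
import sys
from collections import Counter, defaultdict

# Pulls the link name (ata1, ata1.03, ...) and message from a log line.
LINK_RE = re.compile(r"kernel:\s+(ata\d+(?:\.\d+)?):\s+(.*)$")

# Label -> substring of the kernel message, as seen in the excerpt.
PATTERNS = {
    "link up": "SATA link up",
    "link down": "SATA link down",
    "reset failed": "reset failed, giving up",
    "disabled": "failed to recover link",
}


def tally(lines):
    """Count matching messages per link; returns {link: Counter}."""
    counts = defaultdict(Counter)
    for line in lines:
        match = LINK_RE.search(line)
        if match is None:
            continue
        link, message = match.groups()
        for label, needle in PATTERNS.items():
            if needle in message:
                counts[link][label] += 1
    return counts


if __name__ == "__main__":
    # Usage: pass one or more kern.log files as arguments.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            result = tally(handle)
        print(path)
        for link in sorted(result):
            print("   %-8s %s" % (link, dict(result[link])))
```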
Finally:
I can provide more information, more combinations and try different
kernel configurations if it's found to be useful for this. I'm sorry if
this information is too verbose, or if I've missed something out -
please let me know and I'll try to do tests or fill in the blanks.
Hope someone can help with this!
--
Gerph <http://gerph.org/>
[ All information, speculation, opinion or data within, or attached to,
this email is private and confidential. Such content may not be
disclosed to third parties, or a public forum, without explicit
permission being granted. ]
* Re: Problems with >3 drives on an eSATA portmultiplier
2009-04-01 13:54 Problems with >3 drives on an eSATA portmultiplier Justin Fletcher
@ 2009-04-01 16:45 ` Grant Grundler
2009-04-01 17:09 ` Justin Fletcher
0 siblings, 1 reply; 8+ messages in thread
From: Grant Grundler @ 2009-04-01 16:45 UTC (permalink / raw)
To: Justin Fletcher; +Cc: linux-ide
On Wed, Apr 1, 2009 at 6:54 AM, Justin Fletcher <gerph@gerph.org> wrote:
> Hiya,
>
> I've been having problems recently with my external eSATA drives failing to
> be recognised when there are more than 3 plugged in at one time.
>
> Summary of problem:
>
> When one drive is connected in the external box, everything is fine.
>
> When two are connected, everything is fine.
>
> When three are connected, it can sometimes take a while for them all to be
> detected and mounted.
>
> When four are connected, it almost never detects them properly or mounts
> them. Occasionally I get all 4 mounted, and rarely I get just 1 or 2 of the
> drives mounted.
>
> When five are connected, it's not mounting the drives.
>
>
> More details:
>
> The kernel I'm using now is 2.6.29 with no patches applied.
>
> The system I'm using is a MSI motherboard, with a SiI eSATA controller (a
> 3132, specifically this one:
> http://www.span.com/catalog/product_info.php?products_id=15995 ) connected
> through the only PCI express card on the MB.
2.6.29 kernel + SII 3132 SATA controller should work fine with 3726 PMP.
I'm skeptical it's a driver problem. But I've not tested recent
kernels with that config.
I do know 2.6.26 does work with that config.
...
> History:
>
> The full 5 drives were working and being mounted correctly in the past.
> However, due to many upgrades and confusing hardware problems at the same
> time, trying to identify when that was has become a problem for me - I can't
> say when it was working. When it was working I had a JMB362 PCIexpress card
> (specifically this one:
> http://www.span.com/catalog/product_info.php?products_id=16361 ). This has
> been replaced by the SiI card in order to determine if the card is a
> problem; the problems persist and have the same symptoms. (should it be
> necessary for diagnosis, I can put the JMB362 card back). I can say for
> certain that the failures I'm seeing have happened at least on kernels
> 2.6.28.3, 2.6.28.4 and 2.6.29.
>
> During testing, combinations of drives have been changed, as have the
> bridge board ports that they are plugged in to. This has not appeared to
> make any difference - the deciding factor appears to be the number of
> drives that are connected.
This suggests the power supply is now failing to provide adequate power
for drive spin-up. The WD "Green" drives certainly use less power during
normal operation (IIRC, they are 5400 RPM and only 3 platters). But they
will need substantially more to spin up.
Happen to have another PSU that could provide power to the drives?
I.e., build the same topology with 3132 + 3726 but power it with a
different PSU (or multiple PSUs).
hth,
grant
>
>
> Typical failure:
>
> A typical reads something like this (taken from kern.log from messages
> collected during initialisation):
>
> Apr 1 11:43:23 buttercup kernel: ata1: SATA link up 3.0 Gbps (SStatus 123
> SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.15: Port Multiplier 1.1,
> 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.02: failed to write SCR 1 (Emask=0x1)
> Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.03: hardreset failed (port not ready)
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
> Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state,
> performing PORT_RST
> Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus
> 123 SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.02: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to write SCR 2 (Emask=0x1)
> Apr 1 11:43:23 buttercup kernel: ata1.03: COMRESET failed (errno=-5)
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
> Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state,
> performing PORT_RST
> Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus
> 123 SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.02: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 2 (Emask=0x1)
> Apr 1 11:43:23 buttercup kernel: ata1.03: COMRESET failed (errno=-5)
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to recover link after 3
> tries, disabling
> Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state,
> performing PORT_RST
>
> ... and so on until it tries detaching the port multiplier ...
>
> Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state,
> performing PORT_RST
> Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus
> 123 SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.01: failed to read SCR 2 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.01: COMRESET failed (errno=-5)
> Apr 1 11:43:23 buttercup kernel: ata1.01: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.01: reset failed, giving up
> Apr 1 11:43:23 buttercup kernel: ata1.01: failed to recover link after 3
> tries, disabling
> Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus
> 123 SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.04: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.04: COMRESET failed (errno=-5)
> Apr 1 11:43:23 buttercup kernel: ata1.04: failed to write SCR 1
> (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.04: failed to clear SError.N
> (errno=-5)
> Apr 1 11:43:23 buttercup kernel: ata1: failed to recover PMP after 5 tries,
> giving up
> Apr 1 11:43:23 buttercup kernel: ata1.15: Port Multiplier detaching
> Apr 1 11:43:23 buttercup kernel: ata1.00: disabled
> Apr 1 11:43:23 buttercup kernel: ata1: exception Emask 0x13 SAct 0x0 SErr
> 0x40d0000 action 0xe frozen t4
> Apr 1 11:43:23 buttercup kernel: ata1: irq_stat 0x01100010, PHY RDY changed
> Apr 1 11:43:23 buttercup kernel: ata1: SError: { PHYRdyChg CommWake 10B8B
> DevExch }
> Apr 1 11:43:23 buttercup kernel: ata1: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1: SATA link up 3.0 Gbps (SStatus 123
> SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.15: Port Multiplier 1.1,
> 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.02: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.03: COMRESET failed (errno=-5)
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
> Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus
> 123 SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x1)
> Apr 1 11:43:23 buttercup kernel: ata1.02: COMRESET failed (errno=-5)
> Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.02: reset failed, giving up
> Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus
> 123 SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link up 3.0 Gbps (SStatus
> 123 SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.02: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x1)
> Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 1 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.02: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.03: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.03: hardreset failed (port not ready)
> Apr 1 11:43:23 buttercup kernel: ata1.03: failed to read SCR 0 (Emask=0x40)
> Apr 1 11:43:23 buttercup kernel: ata1.03: reset failed, giving up
> Apr 1 11:43:23 buttercup kernel: ata1.15: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state,
> performing PORT_RST
>
> ... and the sequence repeats until it gets fed up ...
>
> Apr 1 11:43:23 buttercup kernel: ata1: controller in dubious state,
> performing PORT_RST
> Apr 1 11:43:23 buttercup kernel: ata1.15: SATA link up 3.0 Gbps (SStatus
> 123 SControl 0)
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.01: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.01: SATA link down (SStatus 221
> SControl 300)
> Apr 1 11:43:23 buttercup kernel: ata1.05: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.05: SATA link up 1.5 Gbps (SStatus
> 113 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.00: ATA-8: WDC WD5000AAKS-00YGA0,
> 12.01C02, max UDMA/133
> Apr 1 11:43:23 buttercup kernel: ata1.00: 976773168 sectors, multi 16:
> LBA48 NCQ (depth 31/32)
> Apr 1 11:43:23 buttercup kernel: ata1.00: configured for UDMA/100
> Apr 1 11:43:23 buttercup kernel: ata1.04: PHY status changed but maxed out
> on retries, giving up
> Apr 1 11:43:23 buttercup kernel: ata1.04: Manully issue scan to resume this
> link
> Apr 1 11:43:23 buttercup kernel: ata1: PMP SError.N set for some ports,
> repeating recovery
> Apr 1 11:43:23 buttercup kernel: ata1.00: hard resetting link
> Apr 1 11:43:23 buttercup kernel: ata1.00: SATA link up 3.0 Gbps (SStatus
> 123 SControl 320)
> Apr 1 11:43:23 buttercup kernel: ata1.00: configured for UDMA/100
> Apr 1 11:43:23 buttercup kernel: ata1: EH pending after 5 tries, giving up
> Apr 1 11:43:23 buttercup kernel: ata1: EH complete
>
>
> As can be seen, it got as far as identifying one of the drives in this
> configuration on the final attempt, but the other 3 were not detected
> properly.
>
>
> My gut feeling:
>
> There's some timing problem involved here - either the drives are being sent
> commands when they're not ready, or they're being timed out before they have
> a chance to respond after a reset. As the problem gets worse (to the point
> of always failing) with more drives, I'm thinking of some overall timeout
> that's being triggered but the individual drives are getting less and less
> time to handle it. For example, drive 1 reset at 1s, drive 2 reset at 2s,
> drive 3 reset at 3s, etc, but an overall timeout of 8s, so by the time that
> drive 5 has been reset, it only has 3s to respond and its initialisation
> takes longer than that so it never does). Not knowing what is involved here,
> this may be complete rubbish and is purely guesswork on my part.
>
>
> More details from kernel logs:
>
> Because I'm not sure what's useful, and I wanted to capture some timings for
> the sequences of events, I've captured kernel logs of the a number of drive
> combinations. In each case the PC was turned off, the box was turned off,
> the SATA leads were connected as required for the test, then the box turned
> on, a few seconds waited for the box to settle, then the PC turned on. The
> system booted into 2.6.29 and then waited until it had settled to a login
> prompt. At this point, the drive box was turned off. The system then shut
> down whatever drives it had detected after determining that the PMP had gone
> away. The drive box was then turned on again. This second initialisation of
> the box should ensure that there are timings present in the kernel logs
> which determine how long it was between events.
>
> The numbering of the logs indicates which drives were connected - these are
> drives numbers from 1-5, not the numbers used in the log messages which are
> 0-4 (it just makes more sense for me to think of them as drives 1-5 not
> 0-4).
>
> Drives 1-3 are 500G, drives 4-5 are 1T.
>
> In the logs it can also be seen that there are two ATA drives connected to
> the MB, and two SATA drives connected to the MB. Neither of these appear to
> exhibit any other problems.
>
> The logs can be found at:
>
> http://usenet.gerph.org/SATA/
>
>
> sata-15-kern.log:
> 2 drives connected.
> All detected during initialisation.
> All detected on restarting box.
>
> sata-45-kern.log:
> 2 drives connected.
> All detected during initialisation.
> All detected on restarting box, although it reset the port 3 times.
>
> sata-125-kern.log:
> 3 drives connected.
> All detected during initialisation, but after doing so it then tried
> to re-detect later (which was successful)
> All detected on restarting box, although it reset the port 2 times
> and had SCSI errors reported which it recovered from.
>
> sata-345-kern.log:
> 3 drives connected.
> 1 detected during initialisation, only drive 4 was initialised
> properly; during init 3 had been IDENTIFYd but the port was then
> reset and more attempts made.
> All detected on restarting box, although it reset the port 2 times
> and had other errors reported which it recovered from.
>
> sata-1235-kern.log:
> 4 drives connected.
> 1 detected during initialisation (drive 1), many attempts made.
> None detected on restarting box, although it retried many times.
>
> sata-12345-kern.log:
> 5 drives connected.
> None detected during initialisation, many attempts made.
> Ineffective - no output when the external box was turned off, nor
> when it was turned on.
>
>
> Finally:
>
> I can provide more information, more combinations and try different kernel
> configurations if it's found to be useful for this. I'm sorry if this
> information is too verbose, or if I've missed something out - please let me
> know and I'll try to do tests or fill in the blanks.
>
>
> Hope someone can help with this!
>
>
> --
> Gerph <http://gerph.org/>
> [ All information, speculation, opinion or data within, or attached to,
> this email is private and confidential. Such content may not be
> disclosed to third parties, or a public forum, without explicit
> permission being granted. ]
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Problems with >3 drives on an eSATA portmultiplier
2009-04-01 16:45 ` Grant Grundler
@ 2009-04-01 17:09 ` Justin Fletcher
2009-04-01 17:46 ` Grant Grundler
0 siblings, 1 reply; 8+ messages in thread
From: Justin Fletcher @ 2009-04-01 17:09 UTC (permalink / raw)
To: Grant Grundler; +Cc: linux-ide
On Wed, 1 Apr 2009, Grant Grundler wrote:
> On Wed, Apr 1, 2009 at 6:54 AM, Justin Fletcher <gerph@gerph.org> wrote:
>> Hiya,
>>
>>
>> The system I'm using is a MSI motherboard, with a SiI eSATA controller (a
>> 3132, specifically this one:
>> http://www.span.com/catalog/product_info.php?products_id=15995 ) connected
>> through the only PCI express card on the MB.
>
> 2.6.29 kernel + SII 3132 SATA controller should work fine with 3726 PMP.
>
> I'm skeptical it's a driver problem. But I've not tested recent
> kernels with that config.
> I do know 2.6.26 does work with that config.
I could drop down to 2.6.26 and rebuild my kernel with support for that
controller - I hadn't done this before because I cannot then use my DVB-S
card, but... I could live without that for the test.
I'm unsure of where the problem might lie; if it's not a driver problem
and the PSU isn't the issue, then the only variable left (I think) is the
bridge board itself. AKA "The expensive bit" :-)
[snip]
>>
>> During testing combinations of drives have been changed, and the bridge
>> board ports that they are plugged in to. This has not appeared to make any
>> difference - the factor in this equation is the number of drives that are
>> connected.
>
> This suggests the power supply is now failing to provide adequate power
> for drive spinup. The WD "Green" drives certainly use less power during
> normal operation (IIRC, they are 5400 RPM and only 3 platters). But they
> will need substantially more to spinup.
>
> Happen to have another PSU that could provide power to the drives?
> Ie build the same topology with 3132 + 3726 but power it with a
> different (or multiple) PSU.
I've replaced the PSU once already - it was replaced on 6th Feb, with a
Sumvision 450W 20+4pin SATA PSU. Those 5 drives + the bridge board and fan
are the only things being powered by that PSU. I should have thought that
even during drive initialisation 5 drives wouldn't exceed 450W. Anyhow,
this was one of my first thoughts. I could get another PSU (I think I
might be able to find one around here), but having replaced it so
recently, I'm uncomfortable doing so.
Is a 450W PSU going to be able to supply enough power for those drives ?
--
Gerph <http://gerph.org/>
[ All information, speculation, opinion or data within, or attached to,
this email is private and confidential. Such content may not be
disclosed to third parties, or a public forum, without explicit
permission being granted. ]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Problems with >3 drives on an eSATA portmultiplier
2009-04-01 17:09 ` Justin Fletcher
@ 2009-04-01 17:46 ` Grant Grundler
2009-04-02 0:28 ` Tejun Heo
0 siblings, 1 reply; 8+ messages in thread
From: Grant Grundler @ 2009-04-01 17:46 UTC (permalink / raw)
To: Justin Fletcher; +Cc: linux-ide
On Wed, Apr 1, 2009 at 10:09 AM, Justin Fletcher <gerph@gerph.org> wrote:
> On Wed, 1 Apr 2009, Grant Grundler wrote:
>
>> On Wed, Apr 1, 2009 at 6:54 AM, Justin Fletcher <gerph@gerph.org> wrote:
>>>
>>> Hiya,
>>>
>>>
>>> The system I'm using is a MSI motherboard, with a SiI eSATA controller (a
>>> 3132, specifically this one:
>>> http://www.span.com/catalog/product_info.php?products_id=15995 )
>>> connected
>>> through the only PCI express card on the MB.
>>
>> 2.6.29 kernel + SII 3132 SATA controller should work fine with 3726 PMP.
>>
>> I'm skeptical it's a driver problem. But I've not tested recent
>> kernels with that config.
>> I do know 2.6.26 does work with that config.
>
> I could drop down to 2.6.26 and rebuild my kernel with support for that
> controller - I hadn't done this before because I cannot then use my DVB-S
> card, but... I could live without that for the test.
If someone could comment on 2.6.28 or 2.6.29, you could avoid running
this test.
...
> I've replaced the PSU once already - it was replaced on 6th Feb, with a
> Sumvision 450W 20+4pin SATA PSU. Those 5 drives + the bridge board and fan
> are the only things being powered by that PSU. I should have thought that
> even during drive initialisation 5 drives wouldn't exceed 450W. Anyhow, this
> was one of my first thoughts. I could get another PSU (I think I might be
> able to find one around here), but having replaced it so recently, I'm
> uncomfortable doing so.
>
> Is a 450W PSU going to be able to supply enough power for those drives ?
It's not just about how many watts. Both 12v and 5v "rails" need to
provide enough power.
But if this worked before, I would assume the replacement is working too.
And in general, yes, a 450W should be plenty for 5 drives plus PMP
board. Especially if the OS is "staggering" the spinup so only one or
two drives are spinning up at the same time.
I know I've had to fix "staggered spinup" in the past (2.6.18) but
don't recall if it's fixed in current 2.6.28 or .29 kernels.
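To put rough numbers on the per-rail point, here is a small sketch of a
spin-up budget. The per-drive currents (2 A on 12v, 0.7 A on 5v) and the
rail limits are illustrative assumptions only; they are not figures from
this thread or from any datasheet, so substitute the values printed on
the drive and PSU labels.

```python
from dataclasses import dataclass


@dataclass
class Rail:
    volts: float
    limit_amps: float      # hypothetical rating taken from the PSU label
    per_drive_amps: float  # hypothetical peak draw of one drive at spin-up


def rail_load(rail, drives_spinning):
    """Return (amps drawn, watts drawn, fits within the rail limit)."""
    amps = rail.per_drive_amps * drives_spinning
    return amps, amps * rail.volts, amps <= rail.limit_amps


if __name__ == "__main__":
    # Assumed example rails: 12v rated 15 A, 5v rated 20 A.
    rails = [Rail(12.0, 15.0, 2.0), Rail(5.0, 20.0, 0.7)]
    # Compare one drive spinning up at a time with all five at once.
    for n in (1, 5):
        for r in rails:
            amps, watts, ok = rail_load(r, n)
            print(f"{n} drive(s) on {r.volts:g}v: {amps:.1f} A "
                  f"({watts:.0f} W) {'ok' if ok else 'OVER LIMIT'}")
```

The point of checking each rail separately is that a PSU can be well
within its total wattage and still be short on one rail.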
grant
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Problems with >3 drives on an eSATA portmultiplier
2009-04-01 17:46 ` Grant Grundler
@ 2009-04-02 0:28 ` Tejun Heo
2009-04-03 21:06 ` Justin Fletcher
0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2009-04-02 0:28 UTC (permalink / raw)
To: Grant Grundler; +Cc: Justin Fletcher, linux-ide
Grant Grundler wrote:
>> I could drop down to 2.6.26 and rebuild my kernel with support for that
>> controller - I hadn't done this before because I cannot then use my DVB-S
>> card, but... I could live without that for the test.
>
> If someone could comment on 2.6.28 or 2.6.29, you could avoid running
> this test.
Hmmm... I don't really remember changing much, so it should be fine.
Can you please give a shot at 2.6.26 anyway? Also, kernel logs with
JMB controller would be helpful too as ahci tends to better show what's
really going on and failing during reset.
>> I've replaced the PSU once already - it was replaced on 6th Feb, with a
>> Sumvision 450W 20+4pin SATA PSU. Those 5 drives + the bridge board and fan
>> are the only things being powered by that PSU. I should have thought that
>> even during drive initialisation 5 drives wouldn't exceed 450W. Anyhow, this
>> was one of my first thoughts. I could get another PSU (I think I might be
>> able to find one around here), but having replaced it so recently, I'm
>> uncomfortable doing so.
>>
>> Is a 450W PSU going to be able to supply enough power for those drives ?
>
> It's not just about how many watts. Both 12v and 5v "rails" need to
> provide enough power. But if this worked before, I would assume the
> replacement is working too.
>
> And in general, yes, a 450W should be plenty for 5 drives plus PMP
> board. Especially if the OS is "staggering" the spinup so only one
> or two drives are spinning up at the same time. I know I've had to
> fix "staggered spinup" in the past (2.6.18) but don't recall if it's
> fixed in current 2.6.28 or .29 kernels.
libata never had proper provision for staggered spin up. If it works,
it's just because they're being probed sequentially. :-)
Anyways, 3132 + 3726 should work. Both 3726 and 4726 pretty much
require its first port to be always occupied (or its fake config
drive jumps around and freaks out) and hotplug is often flaky but
other than that it should generally work.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Problems with >3 drives on an eSATA portmultiplier
2009-04-02 0:28 ` Tejun Heo
@ 2009-04-03 21:06 ` Justin Fletcher
2009-04-07 21:20 ` Justin Fletcher
0 siblings, 1 reply; 8+ messages in thread
From: Justin Fletcher @ 2009-04-03 21:06 UTC (permalink / raw)
To: Tejun Heo; +Cc: Grant Grundler, linux-ide
On Thu, 2 Apr 2009, Tejun Heo wrote:
> Grant Grundler wrote:
>>> I could drop down to 2.6.26 and rebuild my kernel with support for that
>>> controller - I hadn't done this before because I cannot then use my DVB-S
>>> card, but... I could live without that for the test.
>>
>> If someone could comment on 2.6.28 or 2.6.29, you could avoid running
>> this test.
>
> Hmmm... I don't really remember changing much, so it should be fine.
> Can you please give a shot at 2.6.26 anyway? Also, kernel logs with
> JMB controller would be helpful too as ahci tends to better show what's
> really going on and failing during reset.
I tried the JMB controller again with 2.6.26, and with drives 1,2,3
connected. This was different to what had happened previously - there are
two lights on the bridgeboard, green and red. Most of the time they're
both on. When the JMB controller was connected (even in the BIOS screen)
these lights flashed on and off about once a second. Occasionally they'd
settle into the on state. The log of booting into 2.6.26 is here:
http://usenet.gerph.org/SATA/sata-123-2.6.26-jmb-kern.log
and mostly consists of:
Apr 3 20:33:43 buttercup kernel: ata1: exception Emask 0x10 SAct 0x0 SErr 0x4000000 action 0xe frozen
Apr 3 20:33:43 buttercup kernel: ata1: irq_stat 0x00000040, connection status changed
Apr 3 20:33:43 buttercup kernel: ata1: SError: { DevExch }
Apr 3 20:33:43 buttercup kernel: ata1: hard resetting link
Apr 3 20:33:44 buttercup kernel: ata1: SATA link down (SStatus 0 SControl 300)
Apr 3 20:33:44 buttercup kernel: ata1: EH complete
repeated over and over whilst the light was flashing (not in time with
it).
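For comparing logs like this one, a short script can count how often each
port goes through the reset cycle. This is only a sketch: it matches the
three message texts visible in the excerpt above and assumes the syslog
line layout shown there; the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "... kernel: ata1: hard resetting link"
LINE_RE = re.compile(r"kernel: (ata\d+(?:\.\d+)?): (.*)$")

# Message prefixes taken from the log excerpt, mapped to short labels.
EVENTS = {
    "hard resetting link": "hard_reset",
    "SATA link down": "link_down",
    "EH complete": "eh_complete",
}


def count_events(lines):
    """Count the selected libata messages per port in syslog-style lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line.rstrip("\n"))
        if not m:
            continue
        port, msg = m.groups()
        for text, label in EVENTS.items():
            if msg.startswith(text):
                counts[(port, label)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Apr 3 20:33:43 buttercup kernel: ata1: SError: { DevExch }",
        "Apr 3 20:33:43 buttercup kernel: ata1: hard resetting link",
        "Apr 3 20:33:44 buttercup kernel: ata1: SATA link down (SStatus 0 SControl 300)",
        "Apr 3 20:33:44 buttercup kernel: ata1: EH complete",
    ]
    for (port, label), n in sorted(count_events(sample).items()):
        print(port, label, n)
```

To run it over a whole file, pass the open file object to count_events()
in place of the sample list.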
An interesting (?) thing here is that at 20:34:27 in the log the light
stayed on for a bit and we managed to read details from the drives. Then
the light went back to flashing regularly again.
I also booted into 2.6.29 with this configuration (but with drives 3,4,5)
and got a similar effect, except for the constant light:
http://usenet.gerph.org/SATA/sata-345-2.6.29-jmb-kern.log
I don't think it's particularly surprising that this exhibits the same
results, given that even during the BIOS the light flashes in the same
way.
I removed and re-fitted the JMB card as part of these tests, in case it
was not seated properly in the slot (unlikely, as it was detected, but
just to be sure, I did this anyhow). I also exchanged the eSATA cables to
try to eliminate them. No change in behaviour was observed.
Having tried this, I reinserted the SiI card and rebuilt 2.6.26 with
support for it (and increased the debug buffer). Whilst in the BIOS and
after booting normally, the bridge board lights are almost constantly on.
They only seem to go off when the link is reset. Booting with drives
1,2,3,4 connected (which should be a failure, or only partially detect the
drives according to the 2.6.29 results) gave me:
http://usenet.gerph.org/SATA/sata-1234-2.6.26-sil-kern.log
which did many resets during the initialisation and never detected any
drives. Turning the external box off and on again did not cause it to
re-detect the drives (as if it was just completely ignoring the interface
now).
Repeating the test with drives 3,4,5 and the 2.6.26 kernel and the sil
card gave me the following log:
http://usenet.gerph.org/SATA/sata-345-2.6.26-sil-kern.log
This managed to identify 2 of the 3 drives during initialisation. When the
box was turned off and on again it only detected 2 of the 3.
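For anyone working through the log files, the names encode the test
setup, and the sketch below decodes them. It is illustrative only: the
pattern is inferred from the file names in this thread, the helper names
are hypothetical, and the last field assumes that drive 1 sits on the
first port of the bridge board, which has not been confirmed here.

```python
import re
from typing import List, NamedTuple, Optional

# sata-<drive digits>[-<kernel>-<controller>]-kern.log
NAME_RE = re.compile(
    r"^sata-([1-5]+)(?:-(\d+\.\d+\.\d+)-([a-z]+))?-kern\.log$"
)


class LogName(NamedTuple):
    drives: List[int]           # drive numbers 1-5 as used in the emails
    log_numbers: List[int]      # the same drives numbered 0-4 as in the logs
    kernel: Optional[str]       # None when the name carries no kernel version
    controller: Optional[str]   # e.g. "jmb" or "sil"; None when absent
    first_drive_present: bool   # assumption: drive 1 == first PMP port


def parse_log_name(name):
    """Decode a log file name of the form used in this thread."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised log name: {name!r}")
    digits, kernel, controller = m.groups()
    drives = [int(c) for c in digits]
    return LogName(drives, [d - 1 for d in drives], kernel, controller,
                   1 in drives)


if __name__ == "__main__":
    for name in ("sata-1234-2.6.26-sil-kern.log",
                 "sata-345-2.6.26-sil-kern.log",
                 "sata-15-kern.log"):
        print(name, parse_log_name(name))
```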
>>> I've replaced the PSU once already - it was replaced on 6th Feb, with a
>>> Sumvision 450W 20+4pin SATA PSU. Those 5 drives + the bridge board and fan
>>> are the only things being powered by that PSU. I should have thought that
>>> even during drive initialisation 5 drives wouldn't exceed 450W. Anyhow, this
>>> was one of my first thoughts. I could get another PSU (I think I might be
>>> able to find one around here), but having replaced it so recently, I'm
>>> uncomfortable doing so.
>>>
>>> Is a 450W PSU going to be able to supply enough power for those drives ?
>>
>> It's not just about how many watts. Both 12v and 5v "rails" need to
>> provide enough power. But if this worked before, I would assume the
>> replacement is working too.
>>
>> And in general, yes, a 450W should be plenty for 5 drives plus PMP
>> board. Especially if the OS is "staggering" the spinup so only one
>> or two drives are spinning up at the same time. I know I've had to
>> fix "staggered spinup" in the past (2.6.18) but don't recall if it's
>> fixed in current 2.6.28 or .29 kernels.
>
> libata never had proper provision for staggered spin up. If it works,
> it's just because they're being probed sequentially. :-)
>
> Anyways, 3132 + 3726 should work. Both 3726 and 4726 pretty much
> require its first port to be always occupied (or its fake config
> drive jumps around and freaks out) and hotplug is often flaky but
> other than that it should generally work.
So my using ports 3,4,5 only probably isn't all that useful for it ?
Even still, I'm not sure that the behaviour that I'm seeing is down to a
driver issue now. I'd appreciate any other thoughts based on the extra
info I've given. I may just bite the bullet and try a replacement bridge
board and see if that helps.
--
Gerph <http://gerph.org/>
[ All information, speculation, opinion or data within, or attached to,
this email is private and confidential. Such content may not be
disclosed to third parties, or a public forum, without explicit
permission being granted. ]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Problems with >3 drives on an eSATA portmultiplier
2009-04-03 21:06 ` Justin Fletcher
@ 2009-04-07 21:20 ` Justin Fletcher
2009-04-14 10:45 ` Tejun Heo
0 siblings, 1 reply; 8+ messages in thread
From: Justin Fletcher @ 2009-04-07 21:20 UTC (permalink / raw)
To: Tejun Heo; +Cc: Grant Grundler, linux-ide
On Fri, 3 Apr 2009, Justin Fletcher wrote:
> On Thu, 2 Apr 2009, Tejun Heo wrote:
>
>> Grant Grundler wrote:
>>>> I could drop down to 2.6.26 and rebuild my kernel with support for that
>>>> controller - I hadn't done this before because I cannot then use my DVB-S
>>>> card, but... I could live without that for the test.
>>>
>>> If someone could comment on 2.6.28 or 2.6.29, you could avoid running
>>> this test.
>>
>> Hmmm... I don't really remember changing much, so it should be fine.
>> Can you please give a shot at 2.6.26 anyway? Also, kernel logs with
>> JMB controller would be helpful too as ahci tends to better show what's
>> really going on and failing during reset.
For future reference if anyone else sees similar issues with their
systems... I have replaced the bridge board with another and the problem
has gone away - all 5 drives are accessible and I have no issues.
Thanks for the help, and sorry to have wasted your time.
--
Gerph <http://gerph.org/>
[ All information, speculation, opinion or data within, or attached to,
this email is private and confidential. Such content may not be
disclosed to third parties, or a public forum, without explicit
permission being granted. ]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Problems with >3 drives on an eSATA portmultiplier
2009-04-07 21:20 ` Justin Fletcher
@ 2009-04-14 10:45 ` Tejun Heo
0 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2009-04-14 10:45 UTC (permalink / raw)
To: Justin Fletcher; +Cc: Grant Grundler, linux-ide
Hello,
Justin Fletcher wrote:
> For future reference if anyone else sees similar issues with their
> systems... I have replaced the bridge board with another and the problem
> has gone away - all 5 drives are accessible and I have no issues.
>
> Thanks for the help, and sorry to have wasted your time.
Ah... thanks for reporting. Was worried whether you discovered
yet-unknown 3726 detection problem. :-)
--
tejun
^ permalink raw reply [flat|nested] 8+ messages in thread