linux-raid.vger.kernel.org archive mirror
* Spurious HD convictions
@ 2009-12-13  2:02 lrhorer
  2009-12-13  2:57 ` Majed B.
  0 siblings, 1 reply; 10+ messages in thread
From: lrhorer @ 2009-12-13  2:02 UTC (permalink / raw)
  To: linux-raid


	What's happening here?  My backup server has suddenly started suffering
apparently spurious hard drive convictions.  The server runs RAID5 on 7 disks
under md.  It ran well for months, but it now kicks drives from the array
under moderately heavy read or write loads.  It isn't convicting any
particular drive repeatedly, and the drives show no errors under SMART.  This
is a port multiplier (PM) system, and I have tried changing the drive
adapters, changing the PMs, changing cables, moving the drives around, and
moving them out of the CPU enclosure to a new external chassis.  The
convictions are not tied to any one channel, PM, or cable.  Since this
started, I have been unable to get all the way through a resync before the
array dumps at least one of the drives.  Here is a sample from the kernel log
during one of the convictions:

Dec 12 13:03:39 Backup kernel: [56319.397992] ata6.00: failed to read SCR 1 (Emask=0x40)
Dec 12 13:03:39 Backup kernel: [56319.397999] ata6.01: failed to read SCR 1 (Emask=0x40)
Dec 12 13:03:39 Backup kernel: [56319.398001] ata6.02: failed to read SCR 1 (Emask=0x40)
Dec 12 13:03:39 Backup kernel: [56319.398006] ata6.03: failed to read SCR 1 (Emask=0x40)
Dec 12 13:03:39 Backup kernel: [56319.398008] ata6.04: failed to read SCR 1 (Emask=0x40)
Dec 12 13:03:39 Backup kernel: [56319.398010] ata6.05: failed to read SCR 1 (Emask=0x40)
Dec 12 13:03:39 Backup kernel: [56319.398014] ata6.15: exception Emask 0x4 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 12 13:03:39 Backup kernel: [56319.398018] ata6.00: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 12 13:03:39 Backup kernel: [56319.398022] ata6.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 2
Dec 12 13:03:39 Backup kernel: [56319.398023]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 12 13:03:39 Backup kernel: [56319.398025] ata6.00: status: { DRDY }
Dec 12 13:03:39 Backup kernel: [56319.398028] ata6.01: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 12 13:03:39 Backup kernel: [56319.398031] ata6.02: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 12 13:03:39 Backup kernel: [56319.398034] ata6.03: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 12 13:03:39 Backup kernel: [56319.398037] ata6.04: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 12 13:03:39 Backup kernel: [56319.398040] ata6.05: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 12 13:03:39 Backup kernel: [56319.398044] ata6.15: hard resetting link
Dec 12 13:03:41 Backup kernel: [56321.597384] ata6.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Dec 12 13:03:41 Backup kernel: [56321.597864] ata6.00: hard resetting link
Dec 12 13:03:42 Backup kernel: [56321.933843] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Dec 12 13:03:42 Backup kernel: [56321.933849] ata6.01: hard resetting link
Dec 12 13:03:42 Backup kernel: [56322.294048] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 12 13:03:42 Backup kernel: [56322.294055] ata6.02: hard resetting link
Dec 12 13:03:42 Backup kernel: [56322.642243] ata6.02: SATA link down (SStatus 0 SControl 320)
Dec 12 13:03:42 Backup kernel: [56322.646087] ata6.03: hard resetting link
Dec 12 13:03:43 Backup kernel: [56323.006393] ata6.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 12 13:03:43 Backup kernel: [56323.006400] ata6.04: hard resetting link
Dec 12 13:03:43 Backup kernel: [56323.354708] ata6.04: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Dec 12 13:03:43 Backup kernel: [56323.354714] ata6.05: hard resetting link
Dec 12 13:03:43 Backup kernel: [56323.690211] ata6.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
Dec 12 13:03:43 Backup kernel: [56323.694555] ata6.00: configured for UDMA/100
Dec 12 13:03:43 Backup kernel: [56323.695732] ata6.01: configured for UDMA/100
Dec 12 13:03:44 Backup kernel: [56323.703212] ata6.03: configured for UDMA/100
Dec 12 13:03:44 Backup kernel: [56323.803119] ata6.04: configured for UDMA/100
Dec 12 13:03:44 Backup kernel: [56323.803188] ata6: EH complete
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:0:0:0: [sde] 2930277168 512-byte hardware sectors (1500302 MB)
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:0:0:0: [sde] Write Protect is off
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:1:0:0: [sdf] 2930277168 512-byte hardware sectors (1500302 MB)
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:1:0:0: [sdf] Write Protect is off
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:1:0:0: [sdf] Mode Sense: 00 3a 00 00
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:1:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:3:0:0: [sdg] 2930277168 512-byte hardware sectors (1500302 MB)
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:3:0:0: [sdg] Write Protect is off
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:3:0:0: [sdg] Mode Sense: 00 3a 00 00
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:3:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:4:0:0: [sdh] 625142448 512-byte hardware sectors (320073 MB)
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:4:0:0: [sdh] Write Protect is off
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:4:0:0: [sdh] Mode Sense: 00 3a 00 00
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:4:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:0:0:0: [sde] 2930277168 512-byte hardware sectors (1500302 MB)
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:0:0:0: [sde] Write Protect is off
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:1:0:0: [sdf] 2930277168 512-byte hardware sectors (1500302 MB)
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:1:0:0: [sdf] Write Protect is off
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:1:0:0: [sdf] Mode Sense: 00 3a 00 00
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:1:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:3:0:0: [sdg] 2930277168 512-byte hardware sectors (1500302 MB)
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:3:0:0: [sdg] Write Protect is off
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:3:0:0: [sdg] Mode Sense: 00 3a 00 00
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:3:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:4:0:0: [sdh] 625142448 512-byte hardware sectors (320073 MB)
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:4:0:0: [sdh] Write Protect is off
Dec 12 13:03:44 Backup kernel: [56323.803119] sd 5:4:0:0: [sdh] Mode Sense: 00 3a 00 00
Dec 12 13:03:44 Backup kernel: [56323.807115] sd 5:4:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 12 13:03:44 Backup kernel: [56323.839100] end_request: I/O error, dev sde, sector 10
Dec 12 13:03:44 Backup kernel: [56323.839100] md: super_written gets error=-5, uptodate=0
Dec 12 13:03:44 Backup kernel: [56323.839100] raid5: Disk failure on sde, disabling device.
Dec 12 13:03:44 Backup kernel: [56323.839100] raid5: Operation continuing on 6 devices.
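
For a trace this long, a short filter can pull out just the link state reported after each reset and the drive that md disabled.  What follows is a minimal sketch, assuming the kernel log has been saved to a file with one entry per line; the function name summarize_ata_log and the log path in the comment are illustrative and not part of any existing tool.

```shell
#!/bin/sh
# Summarise post-reset SATA link reports and md drive failures.
# Usage: summarize_ata_log LOGFILE
# LOGFILE is an assumption, e.g. a copy of /var/log/kern.log.
summarize_ata_log() {
    awk '
        # Lines such as "ata6.04: SATA link up 1.5 Gbps (...)"
        /SATA link (up|down)/ {
            port = ""; state = ""; speed = ""
            for (i = 1; i <= NF; i++) {
                # The port token looks like "ata6.04:"; drop the colon.
                if ($i ~ /^ata[0-9.]+:$/) { port = $i; sub(/:$/, "", port) }
                if ($i == "link" && $(i - 1) == "SATA") {
                    state = $(i + 1)
                    # Only an "up" report carries a negotiated speed.
                    if (state == "up") speed = " " $(i + 2) " " $(i + 3)
                }
            }
            print "link " port " " state speed
        }
        # Lines such as "raid5: Disk failure on sde, disabling device."
        /Disk failure on/ {
            for (i = 1; i < NF; i++) {
                if ($i == "on") {
                    dev = $(i + 1)
                    sub(/,.*/, "", dev)
                    print "kicked " dev
                    break
                }
            }
        }
    ' "$1"
}
```

Each output line is either "link PORT STATE [SPEED]" or "kicked DEVICE", which makes it easier to compare several conviction events side by side.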

^ permalink raw reply	[flat|nested] 10+ messages in thread
* RE: Spurious HD convictions
@ 2009-12-13  3:44 lrhorer
  2009-12-14 20:06 ` Majed B.
  0 siblings, 1 reply; 10+ messages in thread
From: lrhorer @ 2009-12-13  3:44 UTC (permalink / raw)
  To: linux-raid

Hmm.  I don't see how it could be either the PS or the PMs, since the drives 
were moved to a new enclosure when the problem started happening, yet the 
problem persists.  The new chassis has all new PMs and of course a new PS, 
and the problem is happening across multiple PMs.  In addition, if NCQ is the 
problem, why has it just started happening?  This system has been up and 
running for the better part of a year.  Regardless, I have disabled NCQ by 
executing `echo 1 > /sys/block/sd[a-g]/device/queue_depth`, and I am 
attempting a repair action again.  We'll see how it goes.
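
One caveat about the command above: in bash, a redirection whose target glob matches more than one file is normally rejected as an ambiguous redirect, so writing the value to each drive in a loop is the safer form.  The following is a minimal sketch under that assumption; the function name set_queue_depth and the SYSFS_BLOCK override are hypothetical helpers for illustration, and the drive names must be adjusted to the actual array members.

```shell
#!/bin/sh
# Write a queue depth to each listed drive, one file at a time.
# A depth of 1 is the value the message above uses to disable NCQ.
# SYSFS_BLOCK defaults to /sys/block; overriding it (an assumption of
# this sketch) allows a dry run against a scratch directory.
set_queue_depth() {
    depth=$1
    shift
    for dev in "$@"; do
        f="${SYSFS_BLOCK:-/sys/block}/$dev/device/queue_depth"
        if [ -w "$f" ]; then
            echo "$depth" > "$f"
        else
            # Report drives that are missing or not writable (needs root).
            echo "skipping $dev: $f not writable" >&2
        fi
    done
}

# Example, run as root (drive names are placeholders):
#   set_queue_depth 1 sda sdb sdc sdd sde sdf sdg
```

Reading the queue_depth files back afterwards confirms which drives actually took the new value.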

> Hi Leslie,
> 
> According to some of the links here:
> http://www.google.com/search?hl=en&q=failed+to+read+SCR+1+(Emask%3D0x40)
> 
> It seems to be either the Power Supply Unit (PSU) or the Port Multiplier
> (PM).
> 
> A quick workaround seems to be disabling NCQ on all affected devices.
> 
> On Sun, Dec 13, 2009 at 5:02 AM, lrhorer@satx.rr.com
> <lrhorer@satx.rr.com> wrote:
> >
> >        What's happening here?  Suddenly, my backup server is suffering
> > apparently spurious hard drive convictions.  The server is running RAID5
> > on 7 disks under md.  It has been running well for months, but suddenly
> > it has started kicking drives from the array when under moderately heavy
> > read or write loads.  The thing is, it isn't convicting any particular
> > drive repeatedly, and the drives are not showing any errors under SMART.
> > This is a PM system, and I have tried changing the drive adapters,
> > changing the PMs, changing cables, moving the drives around, and moving
> > them out of the CPU enclosure to a new external chassis.  The convictions
> > are not occurring on any one channel, over any one particular PM, or over
> > any particular cable.  Since this started happening, I have been unable
> > to get all the way through a resync before the array dumps at least one
> > of the drives.  Here is a sample from the kernel log during one of the
> > convictions:

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2009-12-16  9:13 UTC | newest]

Thread overview: 10+ messages
2009-12-13  2:02 Spurious HD convictions lrhorer
2009-12-13  2:57 ` Majed B.
  -- strict thread matches above, loose matches on Subject: below --
2009-12-13  3:44 lrhorer
2009-12-14 20:06 ` Majed B.
     [not found]   ` <4b271970.5e44f10a.484f.ffffdd07SMTPIN_ADDED@mx.google.com>
2009-12-15  8:47     ` Majed B.
2009-12-16  5:40       ` Leslie Rhorer
2009-12-16  5:41   ` Leslie Rhorer
2009-12-16  9:13     ` Robin Hill
2009-12-13  2:07 lrhorer
2009-12-12 19:42 Leslie Rhorer
