Linux RAID subsystem development
* Raid 5 array down/missing - went through wiki steps
@ 2017-10-28 18:36 Jun-Kai Teoh
  2017-10-28 18:48 ` Mark Knecht
  2017-10-28 22:04 ` Anthony Youngman
  0 siblings, 2 replies; 16+ messages in thread
From: Jun-Kai Teoh @ 2017-10-28 18:36 UTC (permalink / raw)
  To: linux-raid

Hi all,

Hope this email is going to the right place.

I'll cut to the chase - I added a drive to my RAID 5 and was resyncing
when my machine was abruptly powered down. Upon booting it up again,
my RAID array is now missing.

I've followed the instructions that I've found on the wiki, and it
hasn't solved my issues, but it's given me a sense of the things that
I'm hoping can help you guys help me troubleshoot.

My array can't be assembled. It tells me that the superblock on
/dev/sda doesn't match the others.

/dev/sda thinks the array has 7 drives
/dev/sd[bcefghi] think the array has 8 drives

/dev/sda was not being reshaped
/dev/sd[bcefghi] have reshape position data in the raid.status file

Both /dev/sda and /dev/sdh think their device role is Active device 2

I can't bring /dev/md126 back up with sd[bcefghi] as it'll tell me
that there are 6 drives and 1 rebuilding, not enough to start the
array

My mdadm.conf shows a /dev/md127 with very minimal info in it - does
not look right to me.

I haven't zeroed the superblock, nor have I tried a clean-assemble
either. I saw the wiki say I should email the group if I've gotten
that far and I'm panicking and nothing's working. So...

Help me out, pretty please?


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 18:36 Raid 5 array down/missing - went through wiki steps Jun-Kai Teoh
@ 2017-10-28 18:48 ` Mark Knecht
  2017-10-28 19:03   ` Jun-Kai Teoh
  2017-10-28 22:04 ` Anthony Youngman
  1 sibling, 1 reply; 16+ messages in thread
From: Mark Knecht @ 2017-10-28 18:48 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Linux-RAID


I wonder if you know for certain that the RAID was scrubbed and clean
before you made the changes?

Maybe sda got kicked before you added the new drive?

The number one thing you'll want to supply is the 'Examine' results
for all drives:

mdadm -E /dev/sd[abcdefghi]

and

mdadm -D for the array, if it will run.

I would also run smartctl on the drives and post those results back.

I won't be able to solve your problems but those items are pretty much
required for others to get started in a meaningful way.

Good luck,
Mark


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 18:48 ` Mark Knecht
@ 2017-10-28 19:03   ` Jun-Kai Teoh
  2017-10-28 19:10     ` Jun-Kai Teoh
  0 siblings, 1 reply; 16+ messages in thread
From: Jun-Kai Teoh @ 2017-10-28 19:03 UTC (permalink / raw)
  To: Mark Knecht; +Cc: Linux-RAID

/dev/sda:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 7

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 23441323008 (22355.39 GiB 24003.91 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=688 sectors
          State : clean
    Device UUID : 9c14bcb8:be8310f5:5b50c3a7:e6e57423

Internal Bitmap : 8 sectors from superblock
    Update Time : Sun Jan 15 08:09:15 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 13425c44 - correct
         Events : 5667

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 15d85573:6e78f040:8c028ef3:d1301f9d

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Oct 27 19:46:43 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : e568d9d9 - correct
         Events : 650554

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 64d5961e:230e558c:3748b561:a7c6ab8c

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Oct 27 19:46:43 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 39c2485b - correct
         Events : 650554

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 2df0b319:bdb18eee:27b318ec:da55d53d

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Oct 27 19:46:43 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : d2a600e7 - correct
         Events : 650554

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 6
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : f1a790a9:98e01257:d9ab257d:95c8f1fc

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Oct 27 19:46:43 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 1b3052f2 - correct
         Events : 650554

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 6f44eba1:29d246a4:c5e8312e:bac00a7b

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Oct 27 19:46:43 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 8fe4160e - correct
         Events : 650554

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdh:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x47
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
Recovery Offset : 11264816 sectors
          State : active
    Device UUID : 109b7a2f:0529794c:2cf95cc1:d6c0bd6b

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Oct 27 19:46:43 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 4a2fb746 - correct
         Events : 650554

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdi:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
           Name : livingrm-server:2  (local to host livingrm-server)
  Creation Time : Thu Jun 30 07:57:36 2016
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
     Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
  Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
    Data Offset : 262144 sectors
     New Offset : 254976 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : deb04cea:d6530966:6c70ca90:bebb143e

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
  Delta Devices : 1 (7->8)

    Update Time : Fri Oct 27 19:46:43 2017
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : bfaa86d9 - correct
         Events : 650554

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 5
   Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)


And the output of mdadm -D:

/dev/md126:
        Version :
     Raid Level : raid0
  Total Devices : 0

          State : inactive

    Number   Major   Minor   RaidDevice
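
The differences between the superblocks above can be tabulated mechanically rather than by eye. What follows is a minimal Python sketch, not part of mdadm, assuming the `mdadm -E` text has been saved and keeps the `Field : value` layout shown above. The names `parse_examine` and `summarize` and the abbreviated `SAMPLE` text are hypothetical, introduced only for illustration. The sketch reports which members have an older event count, share a device role, or carry a recovery offset; it does not decide how the array should be assembled.

```python
import re
from collections import defaultdict

# Abbreviated excerpt of the `mdadm -E` output posted above (three members only).
SAMPLE = """\
/dev/sda:
   Raid Devices : 7
    Update Time : Sun Jan 15 08:09:15 2017
         Events : 5667
   Device Role : Active device 2
/dev/sdb:
   Raid Devices : 8
    Update Time : Fri Oct 27 19:46:43 2017
         Events : 650554
   Device Role : Active device 3
/dev/sdh:
   Raid Devices : 8
Recovery Offset : 11264816 sectors
    Update Time : Fri Oct 27 19:46:43 2017
         Events : 650554
   Device Role : Active device 2
"""


def parse_examine(text):
    """Split `mdadm -E` style text into {device: {field: value}}."""
    devices = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            current = devices.setdefault(header.group(1), {})
        elif current is not None and " : " in line:
            # Split on the first " : " only, so times like 08:09:15 stay intact.
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def summarize(devices):
    """Report stale members, shared device roles, and members mid-recovery."""
    newest = max(int(d["Events"]) for d in devices.values())
    stale = sorted(n for n, d in devices.items() if int(d["Events"]) < newest)
    by_role = defaultdict(list)
    for name, d in devices.items():
        by_role[d["Device Role"]].append(name)
    shared = {role: sorted(names) for role, names in by_role.items() if len(names) > 1}
    recovering = sorted(n for n, d in devices.items() if "Recovery Offset" in d)
    return {
        "newest_events": newest,
        "stale": stale,
        "shared_roles": shared,
        "recovering": recovering,
    }


if __name__ == "__main__":
    for key, value in summarize(parse_examine(SAMPLE)).items():
        print(key, "=", value)
```

Fed the values posted above, the sketch flags /dev/sda as the only member with an older event count (5667 against 650554), lists /dev/sda and /dev/sdh as both claiming "Active device 2", and shows /dev/sdh as the only member with a Recovery Offset.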



* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 19:03   ` Jun-Kai Teoh
@ 2017-10-28 19:10     ` Jun-Kai Teoh
  2017-10-28 19:31       ` Mark Knecht
  0 siblings, 1 reply; 16+ messages in thread
From: Jun-Kai Teoh @ 2017-10-28 19:10 UTC (permalink / raw)
  To: Mark Knecht; +Cc: Linux-RAID

The array did have data in it when I added the new drive; it wasn't
a blank/empty array. I was trying to expand its size when my
computer went down. I'm really hoping it can be rescued. =/



* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 19:10     ` Jun-Kai Teoh
@ 2017-10-28 19:31       ` Mark Knecht
  2017-10-28 19:42         ` Jun-Kai Teoh
  0 siblings, 1 reply; 16+ messages in thread
From: Mark Knecht @ 2017-10-28 19:31 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Linux-RAID

On Sat, Oct 28, 2017 at 12:10 PM, Jun-Kai Teoh <kai.teoh@gmail.com> wrote:
> The array did have data in it when I added a new drive to it, it
> wasn't a blank/empty array. I was trying to expand its size when my
> computer went down. I'm really hoping it can be rescued. =/
>
<SNIP>
I understand that the array had data. However, RAID5 can suffer 1 drive
being out of the array and still appear to be working.
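
The reason a RAID 5 array can keep running with one member missing is that each stripe carries an XOR parity chunk, so any single missing chunk can be recomputed from the rest. The following is a toy Python sketch of that idea only. The chunk contents, the helper name `xor_blocks`, and the choice of which member is "missing" are made up for illustration, and real md stripes also rotate parity across members.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Six hypothetical data chunks, as in one stripe of a 7-member RAID 5
# (6 data chunks + 1 parity chunk).
data = [bytes([i] * 4) for i in range(1, 7)]
parity = xor_blocks(data)

# Pretend the member holding chunk index 2 has dropped out of the array.
missing = 2
survivors = [chunk for i, chunk in enumerate(data) if i != missing]

# XOR of the surviving data chunks and the parity chunk yields the lost chunk.
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == data[missing]
```

This is why a degraded array still serves reads and writes normally, and why a second missing member leaves nothing to reconstruct from.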

1) Prior to adding the drive did you look at

cat /proc/mdstat

and note status of the original 7 drives?

2) Do you scrub your RAID routinely, as in a crontab entry or some
other method?

3) What kind of drives are these? smartctl output will help, along these lines:

c2RAID6 ~ # smartctl -a /dev/sda
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.12-gentoo] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD30EFRX-68EUZN0

Provide all smartctl output for all 8 drives.
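
For question 1, the member counts in /proc/mdstat can also be read by a script, which makes it easy to check before any grow operation. Below is a minimal Python sketch under stated assumptions: the `SAMPLE_MDSTAT` text is invented for illustration (it is not from the array in this thread), and `member_status` is a hypothetical helper. It relies only on the `[n/m] [UU_]` field, where n is the number of devices the array is defined with and m is the number currently in use.

```python
import re

# Invented example text in /proc/mdstat format; NOT taken from this thread.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sdc1[1] sdb1[0]
      7813772288 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

md1 : active raid1 sde1[1] sdd1[0]
      1048512 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""


def member_status(mdstat_text):
    """Return {array: {expected, working, map, degraded}} from mdstat text."""
    arrays = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S+) : ", line)
        if header:
            current = header.group(1)
            continue
        counts = re.search(r"\[(\d+)/(\d+)\] \[([U_]+)\]", line)
        if current and counts:
            expected, working = int(counts.group(1)), int(counts.group(2))
            arrays[current] = {
                "expected": expected,
                "working": working,
                "map": counts.group(3),
                "degraded": working < expected,
            }
            current = None
    return arrays


if __name__ == "__main__":
    for name, info in member_status(SAMPLE_MDSTAT).items():
        print(name, info)
```

On a live system the text would come from reading /proc/mdstat instead of the sample string. An array reported as degraded is already running without redundancy.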

Just giving guidance to get you set up for a faster response when folks
come back after the weekend. I am not a RAID expert and won't
supply commands to get it working. At this time do NOT do anything
to force the array back together. People often make things worse by
trying that before getting input. Better that you wait.

Good luck,
Mark


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 19:31       ` Mark Knecht
@ 2017-10-28 19:42         ` Jun-Kai Teoh
  2017-10-28 21:15           ` Mark Knecht
  0 siblings, 1 reply; 16+ messages in thread
From: Jun-Kai Teoh @ 2017-10-28 19:42 UTC (permalink / raw)
  To: Mark Knecht; +Cc: Linux-RAID

Gotcha, apologies for misunderstanding the question.

1. I faintly remember looking at the status and they should have all
been fine, but I'm not 100% sure.

2. I did not scrub my RAID - don't think I've done it before at all.

3. Thanks for the reminder and providing me with the command, I wasn't
entirely sure which smartctl command to run.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E6CUCC3R
LU WWN Device Id: 5 0014ee 2b69e1ff1
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 28 12:34:44 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (50580) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 506) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x703d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   223   182   021    Pre-fail  Always       -       5850
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17743
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       212
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       121
193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       25990
194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   199   129   000    Old_age   Always       -       421
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4K2JPZT
LU WWN Device Id: 5 0014ee 2b510b64f
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 28 12:35:00 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (52320) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 523) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   222   179   021    Pre-fail  Always       -       5891
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2624
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       26647
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       310
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       184
193 Load_Cycle_Count        0x0032   188   188   000    Old_age   Always       -       38893
194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   192   000    Old_age   Always       -       14
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4VTX9TP
LU WWN Device Id: 5 0014ee 261485dc4
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 28 12:35:03 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (54780) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 548) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   222   181   021    Pre-fail  Always       -       5866
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       210
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17748
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       210
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       121
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       33283
194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E2VKP3SV
LU WWN Device Id: 5 0014ee 2620754c8
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 28 12:35:06 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (51120) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 512) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   223   183   021    Pre-fail  Always       -       5825
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       180
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   083   083   000    Old_age   Always       -       12934
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       180
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       112
193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       26411
194 Temperature_Celsius     0x0022   120   101   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4JNHEC6
LU WWN Device Id: 5 0014ee 2614842c3
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 28 12:35:10 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (53940) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 539) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   222   180   021    Pre-fail  Always       -       5875
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17748
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       212
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       127
193 Load_Cycle_Count        0x0032   190   190   000    Old_age   Always       -       32835
194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E2ZYE8AN
LU WWN Device Id: 5 0014ee 20bf26ce9
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 28 12:35:13 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (54960) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 550) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   219   178   021    Pre-fail  Always       -       6016
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17751
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       212
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       120
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       33470
194 Temperature_Celsius     0x0022   120   098   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68N32N0
Serial Number:    WD-WCC7K7PZ7R6Z
LU WWN Device Id: 5 0014ee 26410129e
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 28 12:35:16 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (46440) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 492) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x303d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       11
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       1
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       3
194 Temperature_Celsius     0x0022   121   117   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4JNHKL9
LU WWN Device Id: 5 0014ee 2b69e186b
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 28 12:35:19 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (55260) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 552) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   220   180   021    Pre-fail  Always       -       5958
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       673
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       18933
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       261
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       162
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       35904
194 Temperature_Celsius     0x0022   119   091   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

On Sat, Oct 28, 2017 at 12:31 PM, Mark Knecht <markknecht@gmail.com> wrote:
> On Sat, Oct 28, 2017 at 12:10 PM, Jun-Kai Teoh <kai.teoh@gmail.com> wrote:
>> The array did have data in it when I added a new drive to it, it
>> wasn't a blank/empty array. I was trying to expand its size when my
>> computer went down. I'm really hoping it can be rescued. =/
>>
> <SNIP>
> I understand that the array had data. However, RAID5 can suffer 1 drive
> being out of the array and still appear to be working.
>
> 1) Prior to adding the drive did you look at
>
> cat /proc/mdstat
>
> and note status of the original 7 drives?
>
> 2) Do you scrub your RAID routinely, as in a crontab entry or some
> other method?
>
> 3) What kind of drives are these. smartctl output will help, ala:
>
> c2RAID6 ~ # smartctl -a /dev/sda
> smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.12-gentoo] (local build)
> Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
>
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD30EFRX-68EUZN0
>
> Provide all smartctl output for all 8 drives
>
> Just giving guidance to get you setup for a faster response when folks
> come back after the weekend. I am not a RAID expert and won't
> supply commands to get it working. At this time do NOT do anything
> to force the array back together. More people seem to cause problems
> trying to do that before getting input. Better that you wait.
>
> Good luck,
> Mark


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 19:42         ` Jun-Kai Teoh
@ 2017-10-28 21:15           ` Mark Knecht
  0 siblings, 0 replies; 16+ messages in thread
From: Mark Knecht @ 2017-10-28 21:15 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Linux-RAID

On Sat, Oct 28, 2017 at 12:42 PM, Jun-Kai Teoh <kai.teoh@gmail.com> wrote:
> Gotcha, apologies for misunderstanding the question.
>
> 1. I faintly remember looking at the status and they should have all
> been fine, but I'm not 100% sure.
>
> 2. I did not scrub my RAID - don't think I've done it before at all.
>
> 3. Thanks for the reminder and providing me with the command, I wasn't
> entirely sure which smartctl command to run.
>
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>
<SNIP>
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (46440) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 492) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x303d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
>   3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
>   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3
>   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       11
>  10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
>  11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       1
> 193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       3
> 194 Temperature_Celsius     0x0022   121   117   000    Old_age   Always       -       29
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
>
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
>
> SMART Selective self-test log data structure revision number 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1        0        0  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Not_testing
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
>
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
<SNIP>

OK, I cannot tell (and it doesn't matter right now to me) what order
those outputs are in compared to /dev/sd values. The one I didn't snip
out appears to be the new one with 11 hours of power on time. I note
that one of the older drives had a different firmware version. Again,
very unlikely to be a problem. WD Reds are good drives.

Oh, I see you haven't posted (or I'm not finding it) what distro,
kernel and mdadm version you are using. Folks are certainly going to
ask for that.

At this point I'd personally sit tight and wait for someone capable of
giving guidance. I sort of doubt that you've lost data (or at least
not too much data), but who knows? There have been a couple of threads
recently (< 1 month) about people doing reshapes, and maybe new adds,
where the user had problems. There were instructions given for how to
go back in at least one case. You might read through those threads to
just get a feel for it but wait for real advice. Don't force anything.

Good luck,
Mark


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 18:36 Raid 5 array down/missing - went through wiki steps Jun-Kai Teoh
  2017-10-28 18:48 ` Mark Knecht
@ 2017-10-28 22:04 ` Anthony Youngman
  2017-10-28 22:15   ` Jun-Kai Teoh
  1 sibling, 1 reply; 16+ messages in thread
From: Anthony Youngman @ 2017-10-28 22:04 UTC (permalink / raw)
  To: mdraid

On 28/10/17 19:36, Jun-Kai Teoh wrote:
> Hi all,
> 
> Hope this email is going to the right place.
> 
> I'll cut to the chase - I added a drive to my RAID 5 and was resyncing
> when my machine was abruptly powered down. Upon booting it up again,
> my RAID array is now missing.
>

I've seen Mark's replies, so ...

> I've followed the instructions that I've found on the wiki, and it
> hasn't solved my issues, but it's given me a sense of the things that
> I'm hoping can help you guys help me troubleshoot.
> 
Found where? Did you look at the front page? Did you look at "When 
things go wrogn"?

> My array can't be assembled. It tells me that the superblock on
> /dev/sda doesn't match the others.
> 
> /dev/sda thinks the array has 7 drives
> /dev/sd[bcefghi] thinks the array has 8 drives
> 
The event count tells me sda was kicked out of the array a LONG time ago 
- you were running a degraded array, sorry.
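
A rough sketch of how that comparison could be scripted. It reads
`mdadm --examine` output on stdin and prints each device next to its event
count. The helper name `events_by_device` is made up for illustration, and
the two sample figures are the ones posted in this thread for sda and sdb.

```shell
#!/bin/sh
# Print "<device> <events>" for every device section found in
# `mdadm --examine` output read from stdin.
# events_by_device is a hypothetical helper name, not part of mdadm.
events_by_device() {
    awk '
        /^\/dev\//                  { dev = $1; sub(/:$/, "", dev) }  # section header such as "/dev/sda:"
        $1 == "Events" && $2 == ":" { print dev, $3 }                 # line such as "Events : 5667"
    '
}

# Sample input using the figures from this thread. On a live system the
# input would come from something like: mdadm --examine /dev/sd[a-i]
events_by_device <<'EOF'
/dev/sda:
         Events : 5667
/dev/sdb:
         Events : 650554
EOF
```

A member whose count is far behind the others (here 5667 against 650554)
stopped receiving updates long before the rest.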

> /dev/sda was not being reshaped
> /dev/sd[bcefghi] has reshape position data in the raid.status file
> 
> both /dev/sda and /dev/sdh think their device role is Active device 2
> 
> I can't bring /dev/md126 back up with sd[bcefghi] as it'll tell me
> that there are 6 drives and 1 rebuilding, not enough to start the
> array
> 
> My mdadm.conf shows a /dev/dev/127 with very minimal info in it - does
> not look right to me.
> 
> I haven't zeroed the superblock, nor have I tried a clean-assemble
> either. I saw the wiki say I should email the group if I've gotten
> that far and I'm panicking and nothing's working. So...
> 
> Help me out, pretty please?

Okay, I *think* you're going to be okay. The powerfail brought the 
machine down, and because the array was degraded, it wouldn't 
re-assemble. Like Mark, I'd wait for the experts to get on the case on 
Monday, but what I think they will advise is

One - mdadm --assemble --force with sd[bcefghi] - note do NOT include the
failed drive sda. This will fire off the reshape again. BUT. On a degraded
array you have no redundancy!!!

Two - ADD ANOTHER DRIVE TO REPLACE SDA !!!

I don't know how to read the smartctl statistics (and I don't know which 
one is sda!), but if I were you I would fire off a self-test on sda to 
find out whether it's bad or not. It may have been kicked out by a 
harmless glitch, or it may be ready to fail permanently. But be prepared 
to shell out for a replacement. In fact, I'd go out and get another 
drive right now. If sda turns out to be okay, you can go to a 9-drive 
raid-6.
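
For reference, a dry-run sketch of what the two steps might look like as
commands. It only prints the commands unless DRY_RUN=0 is set, and it is
not a recommendation to run anything before the list has confirmed the
approach. The device names (md126, sd[bcefghi], sda) are the ones reported
in this thread; the `run` wrapper is a made-up helper, and whether further
options are needed to resume the interrupted reshape is not covered here.

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of executing it,
# unless the environment sets DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical wrapper that echoes or executes CMD.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# The seven members that agree with each other; sda is deliberately left out.
members="/dev/sdb /dev/sdc /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi"

run mdadm --stop /dev/md126                        # release any half-assembled array first
run mdadm --assemble --force /dev/md126 $members   # $members is unquoted on purpose so it splits
run smartctl -t long /dev/sda                      # start an extended self-test on the dropped drive
run smartctl -l selftest /dev/sda                  # read the self-test log once the test has finished
```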

To cut a long story short, I think you've been running with a degraded 
array for a long time. You should be able to force-assemble it no 
problem but you need to fix it asap. And then you should go raid-6 to 
give you a bit extra safety and set up scrubbing! Again, I'll let the 
experts confirm, but I think going from 8-drives-degraded to 
9-drive-raid-6 in one step is probably better than recovering your raid 
5 and then adding another drive to go raid 6.
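
On the scrubbing point, a minimal sketch assuming the md sysfs interface:
writing "check" to an array's sync_action file asks the kernel to read and
verify every stripe. The helper name `start_scrub` and its optional second
argument are made up for illustration, and this is something to set up
once the array is healthy again, not during recovery.

```shell
#!/bin/sh
# start_scrub MD [SYSFS_ROOT]
# Request a read-check of array MD (for example md126) by writing "check"
# to its sync_action file. SYSFS_ROOT defaults to /sys/block; it is a
# parameter only so the function can be tried against a scratch directory.
# start_scrub is a hypothetical helper name.
start_scrub() {
    md=$1
    sysfs=${2:-/sys/block}
    echo check > "$sysfs/$md/md/sync_action"
}

# Typical use as root, for instance from a monthly cron job:
#   start_scrub md126
# Progress then appears in /proc/mdstat, and the number of mismatches found
# is reported in /sys/block/md126/md/mismatch_cnt.
```

Some distributions ship their own scheduled check job, so it is worth
looking for an existing one before adding another.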

Just wait for the experts to confirm this and then I think you'll be 
okay. On the good side, you do have proper raid drives - WD Reds :-)

Cheers,
Wol


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 22:04 ` Anthony Youngman
@ 2017-10-28 22:15   ` Jun-Kai Teoh
  2017-10-28 23:41     ` Mark Knecht
  2017-10-29  0:18     ` Wols Lists
  0 siblings, 2 replies; 16+ messages in thread
From: Jun-Kai Teoh @ 2017-10-28 22:15 UTC (permalink / raw)
  To: Anthony Youngman; +Cc: mdraid

Thanks for the response Anthony & Mark, really appreciate how helpful
both of you are.

I did try to reassemble last night (before I found the wiki and all)
and it would assemble, but then it would say it can't bring the array up
with 6 drives and 1 rebuilding, and the array thinks that there should be
8 drives. Does that mean I'm... screwed?

mdadm version (mdadm --version)
mdadm - v3.3 - 3rd September 2013

kernel (uname -mrsn)
Linux livingrm-server 4.4.0-97-generic x86_64

distro
Ubuntu 16.04 LTS

On Sat, Oct 28, 2017 at 3:04 PM, Anthony Youngman
<antlists@youngman.org.uk> wrote:
> On 28/10/17 19:36, Jun-Kai Teoh wrote:
>>
>> Hi all,
>>
>> Hope this email is going to the right place.
>>
>> I'll cut to the chase - I added a drive to my RAID 5 and was resyncing
>> when my machine was abruptly powered down. Upon booting it up again,
>> my RAID array is now missing.
>>
>
> I've seen Mark's replies, so ...
>
>> I've followed the instructions that I've found on the wiki, and it
>> hasn't solved my issues, but it's given me a sense of the things that
>> I'm hoping can help you guys help me troubleshoot.
>>
> Found where? Did you look at the front page? Did you look at "When things go
> wrogn"?
>
>> My array can't be assembled. It tells me that the superblock on
>> /dev/sda doesn't match the others.
>>
>> /dev/sda thinks the array has 7 drives
>> /dev/sd[bcefghi] thinks the array has 8 drives
>>
> The event count tells me sda was kicked out of the array a LONG time ago -
> you were running a degraded array, sorry.
>
>> /dev/sda was not being reshaped
>> /dev/sd[bcefghi] has reshape position data in the raid.status file
>>
>> both /dev/sda and /dev/sdh think their device role is Active device 2
>>
>> I can't bring /dev/md126 back up with sd[bcefghi] as it'll tell me
>> that there are 6 drives and 1 rebuilding, not enough to start the
>> array
>>
>> My mdadm.conf shows a /dev/dev/127 with very minimal info in it - does
>> not look right to me.
>>
>> I haven't zeroed the superblock, nor have I tried a clean-assemble
>> either. I saw the wiki say I should email the group if I've gotten
>> that far and I'm panicking and nothing's working. So...
>>
>> Help me out, pretty please?
>
>
> Okay, I *think* you're going to be okay. The powerfail brought the machine
> down, and because the array was degraded, it wouldn't re-assemble. Like
> Mark, I'd wait for the experts to get on the case on Monday, but what I
> think they will advise is
>
> One - --assemble --force [bcdefghi] - note do NOT include the failed drive
> a. This will fire off the reshape again. BUT. On a degraded array you have
> no redundancy!!!
>
> Two - ADD ANOTHER DRIVE TO REPLACE SDA !!!
>
> I don't know how to read the smartctl statistics (and I don't know which one
> is sda!), but if I were you I would fire off a self-test on sda to find out
> whether it's bad or not. It may have been kicked out by a harmless glitch,
> or it may be ready to fail permanently. But be prepared to shell out for a
> replacement. In fact, I'd go out and get another drive right now. If sda
> turns out to be okay, you can go to a 9-drive raid-6.
>
> To cut a long story short, I think you've been running with a degraded array
> for a long time. You should be able to force-assemble it no problem but you
> need to fix it asap. And then you should go raid-6 to give you a bit extra
> safety and set up scrubbing! Again, I'll let the experts confirm, but I
> think going from 8-drives-degraded to 9-drive-raid-6 in one step is probably
> better than recovering your raid 5 and then adding another drive to go raid
> 6.
>
> Just wait for the experts to confirm this and then I think you'll be okay.
> On the good side, you do have proper raid drives - WD Reds :-)
>
> Cheers,
> Wol
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 22:15   ` Jun-Kai Teoh
@ 2017-10-28 23:41     ` Mark Knecht
  2017-10-29  0:18     ` Wols Lists
  1 sibling, 0 replies; 16+ messages in thread
From: Mark Knecht @ 2017-10-28 23:41 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Anthony Youngman, mdraid

On Sat, Oct 28, 2017 at 3:15 PM, Jun-Kai Teoh <kai.teoh@gmail.com> wrote:
> Thanks for the response Anthony & Mark, really appreciate how helpful
> both of you are.
>
> I did try to reassemble last night (before I found the wiki and all)
> and it would assemble, but then it'll say it can't bring the array up
> 6 drives with 1 rebuilding, and the array thinks that there should be
> 8 drives. Does that mean I'm... screwed?
>
> mdadm version (mdadm --version)
> mdadm - v3.3 - 3rd September 2013
>
> kernel (uname -mrsn)
> Linux livingrm-server 4.4.0-97-generic x86_64
>
> distro
> Ubuntu 16.04 LTS
>
> On Sat, Oct 28, 2017 at 3:04 PM, Anthony Youngman
> <antlists@youngman.org.uk> wrote:
>> On 28/10/17 19:36, Jun-Kai Teoh wrote:
>>>
>>> Hi all,
>>>
>>> Hope this email is going to the right place.
>>>
>>> I'll cut to the chase - I added a drive to my RAID 5 and was resyncing
>>> when my machine was abruptly powered down. Upon booting it up again,
>>> my RAID array is now missing.
>>>
>>
>> I've seen Mark's replies, so ...
>>
>>> I've followed the instructions that I've found on the wiki, and it
>>> hasn't solved my issues, but it's given me a sense of the things that
>>> I'm hoping can help you guys help me troubleshoot.
>>>
>> Found where? Did you look at the front page? Did you look at "When things go
>> wrogn"?
>>
>>> My array can't be assembled. It tells me that the superblock on
>>> /dev/sda doesn't match the others.
>>>
>>> /dev/sda thinks the array has 7 drives
>>> /dev/sd[bcefghi] thinks the array has 8 drives
>>>
>> The event count tells me sda was kicked out of the array a LONG time ago -
>> you were running a degraded array, sorry.
>>
>>> /dev/sda was not being reshaped
>>> /dev/sd[bcefghi] has reshape position data in the raid.status file
>>>
>>> both /dev/sda and /dev/sdh think their device role is Active device 2
>>>
>>> I can't bring /dev/md126 back up with sd[bcefghi] as it'll tell me
>>> that there are 6 drives and 1 rebuilding, not enough to start the
>>> array
>>>
>>> My mdadm.conf shows a /dev/dev/127 with very minimal info in it - does
>>> not look right to me.
>>>
>>> I haven't zeroed the superblock, nor have I tried a clean-assemble
>>> either. I saw the wiki say I should email the group if I've gotten
>>> that far and I'm panicking and nothing's working. So...
>>>
>>> Help me out, pretty please?
>>
>>
>> Okay, I *think* you're going to be okay. The powerfail brought the machine
>> down, and because the array was degraded, it wouldn't re-assemble. Like
>> Mark, I'd wait for the experts to get on the case on Monday, but what I
>> think they will advise is
>>
>> One - --assemble --force [bcdefghi] - note do NOT include the failed drive
>> a. This will fire off the reshape again. BUT. On a degraded array you have
>> no redundancy!!!
>>
>> Two - ADD ANOTHER DRIVE TO REPLACE SDA !!!
>>
>> I don't know how to read the smartctl statistics (and I don't know which one
>> is sda!), but if I were you I would fire off a self-test on sda to find out
>> whether it's bad or not. It may have been kicked out by a harmless glitch,
>> or it may be ready to fail permanently. But be prepared to shell out for a
>> replacement. In fact, I'd go out and get another drive right now. If sda
>> turns out to be okay, you can go to a 9-drive raid-6.
>>
>> To cut a long story short, I think you've been running with a degraded array
>> for a long time. You should be able to force-assemble it no problem but you
>> need to fix it asap. And then you should go raid-6 to give you a bit extra
>> safety and set up scrubbing! Again, I'll let the experts confirm, but I
>> think going from 8-drives-degraded to 9-drive-raid-6 in one step is probably
>> better than recovering your raid 5 and then adding another drive to go raid
>> 6.
>>
>> Just wait for the experts to confirm this and then I think you'll be okay.
>> On the good side, you do have proper raid drives - WD Reds :-)
>>
>> Cheers,
>> Wol

I think Anthony is completely correct. sda dropped out a long time ago. The
event count is very low. Had you been scrubbing your drives on a regular
basis you would likely have discovered this, but had you even looked

cat /proc/mdstat

it would have shown that sda wasn't in the array.

Without sda you would have had (I think) no redundancy in a RAID5. That
said, the array would have continued to work, which it apparently did
because you didn't notice any problems.
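
As a rough illustration of what to look for in that file: each array's
status line carries a "[total/active]" pair, and a degraded array shows
fewer active members than total. The sketch below reads mdstat-formatted
text on stdin; the helper name `degraded_arrays` is made up, and the sample
lines are illustrative rather than taken from the array in this thread.

```shell
#!/bin/sh
# List md arrays whose status shows fewer active members than expected,
# for example "[4/3] [UUU_]". Reads /proc/mdstat-style text from stdin.
# degraded_arrays is a hypothetical helper name.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                     # remember which array we are in
        match($0, /\[[0-9]+\/[0-9]+\]/) {               # the "[total/active]" field
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name, "degraded:", n[2] "/" n[1]
        }
    '
}

# Illustrative sample; on a live system use: degraded_arrays < /proc/mdstat
degraded_arrays <<'EOF'
md1 : active raid10 sda2[4] sdc2[5] sdb2[3] sdd2[0]
       30716928 blocks super 1.1 512K chunks 2 near-copies [4/4] [UUUU]
md0 : active raid1 sda1[4] sdb1[3](F) sdd1[0] sdc1[5]
       511988 blocks super 1.0 [4/3] [UUU_]
EOF
```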

NOTE: sda might not be bad - don't throw it away. You might have had
a weird power event or on some reboot it started up late and didn't get
included. From there on it's out of sync and until you take action I don't
think it would ever get added back in automatically, but that doesn't mean
the drive is actually bad. (When bad things happen to good people...) ;-)

The trick now is to find the right set of commands to assemble without
losing data. Anthony is more experienced than me and I have no reason
to distrust his suggestions. Whether a 9-drive RAID6 would fit your enclosure
is another issue. As this array seemed to say 'Living Room' in one form or
another there are issues I'm not clear about. Power consumption, noise
etc.

- Mark


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-28 22:15   ` Jun-Kai Teoh
  2017-10-28 23:41     ` Mark Knecht
@ 2017-10-29  0:18     ` Wols Lists
  2017-10-30 18:08       ` Jun-Kai Teoh
  1 sibling, 1 reply; 16+ messages in thread
From: Wols Lists @ 2017-10-29  0:18 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: mdraid

On 28/10/17 23:15, Jun-Kai Teoh wrote:
> Thanks for the response Anthony & Mark, really appreciate how helpful
> both of you are.
> 
> I did try to reassemble last night (before I found the wiki and all)
> and it would assemble, but then it'll say it can't bring the array up
> 6 drives with 1 rebuilding, and the array thinks that there should be
> 8 drives. Does that mean I'm... screwed?

Nope! Definitely not.

Six drives, one rebuilding, array of 8.

That means you have seven *working* drives out of eight. That's enough.
Get the array working, it will continue rebuilding, and will leave you
with seven drives out of eight. In other words, everything is CURRENTLY
okay. But YOU HAVE NO REDUNDANCY. One more problem, and yes you are
screwed. Which is why I say "wait for the experts". But, hardware
permitting, I think this is an easy situation to recover.
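
The counting above, written out as a small sketch. RAID5 keeps running
with at most one member missing, so an array of N devices needs at least
N-1 usable ones. The helper name `raid5_can_start` is made up, and treating
the rebuilding drive as usable follows the reasoning in this message rather
than anything mdadm itself reports.

```shell
#!/bin/sh
# raid5_can_start TOTAL PRESENT
# Prints "yes" when a RAID5 array of TOTAL devices has enough usable
# members (PRESENT) to run, that is at most one missing, else "no".
# Hypothetical helper, for illustration only.
raid5_can_start() {
    total=$1
    present=$2
    if [ "$present" -ge $((total - 1)) ]; then
        echo yes
    else
        echo no
    fi
}

# The case in this thread: 6 in-sync drives plus 1 rebuilding = 7 of 8.
raid5_can_start 8 $((6 + 1))   # prints "yes", but with no redundancy left
raid5_can_start 8 6            # prints "no": two members missing
```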

Cheers,
Wol


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-29  0:18     ` Wols Lists
@ 2017-10-30 18:08       ` Jun-Kai Teoh
  2017-10-30 18:15         ` Reindl Harald
  2017-10-30 18:18         ` Mark Knecht
  0 siblings, 2 replies; 16+ messages in thread
From: Jun-Kai Teoh @ 2017-10-30 18:08 UTC (permalink / raw)
  To: Wols Lists; +Cc: mdraid

Thanks for all the responses, Mark, Anthony and Wol.

I have another hard drive on the way, just in case sda is truly dead.
I had no idea that when a drive "dies" in a RAID5 array - nothing
would notify me. I obviously have much to learn, really appreciate all
the input from folks so far.

I think I've provided the details of my setup (kernel, mdadm ver,
distro, smartctl output, and mdadm -E output) - do let me know if I've
left anything out. Left the machine alone this weekend and did not do
anything with it.

Kai Teoh


On Oct 28, 2017, at 5:18 PM, Wols Lists <antlists@youngman.org.uk> wrote:

> On 28/10/17 23:15, Jun-Kai Teoh wrote:
>> Thanks for the response Anthony & Mark, really appreciate how helpful
>> both of you are.
>>
>> I did try to reassemble last night (before I found the wiki and all)
>> and it would assemble, but then it'll say it can't bring the array up
>> 6 drives with 1 rebuilding, and the array thinks that there should be
>> 8 drives. Does that mean I'm... screwed?
>
> Nope! Definitely not.
>
> Six drives, one rebuilding, array of 8.
>
> That means you have seven *working* drives out of eight. That's enough.
> Get the array working, it will continue rebuilding, and will leave you
> with seven drives out of eight. In other words, everything is CURRENTLY
> okay. But YOU HAVE NO REDUNDANCY. One more problem, and yes you are
> screwed. Which is why I say "wait for the experts". But, hardware
> permitting, I think this is an easy situation to recover.
>
> Cheers,
> Wol


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-30 18:08       ` Jun-Kai Teoh
@ 2017-10-30 18:15         ` Reindl Harald
  2017-10-30 18:18         ` Mark Knecht
  1 sibling, 0 replies; 16+ messages in thread
From: Reindl Harald @ 2017-10-30 18:15 UTC (permalink / raw)
  To: Jun-Kai Teoh, Wols Lists; +Cc: mdraid



On 30.10.2017 at 19:08, Jun-Kai Teoh wrote:
> Thanks for all the responses, Mark, Anthony and Wol.
> 
> I have another hard drive on the way, just in case sda is truly dead.
> I had no idea that when a drive "dies" in a RAID5 array - nothing
> would notify me

On a properly set up distribution you have mdadm running in monitor mode,
which sends a mail when a drive fails - see below. The problem is that you
never tested what happens when you pull a disk before placing data on the
array. Had you done so, you would have found out how to make sure you get
notified.

-------- Forwarded Message --------
Subject: Fail event on /dev/md0:srv-rhsoft.rhsoft.net
Date: Wed,  3 Dec 2014 11:32:35 +0100 (CET)
From: mdadm monitoring <root@srv-rhsoft.rhsoft.net>
To: root@srv-rhsoft.rhsoft.net

This is an automatically generated mail message from mdadm
running on srv-rhsoft.rhsoft.net

A Fail event had been detected on md device /dev/md0.

It could be related to component device /dev/sdb1.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1] [raid10]
md2 : active raid10 sda3[4] sdc3[5] sdd3[0] sdb3[3]
       3875222528 blocks super 1.1 512K chunks 2 near-copies [4/4] [UUUU]
       bitmap: 1/29 pages [4KB], 65536KB chunk

md1 : active raid10 sda2[4] sdc2[5] sdb2[3] sdd2[0]
       30716928 blocks super 1.1 512K chunks 2 near-copies [4/4] [UUUU]
       bitmap: 1/1 pages [4KB], 65536KB chunk

md0 : active raid1 sda1[4] sdb1[3](F) sdd1[0] sdc1[5]
       511988 blocks super 1.0 [4/3] [UUU_]

unused devices: <none>
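
A small sketch for checking whether such mails have anywhere to go: mdadm's
monitor mode takes its recipient from a MAILADDR line in mdadm.conf. The
helper name `has_mailaddr` is made up, the default path is an assumption
(some distributions use /etc/mdadm.conf instead), and the match is kept
deliberately simple.

```shell
#!/bin/sh
# has_mailaddr FILE
# Succeeds if FILE (an mdadm.conf) contains a MAILADDR line with an address.
# has_mailaddr is a hypothetical helper name.
has_mailaddr() {
    grep -Eq '^[[:space:]]*MAILADDR[[:space:]]+[^[:space:]]+' "$1"
}

# The default path below is an assumption; pass another path as the first argument.
conf=${1:-/etc/mdadm/mdadm.conf}
if [ -r "$conf" ] && has_mailaddr "$conf"; then
    echo "alerts go to: $(awk '$1 == "MAILADDR" { print $2; exit }' "$conf")"
else
    echo "no MAILADDR found in $conf"
fi

# To check delivery end to end, mdadm can send a test alert for every array:
#   mdadm --monitor --scan --oneshot --test
```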


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-30 18:08       ` Jun-Kai Teoh
  2017-10-30 18:15         ` Reindl Harald
@ 2017-10-30 18:18         ` Mark Knecht
  2017-11-06 17:10           ` Jun-Kai Teoh
  1 sibling, 1 reply; 16+ messages in thread
From: Mark Knecht @ 2017-10-30 18:18 UTC (permalink / raw)
  To: Jun-Kai Teoh; +Cc: Wols Lists, mdraid

On Mon, Oct 30, 2017 at 11:08 AM, Jun-Kai Teoh <kai.teoh@gmail.com> wrote:
> Thanks for all the responses, Mark, Anthony and Wol.
>
> I have another hard drive on the way, just in case sda is truly dead.
> I had no idea that when a drive "dies" in a RAID5 array - nothing
> would notify me. I obviously have much to learn, really appreciate all
> the input from folks so far.
>
> I think I've provided the details of my setup (kernel, mdadm ver,
> distro, smartctl output, and mdadm -E output) - do let me know if I've
> left anything out. Left the machine alone this weekend and did not do
> anything with it.
>
> Kai Teoh
>

Kai Teoh,
   I don't think it's fair to say 'nothing will warn you' as there are ways to
get messages about this stuff. More fair to say that different distros will
better automate some of this stuff but it's always up to the sys admin to
ensure the system does what you want.

   Someone will likely give you input in the next day or two. Give folks
a chance to catch up.

Good luck,
Mark


* Re: Raid 5 array down/missing - went through wiki steps
  2017-10-30 18:18         ` Mark Knecht
@ 2017-11-06 17:10           ` Jun-Kai Teoh
  2017-11-09 16:51             ` `Raid " Jun-Kai Teoh
  0 siblings, 1 reply; 16+ messages in thread
From: Jun-Kai Teoh @ 2017-11-06 17:10 UTC (permalink / raw)
  To: mdraid

Hi all,

Soft bumping this thread again. My raid array is still down, and I haven't touched anything. I'll briefly run through the basics:

 - 7x4TB raid 5 array
 - One of the 4TB drives dropped out (according to the pros here; I was unaware of it because I didn't have a monitor set up)
 - I added another 4TB drive, making it an 8x4TB array
 - Partway through the reshape, the machine lost power
 - Brought the machine back up and tried to start the array; it doesn't work and says 6 drives, 1 rebuilding
 - /dev/sda is the drive that was dropped

 * The mdadm config also changed from md126 to md127, and the new entry has very little in it. The mdadm config shows md127, but mdadm -D shows md126.

Below is everything I provided last time, since I think it might be buried now.

In order of appearance
1. mdadm version
2. kernel
3. distro
4. mdadm -E 
5. mdadm -D
6. SMART info

--------
mdadm version (mdadm --version)
mdadm - v3.3 - 3rd September 2013
--------

--------
kernel (uname -mrsn)
Linux livingrm-server 4.4.0-97-generic x86_64
--------

--------
distro
Ubuntu 16.04 LTS
--------

/dev/sda:
         Magic : a92b4efc
       Version : 1.2
   Feature Map : 0x1
    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
          Name : livingrm-server:2  (local to host livingrm-server)
 Creation Time : Thu Jun 30 07:57:36 2016
    Raid Level : raid5
  Raid Devices : 7

Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
    Array Size : 23441323008 (22355.39 GiB 24003.91 GB)
 Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
   Data Offset : 262144 sectors
  Super Offset : 8 sectors
  Unused Space : before=262056 sectors, after=688 sectors
         State : clean
   Device UUID : 9c14bcb8:be8310f5:5b50c3a7:e6e57423

Internal Bitmap : 8 sectors from superblock
   Update Time : Sun Jan 15 08:09:15 2017
 Bad Block Log : 512 entries available at offset 72 sectors
      Checksum : 13425c44 - correct
        Events : 5667

        Layout : left-symmetric
    Chunk Size : 512K

  Device Role : Active device 2
  Array State : AAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb:
         Magic : a92b4efc
       Version : 1.2
   Feature Map : 0x45
    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
          Name : livingrm-server:2  (local to host livingrm-server)
 Creation Time : Thu Jun 30 07:57:36 2016
    Raid Level : raid5
  Raid Devices : 8

Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
 Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
   Data Offset : 262144 sectors
    New Offset : 254976 sectors
  Super Offset : 8 sectors
         State : active
   Device UUID : 15d85573:6e78f040:8c028ef3:d1301f9d

Internal Bitmap : 8 sectors from superblock
 Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
 Delta Devices : 1 (7->8)

   Update Time : Fri Oct 27 19:46:43 2017
 Bad Block Log : 512 entries available at offset 72 sectors
      Checksum : e568d9d9 - correct
        Events : 650554

        Layout : left-symmetric
    Chunk Size : 512K

  Device Role : Active device 3
  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc:
         Magic : a92b4efc
       Version : 1.2
   Feature Map : 0x45
    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
          Name : livingrm-server:2  (local to host livingrm-server)
 Creation Time : Thu Jun 30 07:57:36 2016
    Raid Level : raid5
  Raid Devices : 8

Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
 Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
   Data Offset : 262144 sectors
    New Offset : 254976 sectors
  Super Offset : 8 sectors
         State : clean
   Device UUID : 64d5961e:230e558c:3748b561:a7c6ab8c

Internal Bitmap : 8 sectors from superblock
 Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
 Delta Devices : 1 (7->8)

   Update Time : Fri Oct 27 19:46:43 2017
 Bad Block Log : 512 entries available at offset 72 sectors
      Checksum : 39c2485b - correct
        Events : 650554

        Layout : left-symmetric
    Chunk Size : 512K

  Device Role : Active device 0
  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sde:
         Magic : a92b4efc
       Version : 1.2
   Feature Map : 0x45
    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
          Name : livingrm-server:2  (local to host livingrm-server)
 Creation Time : Thu Jun 30 07:57:36 2016
    Raid Level : raid5
  Raid Devices : 8

Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
 Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
   Data Offset : 262144 sectors
    New Offset : 254976 sectors
  Super Offset : 8 sectors
         State : active
   Device UUID : 2df0b319:bdb18eee:27b318ec:da55d53d

Internal Bitmap : 8 sectors from superblock
 Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
 Delta Devices : 1 (7->8)

   Update Time : Fri Oct 27 19:46:43 2017
 Bad Block Log : 512 entries available at offset 72 sectors
      Checksum : d2a600e7 - correct
        Events : 650554

        Layout : left-symmetric
    Chunk Size : 512K

  Device Role : Active device 6
  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdf:
         Magic : a92b4efc
       Version : 1.2
   Feature Map : 0x45
    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
          Name : livingrm-server:2  (local to host livingrm-server)
 Creation Time : Thu Jun 30 07:57:36 2016
    Raid Level : raid5
  Raid Devices : 8

Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
 Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
   Data Offset : 262144 sectors
    New Offset : 254976 sectors
  Super Offset : 8 sectors
         State : active
   Device UUID : f1a790a9:98e01257:d9ab257d:95c8f1fc

Internal Bitmap : 8 sectors from superblock
 Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
 Delta Devices : 1 (7->8)

   Update Time : Fri Oct 27 19:46:43 2017
 Bad Block Log : 512 entries available at offset 72 sectors
      Checksum : 1b3052f2 - correct
        Events : 650554

        Layout : left-symmetric
    Chunk Size : 512K

  Device Role : Active device 4
  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg:
         Magic : a92b4efc
       Version : 1.2
   Feature Map : 0x45
    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
          Name : livingrm-server:2  (local to host livingrm-server)
 Creation Time : Thu Jun 30 07:57:36 2016
    Raid Level : raid5
  Raid Devices : 8

Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
 Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
   Data Offset : 262144 sectors
    New Offset : 254976 sectors
  Super Offset : 8 sectors
         State : active
   Device UUID : 6f44eba1:29d246a4:c5e8312e:bac00a7b

Internal Bitmap : 8 sectors from superblock
 Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
 Delta Devices : 1 (7->8)

   Update Time : Fri Oct 27 19:46:43 2017
 Bad Block Log : 512 entries available at offset 72 sectors
      Checksum : 8fe4160e - correct
        Events : 650554

        Layout : left-symmetric
    Chunk Size : 512K

  Device Role : Active device 1
  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdh:
         Magic : a92b4efc
       Version : 1.2
   Feature Map : 0x47
    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
          Name : livingrm-server:2  (local to host livingrm-server)
 Creation Time : Thu Jun 30 07:57:36 2016
    Raid Level : raid5
  Raid Devices : 8

Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
 Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
   Data Offset : 262144 sectors
    New Offset : 254976 sectors
  Super Offset : 8 sectors
Recovery Offset : 11264816 sectors
         State : active
   Device UUID : 109b7a2f:0529794c:2cf95cc1:d6c0bd6b

Internal Bitmap : 8 sectors from superblock
 Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
 Delta Devices : 1 (7->8)

   Update Time : Fri Oct 27 19:46:43 2017
 Bad Block Log : 512 entries available at offset 72 sectors
      Checksum : 4a2fb746 - correct
        Events : 650554

        Layout : left-symmetric
    Chunk Size : 512K

  Device Role : Active device 2
  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdi:
         Magic : a92b4efc
       Version : 1.2
   Feature Map : 0x45
    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
          Name : livingrm-server:2  (local to host livingrm-server)
 Creation Time : Thu Jun 30 07:57:36 2016
    Raid Level : raid5
  Raid Devices : 8

Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
 Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
   Data Offset : 262144 sectors
    New Offset : 254976 sectors
  Super Offset : 8 sectors
         State : active
   Device UUID : deb04cea:d6530966:6c70ca90:bebb143e

Internal Bitmap : 8 sectors from superblock
 Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
 Delta Devices : 1 (7->8)

   Update Time : Fri Oct 27 19:46:43 2017
 Bad Block Log : 512 entries available at offset 72 sectors
      Checksum : bfaa86d9 - correct
        Events : 650554

        Layout : left-symmetric
    Chunk Size : 512K

  Device Role : Active device 5
  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
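
For comparing the superblocks above without reading every block by eye, here is a minimal Python sketch. It assumes the `mdadm -E` text has been saved to a string; the function names `parse_examine` and `stale_members` are made up for this illustration and are not part of mdadm. It picks out members whose event count differs from the majority, which in the output above is /dev/sda (Events 5667, last updated Jan 15 2017, versus 650554 and Oct 27 2017 on the others).

```python
import re
from collections import Counter

# Abbreviated copy of three of the device blocks shown above.
SAMPLE = """
/dev/sda:
   Raid Devices : 7
    Update Time : Sun Jan 15 08:09:15 2017
         Events : 5667
    Device Role : Active device 2
/dev/sdb:
   Raid Devices : 8
    Update Time : Fri Oct 27 19:46:43 2017
         Events : 650554
    Device Role : Active device 3
/dev/sdh:
   Raid Devices : 8
    Update Time : Fri Oct 27 19:46:43 2017
         Events : 650554
    Device Role : Active device 2
"""


def parse_examine(text):
    """Split `mdadm -E` style output into {device: {field: value}}."""
    devices = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            current = devices.setdefault(header.group(1), {})
            continue
        if current is None or " : " not in line:
            continue
        # Split on the first " : " only, so timestamps keep their colons.
        key, value = line.split(" : ", 1)
        current[key.strip()] = value.strip()
    return devices


def stale_members(devices):
    """Return devices whose event count differs from the most common one."""
    counts = Counter(int(d["Events"]) for d in devices.values())
    majority, _ = counts.most_common(1)[0]
    return sorted(
        name for name, d in devices.items() if int(d["Events"]) != majority
    )


devices = parse_examine(SAMPLE)
for name, fields in devices.items():
    print(name, fields["Raid Devices"], fields["Events"], fields["Update Time"])
print("stale:", stale_members(devices))  # stale: ['/dev/sda']
```

This only reports the mismatch; it does not say which member is safe to assemble from.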


--------
/dev/md126:
       Version :
    Raid Level : raid0
 Total Devices : 0

         State : inactive

   Number   Major   Minor   RaidDevice
--------
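
As a sanity check on the two Array Size values in the mdadm -E output above, the RAID 5 capacity arithmetic can be sketched as follows. This assumes Used Dev Size is printed in 512-byte sectors and Array Size in KiB, which is what the GiB figures printed next to them imply; the function name is hypothetical.

```python
SECTORS_PER_KIB = 2  # two 512-byte sectors per KiB


def raid5_array_size_kib(raid_devices, used_dev_size_sectors):
    """RAID 5 keeps one device's worth of parity, so n-1 devices hold data."""
    return (raid_devices - 1) * used_dev_size_sectors // SECTORS_PER_KIB


USED_DEV_SIZE = 7813774336  # sectors, identical on every member above

print(raid5_array_size_kib(7, USED_DEV_SIZE))  # 23441323008, as on /dev/sda
print(raid5_array_size_kib(8, USED_DEV_SIZE))  # 27348210176, as on the other members
```

Both numbers match the superblocks, so the 7-device and 8-device views differ only in the device count, not in the per-device size.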

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E6CUCC3R
LU WWN Device Id: 5 0014ee 2b69e1ff1
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 28 12:34:44 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (50580) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 506) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   223   182   021    Pre-fail  Always       -       5850
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17743
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       212
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       121
193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       25990
194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   199   129   000    Old_age   Always       -       421
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
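
To scan reports like the one above for the counters that usually matter, here is a minimal Python sketch. The `WATCHED` set and the function name are choices made for this illustration, not anything smartctl defines, and the regular expression tolerates an attribute row being wrapped onto two lines. On the report above it would flag UDMA_CRC_Error_Count with a raw value of 421; CRC errors of that kind are commonly attributed to the SATA link (cable or connector) rather than the disk surface, though that is a rule of thumb and not a diagnosis.

```python
import re

# One attribute row of `smartctl -A` output; \s+ also spans a line break,
# so rows wrapped by a mail client still match.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>[A-Za-z][\w-]*)\s+0x[0-9a-fA-F]{4}\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>Pre-fail|Old_age)\s+(?P<updated>Always|Offline)\s+"
    r"(?P<failed>\S+)\s+(?P<raw>\d+)",
    re.MULTILINE,
)

# Attributes picked for this sketch; adjust to taste.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}

# Three rows copied from the first report above, in their wrapped form.
SAMPLE = """
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail
Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age
Always       -       0
199 UDMA_CRC_Error_Count    0x0032   199   129   000    Old_age
Always       -       421
"""


def nonzero_watched(report):
    """Return {attribute: raw_value} for watched attributes with raw > 0."""
    return {
        m["name"]: int(m["raw"])
        for m in ROW.finditer(report)
        if m["name"] in WATCHED and int(m["raw"]) > 0
    }


print(nonzero_watched(SAMPLE))  # {'UDMA_CRC_Error_Count': 421}
```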

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4K2JPZT
LU WWN Device Id: 5 0014ee 2b510b64f
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 28 12:35:00 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (52320) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 523) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   222   179   021    Pre-fail  Always       -       5891
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2624
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       26647
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       310
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       184
193 Load_Cycle_Count        0x0032   188   188   000    Old_age   Always       -       38893
194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   192   000    Old_age   Always       -       14
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4VTX9TP
LU WWN Device Id: 5 0014ee 261485dc4
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 28 12:35:03 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (54780) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 548) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   222   181   021    Pre-fail  Always       -       5866
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       210
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17748
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       210
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       121
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       33283
194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E2VKP3SV
LU WWN Device Id: 5 0014ee 2620754c8
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Oct 28 12:35:06 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (51120) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 512) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   223   183   021    Pre-fail  Always       -       5825
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       180
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   083   083   000    Old_age   Always       -       12934
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       180
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       112
193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       26411
194 Temperature_Celsius     0x0022   120   101   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4JNHEC6
LU WWN Device Id: 5 0014ee 2614842c3
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 28 12:35:10 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (53940) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 539) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   222   180   021    Pre-fail  Always       -       5875
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17748
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       212
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       127
193 Load_Cycle_Count        0x0032   190   190   000    Old_age   Always       -       32835
194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E2ZYE8AN
LU WWN Device Id: 5 0014ee 20bf26ce9
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 28 12:35:13 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (54960) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 550) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   219   178   021    Pre-fail  Always       -       6016
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17751
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       212
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       120
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       33470
194 Temperature_Celsius     0x0022   120   098   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68N32N0
Serial Number:    WD-WCC7K7PZ7R6Z
LU WWN Device Id: 5 0014ee 26410129e
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 28 12:35:16 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (46440) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 492) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x303d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       11
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       1
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       3
194 Temperature_Celsius     0x0022   121   117   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E4JNHKL9
LU WWN Device Id: 5 0014ee 2b69e186b
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Oct 28 12:35:19 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (55260) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: (   2) minutes.
Extended self-test routine
recommended polling time: ( 552) minutes.
Conveyance self-test routine
recommended polling time: (   5) minutes.
SCT capabilities:        (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   220   180   021    Pre-fail  Always       -       5958
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       673
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       18933
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       261
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       162
193 Load_Cycle_Count        0x0032   189   189   000    Old_age   Always       -       35904
194 Temperature_Celsius     0x0022   119   091   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
 After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
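
For anyone skimming these dumps, here is a rough sketch of how the attribute table could be filtered down to the few counters that usually matter. It is not part of smartctl: the `WATCHED` set and the `nonzero_watched` helper are hypothetical names chosen for illustration, and the sample rows are copied from the smartctl output for serial WD-WCC4E6CUCC3R in this thread, with each wrapped row joined onto one line.

```python
import re

# Rows copied from the smartctl dump for serial WD-WCC4E6CUCC3R in this thread,
# with each wrapped row joined back onto a single line.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   199   129   000    Old_age   Always       -       421
"""

# Assumption: these are the counters worth flagging; adjust to taste.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def nonzero_watched(text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    hits = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match and match.group(2) in WATCHED and int(match.group(3)) > 0:
            hits[match.group(2)] = int(match.group(3))
    return hits


print(nonzero_watched(SAMPLE))
```

On that drive the only hit is UDMA_CRC_Error_Count with a raw value of 421. As a rule of thumb, that counter tends to point at the SATA link (cable or connector) rather than the platters, but it is a hint and not a diagnosis.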

Kai Teoh


> On Oct 30, 2017, at 11:18 AM, Mark Knecht <markknecht@gmail.com> wrote:
> 
> On Mon, Oct 30, 2017 at 11:08 AM, Jun-Kai Teoh <kai.teoh@gmail.com> wrote:
>> Thanks for all the responses, Mark, Anthony and Wol.
>> 
>> I have another hard drive on the way, just in case sda is truly dead.
>> I had no idea that when a drive "dies" in a RAID5 array - nothing
>> would notify me. I obviously have much to learn, really appreciate all
>> the input from folks so far.
>> 
>> I think I've provided the details of my setup (kernel, mdadm ver,
>> distro, smartctl output, and mdadm -E output) - do let me know if I've
>> left anything out. Left the machine alone this weekend and did not do
>> anything with it.
>> 
>> Kai Teoh
>> 
> 
> Kai Teoh,
>   I don't think it's fair to say 'nothing will warn you' as there are ways to
> get messages about this stuff. More fair to say that different distros will
> better automate some of this stuff but it's always up to the sys admin to
> ensure the system does what you want.
> 
>   Someone will likely give you input in the next day or two. Give folks
> a chance to catch up.
> 
> Good luck,
> Mark


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Raid 5 array down/missing - went through wiki steps
  2017-11-06 17:10           ` Jun-Kai Teoh
@ 2017-11-09 16:51             ` Jun-Kai Teoh
  0 siblings, 0 replies; 16+ messages in thread
From: Jun-Kai Teoh @ 2017-11-09 16:51 UTC (permalink / raw)
  To: mdraid

Hi all,

Bump?

Kai Teoh

> On Nov 6, 2017, at 9:10 AM, Jun-Kai Teoh <kai.teoh@gmail.com> wrote:
> 
> Hi all,
> 
> Soft bumping this thread again. My raid array is still down, and I haven't touched anything. I'll briefly run through the basics:
> 
> - 7x4TB raid 5 array
> - One of the 4TB drives dropped out (according to the pros here; I was unaware of it because I didn't have monitoring set up)
> - I added another 4TB drive, making it an 8x4TB array
> - Partway through the reshape, the machine lost power.
> - Brought the machine back up and tried to start the array; it won't start and reports 6 drives and 1 rebuilding
> - /dev/sda is the drive that was dropped
> 
> *The mdadm config also changed from md126 to md127, and the new entry contains far too little. The mdadm config shows md127, but mdadm -D shows md126.
> 
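The 7-drive and 8-drive figures can be sanity-checked against the "Array Size" values in the mdadm -E output quoted further down in this message. A minimal sketch follows, assuming "Used Dev Size" is reported in 512-byte sectors and "Array Size" in KiB (inferred from the GiB/GB figures mdadm prints next to them); `raid5_array_kib` is a hypothetical helper name, not an mdadm function.

```python
# Figures copied from the mdadm -E output quoted later in this message.
USED_DEV_SECTORS = 7813774336                    # "Used Dev Size" (assumed 512-byte sectors)
REPORTED_KIB = {7: 23441323008, 8: 27348210176}  # "Array Size" (assumed KiB)


def raid5_array_kib(raid_devices, used_dev_sectors):
    """Usable RAID 5 capacity in KiB: one member's worth of space holds parity."""
    data_members = raid_devices - 1
    return data_members * used_dev_sectors * 512 // 1024


for n, reported in sorted(REPORTED_KIB.items()):
    computed = raid5_array_kib(n, USED_DEV_SECTORS)
    print(n, computed, computed == reported)
# prints:
# 7 23441323008 True
# 8 27348210176 True
```

Both reported sizes match (n - 1) x member size, so the 7-device superblock on /dev/sda and the 8-device superblocks on the other members each describe a self-consistent geometry; they simply describe the array before and after the grow.
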
> Below is everything I provided last time, since I think it might be buried now.
> 
> In order of appearance
> 1. mdadm version
> 2. kernel
> 3. distro
> 4. mdadm -E 
> 5. mdadm -D
> 6. SMART info
> 
> --------
> mdadm version (mdadm --version)
> mdadm - v3.3 - 3rd September 2013
> --------
> 
> --------
> kernel (uname -mrsn)
> Linux livingrm-server 4.4.0-97-generic x86_64
> --------
> 
> --------
> distro
> Ubuntu 16.04 LTS
> --------
> 
> /dev/sda:
>         Magic : a92b4efc
>       Version : 1.2
>   Feature Map : 0x1
>    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Name : livingrm-server:2  (local to host livingrm-server)
> Creation Time : Thu Jun 30 07:57:36 2016
>    Raid Level : raid5
>  Raid Devices : 7
> 
> Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>    Array Size : 23441323008 (22355.39 GiB 24003.91 GB)
> Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
>   Data Offset : 262144 sectors
>  Super Offset : 8 sectors
>  Unused Space : before=262056 sectors, after=688 sectors
>         State : clean
>   Device UUID : 9c14bcb8:be8310f5:5b50c3a7:e6e57423
> 
> Internal Bitmap : 8 sectors from superblock
>   Update Time : Sun Jan 15 08:09:15 2017
> Bad Block Log : 512 entries available at offset 72 sectors
>      Checksum : 13425c44 - correct
>        Events : 5667
> 
>        Layout : left-symmetric
>    Chunk Size : 512K
> 
>  Device Role : Active device 2
>  Array State : AAAAAAA ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sdb:
>         Magic : a92b4efc
>       Version : 1.2
>   Feature Map : 0x45
>    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Name : livingrm-server:2  (local to host livingrm-server)
> Creation Time : Thu Jun 30 07:57:36 2016
>    Raid Level : raid5
>  Raid Devices : 8
> 
> Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
> Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
>   Data Offset : 262144 sectors
>    New Offset : 254976 sectors
>  Super Offset : 8 sectors
>         State : active
>   Device UUID : 15d85573:6e78f040:8c028ef3:d1301f9d
> 
> Internal Bitmap : 8 sectors from superblock
> Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
> Delta Devices : 1 (7->8)
> 
>   Update Time : Fri Oct 27 19:46:43 2017
> Bad Block Log : 512 entries available at offset 72 sectors
>      Checksum : e568d9d9 - correct
>        Events : 650554
> 
>        Layout : left-symmetric
>    Chunk Size : 512K
> 
>  Device Role : Active device 3
>  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sdc:
>         Magic : a92b4efc
>       Version : 1.2
>   Feature Map : 0x45
>    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Name : livingrm-server:2  (local to host livingrm-server)
> Creation Time : Thu Jun 30 07:57:36 2016
>    Raid Level : raid5
>  Raid Devices : 8
> 
> Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
> Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
>   Data Offset : 262144 sectors
>    New Offset : 254976 sectors
>  Super Offset : 8 sectors
>         State : clean
>   Device UUID : 64d5961e:230e558c:3748b561:a7c6ab8c
> 
> Internal Bitmap : 8 sectors from superblock
> Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
> Delta Devices : 1 (7->8)
> 
>   Update Time : Fri Oct 27 19:46:43 2017
> Bad Block Log : 512 entries available at offset 72 sectors
>      Checksum : 39c2485b - correct
>        Events : 650554
> 
>        Layout : left-symmetric
>    Chunk Size : 512K
> 
>  Device Role : Active device 0
>  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sde:
>         Magic : a92b4efc
>       Version : 1.2
>   Feature Map : 0x45
>    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Name : livingrm-server:2  (local to host livingrm-server)
> Creation Time : Thu Jun 30 07:57:36 2016
>    Raid Level : raid5
>  Raid Devices : 8
> 
> Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
> Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
>   Data Offset : 262144 sectors
>    New Offset : 254976 sectors
>  Super Offset : 8 sectors
>         State : active
>   Device UUID : 2df0b319:bdb18eee:27b318ec:da55d53d
> 
> Internal Bitmap : 8 sectors from superblock
> Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
> Delta Devices : 1 (7->8)
> 
>   Update Time : Fri Oct 27 19:46:43 2017
> Bad Block Log : 512 entries available at offset 72 sectors
>      Checksum : d2a600e7 - correct
>        Events : 650554
> 
>        Layout : left-symmetric
>    Chunk Size : 512K
> 
>  Device Role : Active device 6
>  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sdf:
>         Magic : a92b4efc
>       Version : 1.2
>   Feature Map : 0x45
>    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Name : livingrm-server:2  (local to host livingrm-server)
> Creation Time : Thu Jun 30 07:57:36 2016
>    Raid Level : raid5
>  Raid Devices : 8
> 
> Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
> Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
>   Data Offset : 262144 sectors
>    New Offset : 254976 sectors
>  Super Offset : 8 sectors
>         State : active
>   Device UUID : f1a790a9:98e01257:d9ab257d:95c8f1fc
> 
> Internal Bitmap : 8 sectors from superblock
> Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
> Delta Devices : 1 (7->8)
> 
>   Update Time : Fri Oct 27 19:46:43 2017
> Bad Block Log : 512 entries available at offset 72 sectors
>      Checksum : 1b3052f2 - correct
>        Events : 650554
> 
>        Layout : left-symmetric
>    Chunk Size : 512K
> 
>  Device Role : Active device 4
>  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sdg:
>         Magic : a92b4efc
>       Version : 1.2
>   Feature Map : 0x45
>    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Name : livingrm-server:2  (local to host livingrm-server)
> Creation Time : Thu Jun 30 07:57:36 2016
>    Raid Level : raid5
>  Raid Devices : 8
> 
> Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
> Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
>   Data Offset : 262144 sectors
>    New Offset : 254976 sectors
>  Super Offset : 8 sectors
>         State : active
>   Device UUID : 6f44eba1:29d246a4:c5e8312e:bac00a7b
> 
> Internal Bitmap : 8 sectors from superblock
> Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
> Delta Devices : 1 (7->8)
> 
>   Update Time : Fri Oct 27 19:46:43 2017
> Bad Block Log : 512 entries available at offset 72 sectors
>      Checksum : 8fe4160e - correct
>        Events : 650554
> 
>        Layout : left-symmetric
>    Chunk Size : 512K
> 
>  Device Role : Active device 1
>  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sdh:
>         Magic : a92b4efc
>       Version : 1.2
>   Feature Map : 0x47
>    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Name : livingrm-server:2  (local to host livingrm-server)
> Creation Time : Thu Jun 30 07:57:36 2016
>    Raid Level : raid5
>  Raid Devices : 8
> 
> Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
> Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
>   Data Offset : 262144 sectors
>    New Offset : 254976 sectors
>  Super Offset : 8 sectors
> Recovery Offset : 11264816 sectors
>         State : active
>   Device UUID : 109b7a2f:0529794c:2cf95cc1:d6c0bd6b
> 
> Internal Bitmap : 8 sectors from superblock
> Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
> Delta Devices : 1 (7->8)
> 
>   Update Time : Fri Oct 27 19:46:43 2017
> Bad Block Log : 512 entries available at offset 72 sectors
>      Checksum : 4a2fb746 - correct
>        Events : 650554
> 
>        Layout : left-symmetric
>    Chunk Size : 512K
> 
>  Device Role : Active device 2
>  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sdi:
>         Magic : a92b4efc
>       Version : 1.2
>   Feature Map : 0x45
>    Array UUID : f7333d4f:8300969d:55148d64:93c8afc8
>          Name : livingrm-server:2  (local to host livingrm-server)
> Creation Time : Thu Jun 30 07:57:36 2016
>    Raid Level : raid5
>  Raid Devices : 8
> 
> Avail Dev Size : 7813775024 (3725.90 GiB 4000.65 GB)
>    Array Size : 27348210176 (26081.29 GiB 28004.57 GB)
> Used Dev Size : 7813774336 (3725.90 GiB 4000.65 GB)
>   Data Offset : 262144 sectors
>    New Offset : 254976 sectors
>  Super Offset : 8 sectors
>         State : active
>   Device UUID : deb04cea:d6530966:6c70ca90:bebb143e
> 
> Internal Bitmap : 8 sectors from superblock
> Reshape pos'n : 39348736 (37.53 GiB 40.29 GB)
> Delta Devices : 1 (7->8)
> 
>   Update Time : Fri Oct 27 19:46:43 2017
> Bad Block Log : 512 entries available at offset 72 sectors
>      Checksum : bfaa86d9 - correct
>        Events : 650554
> 
>        Layout : left-symmetric
>    Chunk Size : 512K
> 
>  Device Role : Active device 5
>  Array State : AAAAAAA. ('A' == active, '.' == missing, 'R' == replacing)
> 
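The mismatches in the mdadm -E output above (event counts, device counts, duplicate roles) are easier to see side by side. The following is a rough sketch only: `parse_examine`, `stale_members` and `duplicate_roles` are hypothetical helpers, not mdadm features, and the sample is a trimmed excerpt of three of the eight members from the output above.

```python
import re
from collections import defaultdict

# Trimmed excerpt of the mdadm -E output above (three of the eight members).
SAMPLE = """\
/dev/sda:
 Raid Devices : 7
       Events : 5667
  Device Role : Active device 2
/dev/sdb:
 Raid Devices : 8
       Events : 650554
  Device Role : Active device 3
/dev/sdh:
 Raid Devices : 8
       Events : 650554
  Device Role : Active device 2
"""


def parse_examine(text):
    """Split `mdadm -E`-style text into {device: {field: value}}."""
    members = {}
    current = None
    for line in text.splitlines():
        line = line.lstrip("> ").rstrip()  # tolerate email quote prefixes
        header = re.fullmatch(r"(/dev/\S+):", line)
        if header:
            current = members.setdefault(header.group(1), {})
        elif current is not None and " : " in line:
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return members


def stale_members(members):
    """Devices whose event count is behind the newest one seen."""
    newest = max(int(m["Events"]) for m in members.values())
    return sorted(dev for dev, m in members.items() if int(m["Events"]) < newest)


def duplicate_roles(members):
    """Roles claimed by more than one device."""
    by_role = defaultdict(list)
    for dev, m in members.items():
        by_role[m["Device Role"]].append(dev)
    return {role: sorted(devs) for role, devs in by_role.items() if len(devs) > 1}


members = parse_examine(SAMPLE)
print(stale_members(members))
print(duplicate_roles(members))
```

Run against the sample, this singles out /dev/sda as the only member behind on events (5667 against 650554) and shows /dev/sda and /dev/sdh both claiming "Active device 2", which matches the symptoms described at the top of the thread.
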
> 
> --------
> /dev/md126:
>       Version :
>    Raid Level : raid0
> Total Devices : 0
> 
>         State : inactive
> 
>   Number   Major   Minor   RaidDevice
> --------
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD40EFRX-68WT0N0
> Serial Number:    WD-WCC4E6CUCC3R
> LU WWN Device Id: 5 0014ee 2b69e1ff1
> Firmware Version: 82.00A82
> User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    5400 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-2 (minor revision not indicated)
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
> Local Time is:    Sat Oct 28 12:34:44 2017 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (50580) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 506) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x703d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
>   3 Spin_Up_Time            0x0027   223   182   021    Pre-fail  Always       -       5850
>   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       212
>   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
>   9 Power_On_Hours          0x0032   076   076   000    Old_age   Always       -       17743
>  10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
>  11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       212
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       121
> 193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       25990
> 194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   199   129   000    Old_age   Always       -       421
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
> SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>   1        0        0  Not_testing
>   2        0        0  Not_testing
>   3        0        0  Not_testing
>   4        0        0  Not_testing
>   5        0        0  Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD40EFRX-68WT0N0
> Serial Number:    WD-WCC4E4K2JPZT
> LU WWN Device Id: 5 0014ee 2b510b64f
> Firmware Version: 80.00A80
> User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    5400 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-2 (minor revision not indicated)
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
> Local Time is:    Sat Oct 28 12:35:00 2017 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (52320) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 523) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x703d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
>   3 Spin_Up_Time            0x0027   222   179   021    Pre-fail  Always       -       5891
>   4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2624
>   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
>   9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       26647
>  10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
>  11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       310
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       184
> 193 Load_Cycle_Count        0x0032   188   188   000    Old_age   Always       -       38893
> 194 Temperature_Celsius     0x0022   120   092   000    Old_age   Always       -       32
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   192   000    Old_age   Always       -       14
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
> SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>   1        0        0  Not_testing
>   2        0        0  Not_testing
>   3        0        0  Not_testing
>   4        0        0  Not_testing
>   5        0        0  Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD40EFRX-68WT0N0
> Serial Number:    WD-WCC4E4VTX9TP
> LU WWN Device Id: 5 0014ee 261485dc4
> Firmware Version: 82.00A82
> User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    5400 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-2 (minor revision not indicated)
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
> Local Time is:    Sat Oct 28 12:35:03 2017 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (54780) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 548) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x703d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE
> UPDATED  WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail
> Always       -       0
> 3 Spin_Up_Time            0x0027   222   181   021    Pre-fail
> Always       -       5866
> 4 Start_Stop_Count        0x0032   100   100   000    Old_age
> Always       -       210
> 5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail
> Always       -       0
> 7 Seek_Error_Rate         0x002e   200   200   000    Old_age
> Always       -       0
> 9 Power_On_Hours          0x0032   076   076   000    Old_age
> Always       -       17748
> 10 Spin_Retry_Count        0x0032   100   100   000    Old_age
> Always       -       0
> 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age
> Always       -       0
> 12 Power_Cycle_Count       0x0032   100   100   000    Old_age
> Always       -       210
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age
> Always       -       121
> 193 Load_Cycle_Count        0x0032   189   189   000    Old_age
> Always       -       33283
> 194 Temperature_Celsius     0x0022   120   092   000    Old_age
> Always       -       32
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age
> Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age
> Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age
> Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age
> Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age
> Offline      -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
> SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>   1        0        0  Not_testing
>   2        0        0  Not_testing
>   3        0        0  Not_testing
>   4        0        0  Not_testing
>   5        0        0  Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD40EFRX-68WT0N0
> Serial Number:    WD-WCC4E2VKP3SV
> LU WWN Device Id: 5 0014ee 2620754c8
> Firmware Version: 82.00A82
> User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    5400 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-2 (minor revision not indicated)
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
> Local Time is:    Sat Oct 28 12:35:06 2017 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (51120) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 512) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x703d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE
> UPDATED  WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail
> Always       -       0
> 3 Spin_Up_Time            0x0027   223   183   021    Pre-fail
> Always       -       5825
> 4 Start_Stop_Count        0x0032   100   100   000    Old_age
> Always       -       180
> 5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail
> Always       -       0
> 7 Seek_Error_Rate         0x002e   200   200   000    Old_age
> Always       -       0
> 9 Power_On_Hours          0x0032   083   083   000    Old_age
> Always       -       12934
> 10 Spin_Retry_Count        0x0032   100   100   000    Old_age
> Always       -       0
> 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age
> Always       -       0
> 12 Power_Cycle_Count       0x0032   100   100   000    Old_age
> Always       -       180
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age
> Always       -       112
> 193 Load_Cycle_Count        0x0032   192   192   000    Old_age
> Always       -       26411
> 194 Temperature_Celsius     0x0022   120   101   000    Old_age
> Always       -       32
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age
> Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age
> Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age
> Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age
> Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age
> Offline      -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
> SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>   1        0        0  Not_testing
>   2        0        0  Not_testing
>   3        0        0  Not_testing
>   4        0        0  Not_testing
>   5        0        0  Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD40EFRX-68WT0N0
> Serial Number:    WD-WCC4E4JNHEC6
> LU WWN Device Id: 5 0014ee 2614842c3
> Firmware Version: 82.00A82
> User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    5400 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-2 (minor revision not indicated)
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
> Local Time is:    Sat Oct 28 12:35:10 2017 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (53940) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 539) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x703d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE
> UPDATED  WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail
> Always       -       0
> 3 Spin_Up_Time            0x0027   222   180   021    Pre-fail
> Always       -       5875
> 4 Start_Stop_Count        0x0032   100   100   000    Old_age
> Always       -       212
> 5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail
> Always       -       0
> 7 Seek_Error_Rate         0x002e   200   200   000    Old_age
> Always       -       0
> 9 Power_On_Hours          0x0032   076   076   000    Old_age
> Always       -       17748
> 10 Spin_Retry_Count        0x0032   100   100   000    Old_age
> Always       -       0
> 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age
> Always       -       0
> 12 Power_Cycle_Count       0x0032   100   100   000    Old_age
> Always       -       212
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age
> Always       -       127
> 193 Load_Cycle_Count        0x0032   190   190   000    Old_age
> Always       -       32835
> 194 Temperature_Celsius     0x0022   120   092   000    Old_age
> Always       -       32
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age
> Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age
> Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age
> Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age
> Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age
> Offline      -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
> SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>   1        0        0  Not_testing
>   2        0        0  Not_testing
>   3        0        0  Not_testing
>   4        0        0  Not_testing
>   5        0        0  Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD40EFRX-68WT0N0
> Serial Number:    WD-WCC4E2ZYE8AN
> LU WWN Device Id: 5 0014ee 20bf26ce9
> Firmware Version: 82.00A82
> User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    5400 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-2 (minor revision not indicated)
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
> Local Time is:    Sat Oct 28 12:35:13 2017 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (54960) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 550) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x703d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE
> UPDATED  WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail
> Always       -       0
> 3 Spin_Up_Time            0x0027   219   178   021    Pre-fail
> Always       -       6016
> 4 Start_Stop_Count        0x0032   100   100   000    Old_age
> Always       -       212
> 5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail
> Always       -       0
> 7 Seek_Error_Rate         0x002e   200   200   000    Old_age
> Always       -       0
> 9 Power_On_Hours          0x0032   076   076   000    Old_age
> Always       -       17751
> 10 Spin_Retry_Count        0x0032   100   100   000    Old_age
> Always       -       0
> 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age
> Always       -       0
> 12 Power_Cycle_Count       0x0032   100   100   000    Old_age
> Always       -       212
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age
> Always       -       120
> 193 Load_Cycle_Count        0x0032   189   189   000    Old_age
> Always       -       33470
> 194 Temperature_Celsius     0x0022   120   098   000    Old_age
> Always       -       32
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age
> Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age
> Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age
> Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age
> Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age
> Offline      -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
> SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>   1        0        0  Not_testing
>   2        0        0  Not_testing
>   3        0        0  Not_testing
>   4        0        0  Not_testing
>   5        0        0  Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD40EFRX-68N32N0
> Serial Number:    WD-WCC7K7PZ7R6Z
> LU WWN Device Id: 5 0014ee 26410129e
> Firmware Version: 82.00A82
> User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    5400 rpm
> Form Factor:      3.5 inches
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-3 T13/2161-D revision 5
> SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)
> Local Time is:    Sat Oct 28 12:35:16 2017 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (46440) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 492) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x303d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE
> UPDATED  WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail
> Always       -       0
> 3 Spin_Up_Time            0x0027   100   253   021    Pre-fail
> Always       -       0
> 4 Start_Stop_Count        0x0032   100   100   000    Old_age
> Always       -       3
> 5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail
> Always       -       0
> 7 Seek_Error_Rate         0x002e   100   253   000    Old_age
> Always       -       0
> 9 Power_On_Hours          0x0032   100   100   000    Old_age
> Always       -       11
> 10 Spin_Retry_Count        0x0032   100   253   000    Old_age
> Always       -       0
> 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age
> Always       -       0
> 12 Power_Cycle_Count       0x0032   100   100   000    Old_age
> Always       -       3
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age
> Always       -       1
> 193 Load_Cycle_Count        0x0032   200   200   000    Old_age
> Always       -       3
> 194 Temperature_Celsius     0x0022   121   117   000    Old_age
> Always       -       29
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age
> Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age
> Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age
> Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age
> Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age
> Offline      -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
> SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>   1        0        0  Not_testing
>   2        0        0  Not_testing
>   3        0        0  Not_testing
>   4        0        0  Not_testing
>   5        0        0  Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-97-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Western Digital Red
> Device Model:     WDC WD40EFRX-68WT0N0
> Serial Number:    WD-WCC4E4JNHKL9
> LU WWN Device Id: 5 0014ee 2b69e186b
> Firmware Version: 82.00A82
> User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    5400 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ACS-2 (minor revision not indicated)
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
> Local Time is:    Sat Oct 28 12:35:19 2017 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
> without error or no self-test has ever
> been run.
> Total time to complete Offline
> data collection: (55260) seconds.
> Offline data collection
> capabilities: (0x7b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: (   2) minutes.
> Extended self-test routine
> recommended polling time: ( 552) minutes.
> Conveyance self-test routine
> recommended polling time: (   5) minutes.
> SCT capabilities:        (0x703d) SCT Status supported.
> SCT Error Recovery Control supported.
> SCT Feature Control supported.
> SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE
> UPDATED  WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail
> Always       -       1
> 3 Spin_Up_Time            0x0027   220   180   021    Pre-fail
> Always       -       5958
> 4 Start_Stop_Count        0x0032   100   100   000    Old_age
> Always       -       673
> 5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail
> Always       -       0
> 7 Seek_Error_Rate         0x002e   200   200   000    Old_age
> Always       -       0
> 9 Power_On_Hours          0x0032   075   075   000    Old_age
> Always       -       18933
> 10 Spin_Retry_Count        0x0032   100   100   000    Old_age
> Always       -       0
> 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age
> Always       -       0
> 12 Power_Cycle_Count       0x0032   100   100   000    Old_age
> Always       -       261
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age
> Always       -       162
> 193 Load_Cycle_Count        0x0032   189   189   000    Old_age
> Always       -       35904
> 194 Temperature_Celsius     0x0022   119   091   000    Old_age
> Always       -       33
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age
> Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age
> Always       -       0
> 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age
> Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age
> Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age
> Offline      -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> SMART Selective self-test log data structure revision number 1
> SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>   1        0        0  Not_testing
>   2        0        0  Not_testing
>   3        0        0  Not_testing
>   4        0        0  Not_testing
>   5        0        0  Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 
> Kai Teoh
> 
> 
>> On Oct 30, 2017, at 11:18 AM, Mark Knecht <markknecht@gmail.com> wrote:
>> 
>> On Mon, Oct 30, 2017 at 11:08 AM, Jun-Kai Teoh <kai.teoh@gmail.com> wrote:
>>> Thanks for all the responses, Mark, Anthony and Wol.
>>> 
>>> I have another hard drive on the way, just in case sda is truly dead.
>>> I had no idea that when a drive "dies" in a RAID5 array - nothing
>>> would notify me. I obviously have much to learn, really appreciate all
>>> the input from folks so far.
>>> 
>>> I think I've provided the details of my setup (kernel, mdadm ver,
>>> distro, smartctl output, and mdadm -E output) - do let me know if I've
>>> left anything out. Left the machine alone this weekend and did not do
>>> anything with it.
>>> 
>>> Kai Teoh
>>> 
>> 
>> Kai Teoh,
>>  I don't think it's fair to say 'nothing will warn you', as there are ways to
>> get messages about this stuff. It's fairer to say that some distros automate
>> more of this than others, but it's always up to the sysadmin to ensure the
>> system does what they want.
>> 
>>  Someone will likely give you input in the next day or two. Give folks
>> a chance to catch up.
>> 
>> Good luck,
>> Mark
> 
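
As one illustration of the "ways to get messages" mentioned above: mdadm has a
monitor mode (mdadm --monitor --scan) that can mail the MAILADDR set in
mdadm.conf, and the kernel reports array state in /proc/mdstat. The following
is only a minimal sketch of a degraded-array check, not anything posted in
this thread. The function name degraded_arrays and the SAMPLE text are
hypothetical and made up for illustration; on a real system you would pass in
the contents of /proc/mdstat instead.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: (expected, active)} for arrays missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines look like "md126 : active raid5 sdb[0] ...".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries an "[expected/active]" counter,
        # e.g. "[8/7] [UUUUUUU_]" when one member is missing.
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and counts:
            expected, active = int(counts.group(1)), int(counts.group(2))
            if active < expected:
                result[current] = (expected, active)
            current = None
    return result


# Hypothetical sample text (NOT taken from the poster's machine).
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md126 : active raid5 sdb[0] sdc[1] sde[3] sdf[4] sdg[5] sdh[2] sdi[7]
      23441566720 blocks super 1.2 level 5, 512k chunk, algorithm 2 [8/7] [UUUUUUU_]

md0 : active raid1 sdx1[0] sdy1[1]
      1048576 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    # On a live system: degraded_arrays(open("/proc/mdstat").read())
    print(degraded_arrays(SAMPLE))
```

Run from cron, a check like this could send a mail whenever the returned
dictionary is non-empty. mdadm's own monitor mode is the more usual route.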



end of thread

Thread overview: 16+ messages
2017-10-28 18:36 Raid 5 array down/missing - went through wiki steps Jun-Kai Teoh
2017-10-28 18:48 ` Mark Knecht
2017-10-28 19:03   ` Jun-Kai Teoh
2017-10-28 19:10     ` Jun-Kai Teoh
2017-10-28 19:31       ` Mark Knecht
2017-10-28 19:42         ` Jun-Kai Teoh
2017-10-28 21:15           ` Mark Knecht
2017-10-28 22:04 ` Anthony Youngman
2017-10-28 22:15   ` Jun-Kai Teoh
2017-10-28 23:41     ` Mark Knecht
2017-10-29  0:18     ` Wols Lists
2017-10-30 18:08       ` Jun-Kai Teoh
2017-10-30 18:15         ` Reindl Harald
2017-10-30 18:18         ` Mark Knecht
2017-11-06 17:10           ` Jun-Kai Teoh
2017-11-09 16:51             ` `Raid " Jun-Kai Teoh
