Linux RAID subsystem development
From: Bogo Mipps <bogo.mipps@gmail.com>
To: linux-raid@vger.kernel.org
Subject: Advice please re failed Raid6
Date: Sun, 16 Jul 2017 11:40:22 +1200
Message-ID: <9dca5b7a-b60e-0e93-41fd-49d092d8b27b@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 8466 bytes --]

Hi List

I posted this to the OpenMediaVault (my NAS OS) list a few days ago 
but got no response. A definitive answer would be appreciated.

I have been running a 4-disk RAID 6 setup for over two years without any 
issues, until suddenly on June 27 the disks on my OMV NAS became 100% full, 
including the NFS-mounted volumes in the RAID set. There had been major 
disk activity overnight, but foolishly I didn't investigate.

Rsnapshot normally backs up two desktop machines onto the RAID array. 
The next morning I found that one of the backup directories was no longer 
on the array but had instead been written to the root filesystem of the 
OMV machine. Being a full backup (several GB), this accounted for the 
100% reading on the OMV/NAS machine.

The logs indicated that md had found a "dirty degraded array", 
presumably due to a faulty sdb, so it had kicked that disk out and then 
couldn't run the RAID set (see the logs below).

I bought a new disk and installed it on July 4, and the RAID rebuilt 
overnight (see the July 5 RebuildFinished entry below).

Since then I've been unable to mount or access any data. I have followed 
the instructions in the Linux Raid Wiki's "Recovering a failed software 
RAID" and "RAID Recovery" pages, but still without success. The results 
of their suggestions are in the attached log file 
"linux_raid_wiki_logs.txt".

Any help appreciated - even if it's just to tell me my RAID sets are hosed!

P.S. This line looks ominous: <md0: detected capacity change from 
4000528203776 to 0>

===============
Jun 27 16:52:21 keruru kernel: [ 2.912440] md: md0 stopped.
Jun 27 16:52:21 keruru kernel: [ 2.922315] md: bind<sdb>
Jun 27 16:52:21 keruru kernel: [ 2.922508] md: bind<sdc>
Jun 27 16:52:21 keruru kernel: [ 2.922643] md: bind<sde>
Jun 27 16:52:21 keruru kernel: [ 2.922777] md: bind<sdd>
Jun 27 16:52:21 keruru kernel: [ 2.922808] md: kicking non-fresh sdb 
from array!
Jun 27 16:52:21 keruru kernel: [ 2.922820] md: unbind<sdb>
Jun 27 16:52:21 keruru kernel: [ 2.927107] md: export_rdev(sdb)
Jun 27 16:52:21 keruru kernel: [ 2.994973] raid6: sse2x1 588 MB/s
Jun 27 16:52:21 keruru kernel: [ 3.062926] raid6: sse2x2 1395 MB/s
Jun 27 16:52:21 keruru kernel: [ 3.130841] raid6: sse2x4 2397 MB/s
Jun 27 16:52:21 keruru kernel: [ 3.130844] raid6: using algorithm sse2x4 
(2397 MB/s)
Jun 27 16:52:21 keruru kernel: [ 3.130846] raid6: using ssse3x2 recovery 
algorithm
Jun 27 16:52:21 keruru kernel: [ 3.130866] Switched to clocksource tsc
Jun 27 16:52:21 keruru kernel: [ 3.131227] xor: automatically using best 
checksumming function:
Jun 27 16:52:21 keruru kernel: [ 3.170797] avx : 6164.000 MB/sec
Jun 27 16:52:21 keruru kernel: [ 3.171121] async_tx: api initialized (async)
Jun 27 16:52:21 keruru kernel: [ 3.172809] md: raid6 personality 
registered for level 6
Jun 27 16:52:21 keruru kernel: [ 3.172812] md: raid5 personality 
registered for level 5
Jun 27 16:52:21 keruru kernel: [ 3.172815] md: raid4 personality 
registered for level 4
Jun 27 16:52:21 keruru kernel: [ 3.173218] md/raid:md0: not clean -- 
starting background reconstruction
Jun 27 16:52:21 keruru kernel: [ 3.173236] md/raid:md0: device sdd 
operational as raid disk 1
Jun 27 16:52:21 keruru kernel: [ 3.173239] md/raid:md0: device sde 
operational as raid disk 3
Jun 27 16:52:21 keruru kernel: [ 3.173242] md/raid:md0: device sdc 
operational as raid disk 2
Jun 27 16:52:21 keruru kernel: [ 3.173706] md/raid:md0: allocated 0kB
Jun 27 16:52:21 keruru kernel: [ 3.173745] md/raid:md0: cannot start 
dirty degraded array.
Jun 27 16:52:21 keruru kernel: [ 3.173811] RAID conf printout:
Jun 27 16:52:21 keruru kernel: [ 3.173814] --- level:6 rd:4 wd:3
Jun 27 16:52:21 keruru kernel: [ 3.173816] disk 1, o:1, dev:sdd
Jun 27 16:52:21 keruru kernel: [ 3.173818] disk 2, o:1, dev:sdc
Jun 27 16:52:21 keruru kernel: [ 3.173820] disk 3, o:1, dev:sde
Jun 27 16:52:21 keruru kernel: [ 3.174025] md/raid:md0: failed to run 
raid set.
Jun 27 16:52:21 keruru kernel: [ 3.174071] md: pers->run() failed ...
===============
New disk added - sdb
===============
Jul 5 21:06:18 keruru mdadm[2497]: RebuildFinished event detected on md 
device /dev/md0, component device mismatches found: 1847058224 (on raid 
level 6)
Jul 6 09:45:52 keruru kernel: [ 1195.390879] raid6: sse2x1 249 MB/s
Jul 6 09:45:52 keruru kernel: [ 1195.458735] raid6: sse2x2 476 MB/s
Jul 6 09:45:52 keruru kernel: [ 1195.526632] raid6: sse2x4 839 MB/s
Jul 6 09:45:52 keruru kernel: [ 1195.526638] raid6: using algorithm 
sse2x4 (839 MB/s)
Jul 6 09:45:52 keruru kernel: [ 1195.526644] raid6: using ssse3x2 
recovery algorithm
Jul 6 09:45:52 keruru kernel: [ 1195.578970] md: raid6 personality 
registered for level 6
Jul 6 09:45:52 keruru kernel: [ 1195.578980] md: raid5 personality 
registered for level 5
Jul 6 09:45:52 keruru kernel: [ 1195.578985] md: raid4 personality 
registered for level 4
Jul 6 09:45:52 keruru kernel: [ 1195.580003] md/raid:md0: device sdb 
operational as raid disk 0
Jul 6 09:45:52 keruru kernel: [ 1195.580012] md/raid:md0: device sde 
operational as raid disk 3
Jul 6 09:45:52 keruru kernel: [ 1195.580018] md/raid:md0: device sdd 
operational as raid disk 2
Jul 6 09:45:52 keruru kernel: [ 1195.580025] md/raid:md0: device sdc 
operational as raid disk 1
Jul 6 09:45:52 keruru kernel: [ 1195.581091] md/raid:md0: allocated 0kB
Jul 6 09:45:52 keruru kernel: [ 1195.581180] md/raid:md0: raid level 6 
active with 4 out of 4 devices, algorithm 2
Jul 6 09:52:30 keruru kernel: [ 4.186106] raid6: sse2x1 602 MB/s
Jul 6 09:52:30 keruru kernel: [ 4.254006] raid6: sse2x2 906 MB/s
Jul 6 09:52:30 keruru kernel: [ 4.321957] raid6: sse2x4 1130 MB/s
Jul 6 09:52:30 keruru kernel: [ 4.321964] raid6: using algorithm sse2x4 
(1130 MB/s)
Jul 6 09:52:30 keruru kernel: [ 4.321967] raid6: using ssse3x2 recovery 
algorithm
Jul 6 09:52:30 keruru kernel: [ 4.368478] md: raid6 personality 
registered for level 6
Jul 6 09:52:30 keruru kernel: [ 4.368486] md: raid5 personality 
registered for level 5
Jul 6 09:52:30 keruru kernel: [ 4.368490] md: raid4 personality 
registered for level 4
Jul 6 09:52:30 keruru kernel: [ 4.369179] md/raid:md0: device sdb 
operational as raid disk 0
Jul 6 09:52:30 keruru kernel: [ 4.369185] md/raid:md0: device sde 
operational as raid disk 3
Jul 6 09:52:30 keruru kernel: [ 4.369189] md/raid:md0: device sdd 
operational as raid disk 2
Jul 6 09:52:30 keruru kernel: [ 4.369194] md/raid:md0: device sdc 
operational as raid disk 1
Jul 6 09:52:30 keruru kernel: [ 4.369974] md/raid:md0: allocated 0kB
Jul 6 09:52:30 keruru kernel: [ 4.372062] md/raid:md0: raid level 6 
active with 4 out of 4 devices, algorithm 2
Jul 6 12:56:15 keruru kernel: [ 4.442184] raid6: sse2x1 739 MB/s
Jul 6 12:56:15 keruru kernel: [ 4.510060] raid6: sse2x2 1480 MB/s
Jul 6 12:56:15 keruru kernel: [ 4.577985] raid6: sse2x4 1605 MB/s
Jul 6 12:56:15 keruru kernel: [ 4.577993] raid6: using algorithm sse2x4 
(1605 MB/s)
Jul 6 12:56:15 keruru kernel: [ 4.577997] raid6: using ssse3x2 recovery 
algorithm
Jul 6 12:56:15 keruru kernel: [ 4.622570] md: raid6 personality 
registered for level 6
Jul 6 12:56:15 keruru kernel: [ 4.622577] md: raid5 personality 
registered for level 5
Jul 6 12:56:15 keruru kernel: [ 4.622580] md: raid4 personality 
registered for level 4
Jul 6 12:56:15 keruru kernel: [ 4.623261] md/raid:md0: device sdb 
operational as raid disk 0
Jul 6 12:56:15 keruru kernel: [ 4.623266] md/raid:md0: device sde 
operational as raid disk 3
Jul 6 12:56:15 keruru kernel: [ 4.623269] md/raid:md0: device sdd 
operational as raid disk 2
Jul 6 12:56:15 keruru kernel: [ 4.623273] md/raid:md0: device sdc 
operational as raid disk 1
Jul 6 12:56:15 keruru kernel: [ 4.624064] md/raid:md0: allocated 0kB
Jul 6 12:56:15 keruru kernel: [ 4.624131] md/raid:md0: raid level 6 
active with 4 out of 4 devices, algorithm 2
Jul 6 16:54:43 keruru kernel: [14401.858429] md/raid:md0: device sdb 
operational as raid disk 0
Jul 6 16:54:43 keruru kernel: [14401.858442] md/raid:md0: device sde 
operational as raid disk 3
Jul 6 16:54:43 keruru kernel: [14401.858449] md/raid:md0: device sdd 
operational as raid disk 2
Jul 6 16:54:43 keruru kernel: [14401.858455] md/raid:md0: device sdc 
operational as raid disk 1
Jul 6 16:54:43 keruru kernel: [14401.859915] md/raid:md0: allocated 0kB
Jul 6 16:54:43 keruru kernel: [14401.860000] md/raid:md0: raid level 6 
active with 4 out of 4 devices, algorithm 2
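
To make the slot assignments in the kernel logs above easier to compare, here is a minimal sketch that pulls the "device X operational as raid disk N" lines into a slot:device map for each boot. The `slot_map` helper and the variable names are hypothetical, and the sample text is the June 27 and July 6 log lines with the mail line-wrapping removed. Kernel names such as sdc and sdd can be assigned differently after a reboot or a hardware swap, so a changed map is only a hint that the member order deserves a closer look, not proof that it changed.

```shell
#!/bin/sh
# Sample lines copied from the Jun 27 and Jul 6 kernel logs above (unwrapped).
jun27_log='md/raid:md0: device sdd operational as raid disk 1
md/raid:md0: device sde operational as raid disk 3
md/raid:md0: device sdc operational as raid disk 2'

jul06_log='md/raid:md0: device sdb operational as raid disk 0
md/raid:md0: device sde operational as raid disk 3
md/raid:md0: device sdd operational as raid disk 2
md/raid:md0: device sdc operational as raid disk 1'

# Hypothetical helper: print "slot:device" pairs, sorted by slot, on one line.
slot_map() {
    printf '%s\n' "$1" |
        awk '/operational as raid disk/ { print $NF ":" $3 }' |
        sort |
        paste -s -d ' ' -
}

jun27_map=$(slot_map "$jun27_log")
jul06_map=$(slot_map "$jul06_log")

echo "Jun 27: $jun27_map"
echo "Jul 06: $jul06_map"
```

In these two excerpts sdc and sdd appear in swapped slots (1 and 2), which may simply reflect renamed devices, but it is worth checking against the Device Role lines in the attached `mdadm --examine` output.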

[-- Attachment #2: linux_raid_wiki_logs.txt --]
[-- Type: text/plain, Size: 7819 bytes --]

root@keruru:/var/log# mdadm --examine /dev/sd[bedc] >> raid.status
root@keruru:/var/log# cat raid.status 
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : b1e6af5d:e5848ebe:63727445:2ab99719
           Name : keruru:0  (local to host keruru)
  Creation Time : Fri Jun 30 15:42:27 2017
     Raid Level : raid6
   Raid Devices : 4

 Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
     Array Size : 3906765824 (3725.78 GiB 4000.53 GB)
  Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 79e4933f:dfe5923f:5ba03ae7:3efe38eb

    Update Time : Wed Jul  5 21:06:18 2017
       Checksum : 9ff2b025 - correct
         Events : 119

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : b1e6af5d:e5848ebe:63727445:2ab99719
           Name : keruru:0  (local to host keruru)
  Creation Time : Fri Jun 30 15:42:27 2017
     Raid Level : raid6
   Raid Devices : 4

 Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
     Array Size : 3906765824 (3725.78 GiB 4000.53 GB)
  Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f1e1a946:711886a6:2604780f:8eba4a2d

    Update Time : Wed Jul  5 21:06:18 2017
       Checksum : 784b0046 - correct
         Events : 119

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : b1e6af5d:e5848ebe:63727445:2ab99719
           Name : keruru:0  (local to host keruru)
  Creation Time : Fri Jun 30 15:42:27 2017
     Raid Level : raid6
   Raid Devices : 4

 Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
     Array Size : 3906765824 (3725.78 GiB 4000.53 GB)
  Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : cf3bc8a7:9feed87d:945d8e77:08f7f32d

    Update Time : Wed Jul  5 21:06:18 2017
       Checksum : 197bc63c - correct
         Events : 119

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sde:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : b1e6af5d:e5848ebe:63727445:2ab99719
           Name : keruru:0  (local to host keruru)
  Creation Time : Fri Jun 30 15:42:27 2017
     Raid Level : raid6
   Raid Devices : 4

 Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
     Array Size : 3906765824 (3725.78 GiB 4000.53 GB)
  Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b3323d81:279b7c7b:a0c534ed:46d0e6fc

    Update Time : Wed Jul  5 21:06:18 2017
       Checksum : 352daaf4 - correct
         Events : 119

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing)
================
root@keruru:/var/log# mdadm --examine /dev/sd[bedc] | egrep 'Event|/dev/sd'
/dev/sdb:
         Events : 119
/dev/sdc:
         Events : 119
/dev/sdd:
         Events : 119
/dev/sde:
         Events : 119
===============
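
The event-count check above can also be scripted. This is a minimal sketch, with a hypothetical helper name (`distinct_event_counts`), run against the sample output copied from the transcript rather than against live devices. Matching event counts only show that the superblocks agree with each other; they say nothing about whether the data on the members is consistent.

```shell
#!/bin/sh
# Sample text copied from the `mdadm --examine ... | egrep` output above.
examine_summary='/dev/sdb:
         Events : 119
/dev/sdc:
         Events : 119
/dev/sdd:
         Events : 119
/dev/sde:
         Events : 119'

# Hypothetical helper: print how many distinct event counts appear.
distinct_event_counts() {
    printf '%s\n' "$1" |
        awk '$1 == "Events" { print $3 }' |
        sort -u |
        wc -l |
        tr -d ' '
}

distinct=$(distinct_event_counts "$examine_summary")
if [ "$distinct" -eq 1 ]; then
    echo "all members report the same event count"
else
    echo "event counts differ across members"
fi
```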
root@keruru:/var/log# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@keruru:/var/log# mdadm --assemble --force /dev/md0 /dev/sdb /dev/sde /dev/sdd /dev/sdc
mdadm: /dev/md0 has been started with 4 drives.
root@keruru:/var/log# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md0 : active (auto-read-only) raid6 sdb[4] sde[3] sdd[2] sdc[1]
      3906765824 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]
      
unused devices: <none>
===============
root@keruru:/var/log# grep Role raid.status
   Device Role : Active device 0
   Device Role : Active device 1
   Device Role : Active device 2
   Device Role : Active device 3
===============
root@keruru:/var/log# grep Used raid.status
  Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
  Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
  Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
  Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
===============
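
As a sanity check on the units involved, the following sketch works through the arithmetic linking the figures in this transcript. It assumes, as the GiB/GB values printed next to each number indicate, that "Used Dev Size" is in 512-byte sectors and that `--size` and the /proc/mdstat block count are in KiB. The variable names are made up for illustration.

```shell
#!/bin/sh
used_dev_sectors=3906765824   # "Used Dev Size" from mdadm --examine (512-byte sectors)
raid_devices=4                # "Raid Devices" from mdadm --examine
parity_devices=2              # RAID 6 keeps two devices' worth of parity

# Per-device size in KiB: two 512-byte sectors per KiB.
per_device_kib=$((used_dev_sectors / 2))

# Usable array size: only the non-parity devices contribute capacity.
data_devices=$((raid_devices - parity_devices))
array_kib=$((per_device_kib * data_devices))
array_bytes=$((array_kib * 1024))

echo "per-device size: $per_device_kib KiB"
echo "array size:      $array_kib KiB"
echo "array capacity:  $array_bytes bytes"
```

The per-device figure matches the `--size=1953382912` passed to `mdadm --create` below, the array size matches the block count in /proc/mdstat, and the byte count matches the capacity the kernel reports in dmesg, so the sizes themselves appear consistent.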
root@keruru:/var/log# mdadm --create --assume-clean --level=6 --raid-devices=4 --size=1953382912 /dev/md0 /dev/sdb /dev/sde /dev/sdd /dev/sdc
mdadm: /dev/sdb appears to be part of a raid array:
    level=raid6 devices=4 ctime=Fri Jun 30 15:42:27 2017
mdadm: partition table exists on /dev/sdb but will be lost or
       meaningless after creating array
mdadm: /dev/sde appears to be part of a raid array:
    level=raid6 devices=4 ctime=Fri Jun 30 15:42:27 2017
mdadm: /dev/sdd appears to be part of a raid array:
    level=raid6 devices=4 ctime=Fri Jun 30 15:42:27 2017
mdadm: /dev/sdc appears to be part of a raid array:
    level=raid6 devices=4 ctime=Fri Jun 30 15:42:27 2017
Continue creating array? n
mdadm: create aborted.
===============
root@keruru:/var/log# mdadm --create --assume-clean --level=6 --raid-devices=4 --size=1953382912 /dev/md0 /dev/sdb /dev/sde /dev/sdd /dev/sdc
mdadm: /dev/sdb appears to be part of a raid array:
    level=raid6 devices=4 ctime=Fri Jun 30 15:42:27 2017
mdadm: partition table exists on /dev/sdb but will be lost or
       meaningless after creating array
mdadm: /dev/sde appears to be part of a raid array:
    level=raid6 devices=4 ctime=Fri Jun 30 15:42:27 2017
mdadm: /dev/sdd appears to be part of a raid array:
    level=raid6 devices=4 ctime=Fri Jun 30 15:42:27 2017
mdadm: /dev/sdc appears to be part of a raid array:
    level=raid6 devices=4 ctime=Fri Jun 30 15:42:27 2017
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
===============
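
With `mdadm --create`, the position of each device on the command line determines its slot in the new array, so it can be useful to compare that order with the Device Role values recorded earlier in raid.status. The sketch below does this for the values shown in this transcript; the variable names are hypothetical, and it assumes the device names had not changed between the `--examine` run and the `--create` run.

```shell
#!/bin/sh
# Roles reported by `mdadm --examine` above, as device:role pairs.
examined_roles="sdb:0 sdc:1 sdd:2 sde:3"
# Device order as passed to `mdadm --create` above.
create_order="sdb sde sdd sdc"

mismatches=""
slot=0
for dev in $create_order; do
    for pair in $examined_roles; do
        # Flag a device whose recorded role differs from its new slot.
        if [ "${pair%%:*}" = "$dev" ] && [ "${pair##*:}" != "$slot" ]; then
            echo "$dev: examined role ${pair##*:}, created in slot $slot"
            mismatches="$mismatches $dev"
        fi
    done
    slot=$((slot + 1))
done

[ -z "$mismatches" ] && echo "create order matches the examined roles"
```

For the values above this flags sde and sdc, which agrees with the "disk 1 ... dev:sde" and "disk 3 ... dev:sdc" lines in the dmesg output further down. Whether that difference matters depends on which order the data was originally written in.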
root@keruru:/# mount -t ext4 /dev/md0 /mnt/md0
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
===============
root@keruru:/# dmesg | tail
[448318.800806]  --- level:6 rd:4 wd:4
[448318.800812]  disk 0, o:1, dev:sdb
[448318.800817]  disk 1, o:1, dev:sde
[448318.800822]  disk 2, o:1, dev:sdd
[448318.800827]  disk 3, o:1, dev:sdc
[448318.800951] md0: detected capacity change from 0 to 4000528203776
[448318.809375]  md0: unknown partition table
[448358.704189] EXT4-fs (md0): Unrecognized mount option "\x08" or missing value
[448358.706680] EXT4-fs (md0): failed to parse options in superblock: \x08
[448358.706690] EXT4-fs (md0): Number of reserved GDT blocks insanely large: 9216
=============== 
Jul 11 17:23:15 keruru kernel: [447719.812775] md: export_rdev(sdd)
Jul 11 17:23:19 keruru kernel: [447724.396327] md: md0 stopped.
Jul 11 17:23:19 keruru kernel: [447724.400278] md: bind<sdc>
Jul 11 17:32:29 keruru kernel: [448273.001687] md0: detected capacity change from 4000528203776 to 0
Jul 11 17:32:29 keruru kernel: [448273.001714] md: md0 stopped.
Jul 11 17:32:29 keruru kernel: [448273.001729] md: unbind<sdc>
Jul 11 17:32:29 keruru kernel: [448273.022972] md: export_rdev(sdc)
Jul 11 17:32:29 keruru kernel: [448273.023143] md: unbind<sde>
Jul 11 17:32:29 keruru kernel: [448273.054889] md: export_rdev(sde)
Jul 11 17:32:29 keruru kernel: [448273.055035] md: unbind<sdd>
Jul 11 17:32:29 keruru kernel: [448273.086870] md: export_rdev(sdd)
Jul 11 17:33:14 keruru kernel: [448318.800827]  disk 3, o:1, dev:sdc
===============



Thread overview: 9+ messages
2017-07-15 23:40 Bogo Mipps [this message]
2017-07-16  0:58 ` Advice please re failed Raid6 Roman Mamedov
2017-07-17  0:19 ` Peter Grandi
2017-07-19  1:52   ` Bogo Mipps
2017-07-19 12:36     ` Peter Grandi
2017-07-20  3:59       ` Bogo Mipps
     [not found]       ` <cf9aac00-91b3-3cb5-bceb-df5d7113b933@gmail.com>
2017-07-21  0:44         ` Bogo Mipps
2017-07-21  9:48           ` Peter Grandi
2017-07-23  0:13             ` Bogo Mipps
