Help, array corrupted after clean shutdown.

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Help, array corrupted after clean shutdown.
@ 2013-04-06 11:24 Oliver Schinagl
  2013-04-06 11:58 ` Mikael Abrahamsson
       [not found] ` <CACj=ugTsNd87z4Uq_KdZa_HJYFNTtxwZJ76bv0GNHUj8D66YTA@mail.gmail.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Oliver Schinagl @ 2013-04-06 11:24 UTC (permalink / raw)
  To: linux-raid

Hi,

I've had a powerfailure today, to which my UPS responded nicely and made 
my server shutdown normally. One would expect everything is well, right? 
The array, as far as I know, was operating without problems before the 
shutdown, all 4 devices where normally online. mdadm sends me an e-mail 
if something is wrong, so does smartctl.

First thing I noticed that I had 2 (S) drives for /dev/md101. I thus 
started examining things. First I thought that it was some mdadm 
weirdness, where it failed to assemble the drive with all components.
mdadm -A /dev/md101 /dev/sd[cdef]1 failed and gave the same result. 
Something was really wrong.

I checked and compared the output of mdadm --examine on all drives (like 
-Evvvs below) and found that /dev/sdc1's events count was wrong.
/dev/sdf1 and /dev/sdd1 matched (and later sde1 too, but more on that in 
a sec). So sdc1 may have been dropped from the array without me knowing 
it, unlikely put possible. The odd thing is the huge difference in event 
counts, but all four are marked as ACTIVE.

So then onto sde1; why was it failing on that. The gpt table was 
completly gone. 00000. Gone. I used hexdump to examine the drive 
further, and at 0x00041000 there was the mdraid table, as one would 
expect. Good, so it looks like only the gpt has been wiped for some 
misterious reason. Re-creating the gpt quickly revealed mdadm's 
information was still correct (as can be seen below).

So ignore sdc1 and assemble the drive as is should be fine? Right? No.
mdadm -A /dev/md101 /dev/sd[def]1 worked without error.

I always do a fsck before and after a reboot (unless of course I can't 
do the shutdown fsck) and verify /proc/mdadm after a boot. So before 
mounting, as always, I tried to run fsck /dev/md101 -C -; but that came 
up with tons of errors. I didn't fix anything and aborted.

And here we are now. I can't just copy the entire disk (1.5TB per disk) 
and 'experiment', I don't have 4 spare disks. The first thing I would 
want to try is is mdadm -A /dev/sd[cdf]1 --force (leave out the possibly 
corrupted sde1) and see what that does.


All that said when I did the assemble with the 'guessed' 3 correct 
drives. Did of course increase the events count. sdc1 of course didn't 
partake in this. Assuming that it is in sync with the rest, what is the 
worst that can happen? And does the --read-only flag protect against it?


Linux riley 3.7.4-gentoo #2 SMP Tue Feb 5 16:20:59 CET 2013 x86_64 AMD 
Phenom(tm) II X4 905e Processor AuthenticAMD GNU/Linux

riley tmp # mdadm --version
mdadm - v3.1.4 - 31st August 2010


riley tmp # mdadm -Evvvvs
/dev/sdf1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 97877935:04c16c5f:0746cb98:63bffb4c

     Update Time : Sat Apr  6 11:46:03 2013
        Checksum : b585717a - correct
          Events : 512993

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 1
    Array State : AA.A ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdf.
/dev/sde1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 776 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 3f48d5a8:e3ee47a1:23c8b895:addd3dd0

     Update Time : Sat Apr  6 11:46:03 2013
        Checksum : eaec006b - correct
          Events : 512993

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 3
    Array State : AA.A ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sde.
/dev/sdd1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 236f6c48:2a1bcf6b:a7d7d861:53950637

     Update Time : Sat Apr  6 11:46:03 2013
        Checksum : 87f31abb - correct
          Events : 512993

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 0
    Array State : AA.A ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdd.
/dev/sdc1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : active
     Device UUID : 3ce8e262:ad864aee:9055af9b:6cbfd47f

     Update Time : Sat Mar 16 20:20:47 2013
        Checksum : a7686a57 - correct
          Events : 180132

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 2
    Array State : AAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdc.


Before I assembled the array for the first time (mdadm -A /dev/md101 
/dev/sdd1 /dev/sde1 /dev/sdf1), this is how it looked like:
So identical to the above, wtih the exception of the number of events.

riley tmp # mdadm --examine /dev/sde1
/dev/sde1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 776 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 3f48d5a8:e3ee47a1:23c8b895:addd3dd0

     Update Time : Sat Apr  6 09:44:30 2013
        Checksum : eaebe3ea - correct
          Events : 512989

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 3
    Array State : AA.A ('A' == active, '.' == missing)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Help, array corrupted after clean shutdown.
  2013-04-06 11:24 Help, array corrupted after clean shutdown Oliver Schinagl
@ 2013-04-06 11:58 ` Mikael Abrahamsson
  2013-04-06 12:04   ` Oliver Schinagl
       [not found] ` <CACj=ugTsNd87z4Uq_KdZa_HJYFNTtxwZJ76bv0GNHUj8D66YTA@mail.gmail.com>
  1 sibling, 1 reply; 9+ messages in thread
From: Mikael Abrahamsson @ 2013-04-06 11:58 UTC (permalink / raw)
  To: Oliver Schinagl; +Cc: linux-raid

On Sat, 6 Apr 2013, Oliver Schinagl wrote:

> All that said when I did the assemble with the 'guessed' 3 correct drives. 
> Did of course increase the events count. sdc1 of course didn't partake in 
> this. Assuming that it is in sync with the rest, what is the worst that can 
> happen? And does the --read-only flag protect against it?

> /dev/sdc1:
>    Update Time : Sat Mar 16 20:20:47 2013
>       Checksum : a7686a57 - correct
>         Events : 180132

As you probably already know, using sdc1 will mean you'll be playing with 
part of the array that is out of date by 3 weeks (this update time 
indicates that sdc1 fell out on Mar 16).

So I would definitely stay away from sdc1. It seems you have made a lot of 
changes lately (your create time is in 2010 and on Mar 16 2013 you were up 
to 180k events, and then now three weeks later the rest of the drives are 
at 513k events, that seems like a of writes has been done to the array the 
past tree weeks?).

So sdc1 is definitely not in sync according to this information. Using it 
is risky, but as a last resort, sure.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Help, array corrupted after clean shutdown.
  2013-04-06 11:58 ` Mikael Abrahamsson
@ 2013-04-06 12:04   ` Oliver Schinagl
  0 siblings, 0 replies; 9+ messages in thread
From: Oliver Schinagl @ 2013-04-06 12:04 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: linux-raid

On 04/06/13 13:58, Mikael Abrahamsson wrote:
> On Sat, 6 Apr 2013, Oliver Schinagl wrote:
>
>> All that said when I did the assemble with the 'guessed' 3 correct
>> drives. Did of course increase the events count. sdc1 of course didn't
>> partake in this. Assuming that it is in sync with the rest, what is
>> the worst that can happen? And does the --read-only flag protect
>> against it?
>
>> /dev/sdc1:
>>    Update Time : Sat Mar 16 20:20:47 2013
>>       Checksum : a7686a57 - correct
>>         Events : 180132
>
> As you probably already know, using sdc1 will mean you'll be playing
> with part of the array that is out of date by 3 weeks (this update time
> indicates that sdc1 fell out on Mar 16).
>
> So I would definitely stay away from sdc1. It seems you have made a lot
> of changes lately (your create time is in 2010 and on Mar 16 2013 you
> were up to 180k events, and then now three weeks later the rest of the
> drives are at 513k events, that seems like a of writes has been done to
> the array the past tree weeks?).
>
> So sdc1 is definitely not in sync according to this information. Using
> it is risky, but as a last resort, sure.
>
While I agree with your findings, I seriously doubt that it did actually 
fall out of the array. Logs do not indicate this fact and I'm not so 
sure there was a huge difference in activity since mar 16th.

Though I will admit I have been cloing, checkout/in a lot of git repo's 
so that could increase this of course, but 3x more then its entire lifetime?

Anyway, last resort sounds where this will be heading, I'm scare of 
running fsck on the array, fixing things and then still have something 
unusable, and even the last resort method not working. So is this all 
doable, in read-only mode. And what are the chances of accessing certain 
data even when doing so?

^ permalink raw reply	[flat|nested] 9+ messages in thread

[parent not found: <CACj=ugTsNd87z4Uq_KdZa_HJYFNTtxwZJ76bv0GNHUj8D66YTA@mail.gmail.com>]

* Re: Help, array corrupted after clean shutdown.
       [not found] ` <CACj=ugTsNd87z4Uq_KdZa_HJYFNTtxwZJ76bv0GNHUj8D66YTA@mail.gmail.com>
@ 2013-04-06 15:14   ` Oliver Schinagl
       [not found]     ` <CACj=ugSH2YBrePTKy3e36H4fcHpKQ8ywxrJoLJwbqtbvOR+pEQ@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Schinagl @ 2013-04-06 15:14 UTC (permalink / raw)
  To: Durval Menezes; +Cc: Linux RAID

On 04/06/13 17:06, Durval Menezes wrote:
> Oliver,
>
> What file system? LVM or direct on the MD device?
Sorry, should have mentioned this.

I have 4 1.5 TB sata drives, connected to the onboard sata controller.

I have made 1 GPT partition ontop of each drive and then made a raid5 
array ontop of those devices:

md101 : active (read-only) raid5 sdd1[0] sde1[4] sdf1[1]
       4395413760 blocks super 1.2 level 5, 256k chunk, algorithm 2 
[4/3] [UU_U]

I then formatted /dev/md101 with ext4.

Tune2fs still happily runs on /dev/md101, but of course that doesn't 
mean anything.

riley tmp # tune2fs -l /dev/md101
tune2fs 1.42 (29-Nov-2011)
Filesystem volume name:   data01
Last mounted on:          /tank/01
Filesystem UUID:          9c812d61-96ce-4b71-9763-b77e8b9618d1
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index 
filetype extent flex_bg sparse_super large_file huge_file uninit_bg 
dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         not clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              274718720
Block count:              1098853440
Reserved block count:     0
Free blocks:              228693396
Free inodes:              274387775
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      762
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
RAID stride:              64
RAID stripe width:        192
Flex block group size:    16
Filesystem created:       Wed Apr 28 16:42:58 2010
Last mount time:          Tue May  4 17:14:48 2010
Last write time:          Sat Apr  6 11:45:57 2013
Mount count:              10
Maximum mount count:      32
Last checked:             Wed Apr 28 16:42:58 2010
Check interval:           15552000 (6 months)
Next check after:         Mon Oct 25 16:42:58 2010
Lifetime writes:          3591 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:	          256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       17
Default directory hash:   half_md4
Directory Hash Seed:      f1248a94-5a6a-4e4a-af8a-68b019d13ef6
Journal backup:           inode blocks


>
> --
>     Durval.
>
> On Apr 6, 2013 8:23 AM, "Oliver Schinagl" <oliver+list@schinagl.nl
> <mailto:oliver%2Blist@schinagl.nl>> wrote:
>
>     Hi,
>
>     I've had a powerfailure today, to which my UPS responded nicely and
>     made my server shutdown normally. One would expect everything is
>     well, right? The array, as far as I know, was operating without
>     problems before the shutdown, all 4 devices where normally online.
>     mdadm sends me an e-mail if something is wrong, so does smartctl.
>
>     First thing I noticed that I had 2 (S) drives for /dev/md101. I thus
>     started examining things. First I thought that it was some mdadm
>     weirdness, where it failed to assemble the drive with all components.
>     mdadm -A /dev/md101 /dev/sd[cdef]1 failed and gave the same result.
>     Something was really wrong.
>
>     I checked and compared the output of mdadm --examine on all drives
>     (like -Evvvs below) and found that /dev/sdc1's events count was wrong.
>     /dev/sdf1 and /dev/sdd1 matched (and later sde1 too, but more on
>     that in a sec). So sdc1 may have been dropped from the array without
>     me knowing it, unlikely put possible. The odd thing is the huge
>     difference in event counts, but all four are marked as ACTIVE.
>
>     So then onto sde1; why was it failing on that. The gpt table was
>     completly gone. 00000. Gone. I used hexdump to examine the drive
>     further, and at 0x00041000 there was the mdraid table, as one would
>     expect. Good, so it looks like only the gpt has been wiped for some
>     misterious reason. Re-creating the gpt quickly revealed mdadm's
>     information was still correct (as can be seen below).
>
>     So ignore sdc1 and assemble the drive as is should be fine? Right? No.
>     mdadm -A /dev/md101 /dev/sd[def]1 worked without error.
>
>     I always do a fsck before and after a reboot (unless of course I
>     can't do the shutdown fsck) and verify /proc/mdadm after a boot. So
>     before mounting, as always, I tried to run fsck /dev/md101 -C -; but
>     that came up with tons of errors. I didn't fix anything and aborted.
>
>     And here we are now. I can't just copy the entire disk (1.5TB per
>     disk) and 'experiment', I don't have 4 spare disks. The first thing
>     I would want to try is is mdadm -A /dev/sd[cdf]1 --force (leave out
>     the possibly corrupted sde1) and see what that does.
>
>
>     All that said when I did the assemble with the 'guessed' 3 correct
>     drives. Did of course increase the events count. sdc1 of course
>     didn't partake in this. Assuming that it is in sync with the rest,
>     what is the worst that can happen? And does the --read-only flag
>     protect against it?
>
>
>     Linux riley 3.7.4-gentoo #2 SMP Tue Feb 5 16:20:59 CET 2013 x86_64
>     AMD Phenom(tm) II X4 905e Processor AuthenticAMD GNU/Linux
>
>     riley tmp # mdadm --version
>     mdadm - v3.1.4 - 31st August 2010
>
>
>     riley tmp # mdadm -Evvvvs
>     /dev/sdf1:
>                Magic : a92b4efc
>              Version : 1.2
>          Feature Map : 0x0
>           Array UUID : 2becc012:2d317133:2447784c:__1aab300d
>                 Name : riley:data01  (local to host riley)
>        Creation Time : Tue Apr 27 18:03:37 2010
>           Raid Level : raid5
>         Raid Devices : 4
>
>       Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
>           Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>        Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>          Data Offset : 272 sectors
>         Super Offset : 8 sectors
>                State : clean
>          Device UUID : 97877935:04c16c5f:0746cb98:__63bffb4c
>
>          Update Time : Sat Apr  6 11:46:03 2013
>             Checksum : b585717a - correct
>               Events : 512993
>
>               Layout : left-symmetric
>           Chunk Size : 256K
>
>         Device Role : Active device 1
>         Array State : AA.A ('A' == active, '.' == missing)
>     mdadm: No md superblock detected on /dev/sdf.
>     /dev/sde1:
>                Magic : a92b4efc
>              Version : 1.2
>          Feature Map : 0x0
>           Array UUID : 2becc012:2d317133:2447784c:__1aab300d
>                 Name : riley:data01  (local to host riley)
>        Creation Time : Tue Apr 27 18:03:37 2010
>           Raid Level : raid5
>         Raid Devices : 4
>
>       Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
>           Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>        Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>          Data Offset : 776 sectors
>         Super Offset : 8 sectors
>                State : clean
>          Device UUID : 3f48d5a8:e3ee47a1:23c8b895:__addd3dd0
>
>          Update Time : Sat Apr  6 11:46:03 2013
>             Checksum : eaec006b - correct
>               Events : 512993
>
>               Layout : left-symmetric
>           Chunk Size : 256K
>
>         Device Role : Active device 3
>         Array State : AA.A ('A' == active, '.' == missing)
>     mdadm: No md superblock detected on /dev/sde.
>     /dev/sdd1:
>                Magic : a92b4efc
>              Version : 1.2
>          Feature Map : 0x0
>           Array UUID : 2becc012:2d317133:2447784c:__1aab300d
>                 Name : riley:data01  (local to host riley)
>        Creation Time : Tue Apr 27 18:03:37 2010
>           Raid Level : raid5
>         Raid Devices : 4
>
>       Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
>           Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>        Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>          Data Offset : 272 sectors
>         Super Offset : 8 sectors
>                State : clean
>          Device UUID : 236f6c48:2a1bcf6b:a7d7d861:__53950637
>
>          Update Time : Sat Apr  6 11:46:03 2013
>             Checksum : 87f31abb - correct
>               Events : 512993
>
>               Layout : left-symmetric
>           Chunk Size : 256K
>
>         Device Role : Active device 0
>         Array State : AA.A ('A' == active, '.' == missing)
>     mdadm: No md superblock detected on /dev/sdd.
>     /dev/sdc1:
>                Magic : a92b4efc
>              Version : 1.2
>          Feature Map : 0x0
>           Array UUID : 2becc012:2d317133:2447784c:__1aab300d
>                 Name : riley:data01  (local to host riley)
>        Creation Time : Tue Apr 27 18:03:37 2010
>           Raid Level : raid5
>         Raid Devices : 4
>
>       Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
>           Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>        Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>          Data Offset : 272 sectors
>         Super Offset : 8 sectors
>                State : active
>          Device UUID : 3ce8e262:ad864aee:9055af9b:__6cbfd47f
>
>          Update Time : Sat Mar 16 20:20:47 2013
>             Checksum : a7686a57 - correct
>               Events : 180132
>
>               Layout : left-symmetric
>           Chunk Size : 256K
>
>         Device Role : Active device 2
>         Array State : AAAA ('A' == active, '.' == missing)
>     mdadm: No md superblock detected on /dev/sdc.
>
>
>     Before I assembled the array for the first time (mdadm -A /dev/md101
>     /dev/sdd1 /dev/sde1 /dev/sdf1), this is how it looked like:
>     So identical to the above, wtih the exception of the number of events.
>
>     riley tmp # mdadm --examine /dev/sde1
>     /dev/sde1:
>                Magic : a92b4efc
>              Version : 1.2
>          Feature Map : 0x0
>           Array UUID : 2becc012:2d317133:2447784c:__1aab300d
>                 Name : riley:data01  (local to host riley)
>        Creation Time : Tue Apr 27 18:03:37 2010
>           Raid Level : raid5
>         Raid Devices : 4
>
>       Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
>           Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>        Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>          Data Offset : 776 sectors
>         Super Offset : 8 sectors
>                State : clean
>          Device UUID : 3f48d5a8:e3ee47a1:23c8b895:__addd3dd0
>
>          Update Time : Sat Apr  6 09:44:30 2013
>             Checksum : eaebe3ea - correct
>               Events : 512989
>
>               Layout : left-symmetric
>           Chunk Size : 256K
>
>         Device Role : Active device 3
>         Array State : AA.A ('A' == active, '.' == missing)
>
>     --
>     To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>     the body of a message to majordomo@vger.kernel.org
>     <mailto:majordomo@vger.kernel.org>
>     More majordomo info at http://vger.kernel.org/__majordomo-info.html
>     <http://vger.kernel.org/majordomo-info.html>
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

[parent not found: <CACj=ugSH2YBrePTKy3e36H4fcHpKQ8ywxrJoLJwbqtbvOR+pEQ@mail.gmail.com>]

* Re: Help, array corrupted after clean shutdown.
       [not found]     ` <CACj=ugSH2YBrePTKy3e36H4fcHpKQ8ywxrJoLJwbqtbvOR+pEQ@mail.gmail.com>
@ 2013-04-06 18:01       ` Oliver Schinagl
       [not found]         ` <CACj=ugQR6hjw0qchJiOtgyWd8VRGs_pkZCBXHbQwjrKFz4u=Xg@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Schinagl @ 2013-04-06 18:01 UTC (permalink / raw)
  To: Durval Menezes; +Cc: Linux RAID

On 04/06/13 19:44, Durval Menezes wrote:
> Hi Oliver,
>
> Seems most of your problems are filesystem corruption (the extN family
> is well known for lack of robustness).
>
> I would try to mount the filesystem read-only (without fsck) and copy
> off as much data as possible... Then fsck and try to copy the rest.
>
> Good luck.
It fails to mount ;)

How can I ensure that the array is not corrupt however (while degraded)? 
At least that way, I can try my luck with ext4 tools.

>
> --
>     Durval.
>
> On Apr 6, 2013 12:13 PM, "Oliver Schinagl" <oliver+list@schinagl.nl
> <mailto:oliver%2Blist@schinagl.nl>> wrote:
>
>     On 04/06/13 17:06, Durval Menezes wrote:
>
>         Oliver,
>
>         What file system? LVM or direct on the MD device?
>
>     Sorry, should have mentioned this.
>
>     I have 4 1.5 TB sata drives, connected to the onboard sata controller.
>
>     I have made 1 GPT partition ontop of each drive and then made a
>     raid5 array ontop of those devices:
>
>     md101 : active (read-only) raid5 sdd1[0] sde1[4] sdf1[1]
>            4395413760 blocks super 1.2 level 5, 256k chunk, algorithm 2
>     [4/3] [UU_U]
>
>     I then formatted /dev/md101 with ext4.
>
>     Tune2fs still happily runs on /dev/md101, but of course that doesn't
>     mean anything.
>
>     riley tmp # tune2fs -l /dev/md101
>     tune2fs 1.42 (29-Nov-2011)
>     Filesystem volume name:   data01
>     Last mounted on:          /tank/01
>     Filesystem UUID:          9c812d61-96ce-4b71-9763-__b77e8b9618d1
>     Filesystem magic number:  0xEF53
>     Filesystem revision #:    1 (dynamic)
>     Filesystem features:      has_journal ext_attr resize_inode
>     dir_index filetype extent flex_bg sparse_super large_file huge_file
>     uninit_bg dir_nlink extra_isize
>     Filesystem flags:         signed_directory_hash
>     Default mount options:    (none)
>     Filesystem state:         not clean
>     Errors behavior:          Continue
>     Filesystem OS type:       Linux
>     Inode count:              274718720
>     Block count:              1098853440
>     Reserved block count:     0
>     Free blocks:              228693396
>     Free inodes:              274387775
>     First block:              0
>     Block size:               4096
>     Fragment size:            4096
>     Reserved GDT blocks:      762
>     Blocks per group:         32768
>     Fragments per group:      32768
>     Inodes per group:         8192
>     Inode blocks per group:   512
>     RAID stride:              64
>     RAID stripe width:        192
>     Flex block group size:    16
>     Filesystem created:       Wed Apr 28 16:42:58 2010
>     Last mount time:          Tue May  4 17:14:48 2010
>     Last write time:          Sat Apr  6 11:45:57 2013
>     Mount count:              10
>     Maximum mount count:      32
>     Last checked:             Wed Apr 28 16:42:58 2010
>     Check interval:           15552000 (6 months)
>     Next check after:         Mon Oct 25 16:42:58 2010
>     Lifetime writes:          3591 GB
>     Reserved blocks uid:      0 (user root)
>     Reserved blocks gid:      0 (group root)
>     First inode:              11
>     Inode size:               256
>     Required extra isize:     28
>     Desired extra isize:      28
>     Journal inode:            8
>     First orphan inode:       17
>     Default directory hash:   half_md4
>     Directory Hash Seed:      f1248a94-5a6a-4e4a-af8a-__68b019d13ef6
>     Journal backup:           inode blocks
>
>
>
>         --
>              Durval.
>
>         On Apr 6, 2013 8:23 AM, "Oliver Schinagl"
>         <oliver+list@schinagl.nl <mailto:oliver%2Blist@schinagl.nl>
>         <mailto:oliver%2Blist@__schinagl.nl
>         <mailto:oliver%252Blist@schinagl.nl>>> wrote:
>
>              Hi,
>
>              I've had a powerfailure today, to which my UPS responded
>         nicely and
>              made my server shutdown normally. One would expect
>         everything is
>              well, right? The array, as far as I know, was operating without
>              problems before the shutdown, all 4 devices where normally
>         online.
>              mdadm sends me an e-mail if something is wrong, so does
>         smartctl.
>
>              First thing I noticed that I had 2 (S) drives for
>         /dev/md101. I thus
>              started examining things. First I thought that it was some
>         mdadm
>              weirdness, where it failed to assemble the drive with all
>         components.
>              mdadm -A /dev/md101 /dev/sd[cdef]1 failed and gave the same
>         result.
>              Something was really wrong.
>
>              I checked and compared the output of mdadm --examine on all
>         drives
>              (like -Evvvs below) and found that /dev/sdc1's events count
>         was wrong.
>              /dev/sdf1 and /dev/sdd1 matched (and later sde1 too, but
>         more on
>              that in a sec). So sdc1 may have been dropped from the
>         array without
>              me knowing it, unlikely put possible. The odd thing is the huge
>              difference in event counts, but all four are marked as ACTIVE.
>
>              So then onto sde1; why was it failing on that. The gpt
>         table was
>              completly gone. 00000. Gone. I used hexdump to examine the
>         drive
>              further, and at 0x00041000 there was the mdraid table, as
>         one would
>              expect. Good, so it looks like only the gpt has been wiped
>         for some
>              misterious reason. Re-creating the gpt quickly revealed mdadm's
>              information was still correct (as can be seen below).
>
>              So ignore sdc1 and assemble the drive as is should be fine?
>         Right? No.
>              mdadm -A /dev/md101 /dev/sd[def]1 worked without error.
>
>              I always do a fsck before and after a reboot (unless of
>         course I
>              can't do the shutdown fsck) and verify /proc/mdadm after a
>         boot. So
>              before mounting, as always, I tried to run fsck /dev/md101
>         -C -; but
>              that came up with tons of errors. I didn't fix anything and
>         aborted.
>
>              And here we are now. I can't just copy the entire disk
>         (1.5TB per
>              disk) and 'experiment', I don't have 4 spare disks. The
>         first thing
>              I would want to try is is mdadm -A /dev/sd[cdf]1 --force
>         (leave out
>              the possibly corrupted sde1) and see what that does.
>
>
>              All that said when I did the assemble with the 'guessed' 3
>         correct
>              drives. Did of course increase the events count. sdc1 of course
>              didn't partake in this. Assuming that it is in sync with
>         the rest,
>              what is the worst that can happen? And does the --read-only
>         flag
>              protect against it?
>
>
>              Linux riley 3.7.4-gentoo #2 SMP Tue Feb 5 16:20:59 CET 2013
>         x86_64
>              AMD Phenom(tm) II X4 905e Processor AuthenticAMD GNU/Linux
>
>              riley tmp # mdadm --version
>              mdadm - v3.1.4 - 31st August 2010
>
>
>              riley tmp # mdadm -Evvvvs
>              /dev/sdf1:
>                         Magic : a92b4efc
>                       Version : 1.2
>                   Feature Map : 0x0
>                    Array UUID : 2becc012:2d317133:2447784c:____1aab300d
>                          Name : riley:data01  (local to host riley)
>                 Creation Time : Tue Apr 27 18:03:37 2010
>                    Raid Level : raid5
>                  Raid Devices : 4
>
>                Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
>                    Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>                 Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>                   Data Offset : 272 sectors
>                  Super Offset : 8 sectors
>                         State : clean
>                   Device UUID : 97877935:04c16c5f:0746cb98:____63bffb4c
>
>                   Update Time : Sat Apr  6 11:46:03 2013
>                      Checksum : b585717a - correct
>                        Events : 512993
>
>                        Layout : left-symmetric
>                    Chunk Size : 256K
>
>                  Device Role : Active device 1
>                  Array State : AA.A ('A' == active, '.' == missing)
>              mdadm: No md superblock detected on /dev/sdf.
>              /dev/sde1:
>                         Magic : a92b4efc
>                       Version : 1.2
>                   Feature Map : 0x0
>                    Array UUID : 2becc012:2d317133:2447784c:____1aab300d
>                          Name : riley:data01  (local to host riley)
>                 Creation Time : Tue Apr 27 18:03:37 2010
>                    Raid Level : raid5
>                  Raid Devices : 4
>
>                Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
>                    Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>                 Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>                   Data Offset : 776 sectors
>                  Super Offset : 8 sectors
>                         State : clean
>                   Device UUID : 3f48d5a8:e3ee47a1:23c8b895:____addd3dd0
>
>                   Update Time : Sat Apr  6 11:46:03 2013
>                      Checksum : eaec006b - correct
>                        Events : 512993
>
>                        Layout : left-symmetric
>                    Chunk Size : 256K
>
>                  Device Role : Active device 3
>                  Array State : AA.A ('A' == active, '.' == missing)
>              mdadm: No md superblock detected on /dev/sde.
>              /dev/sdd1:
>                         Magic : a92b4efc
>                       Version : 1.2
>                   Feature Map : 0x0
>                    Array UUID : 2becc012:2d317133:2447784c:____1aab300d
>                          Name : riley:data01  (local to host riley)
>                 Creation Time : Tue Apr 27 18:03:37 2010
>                    Raid Level : raid5
>                  Raid Devices : 4
>
>                Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
>                    Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>                 Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>                   Data Offset : 272 sectors
>                  Super Offset : 8 sectors
>                         State : clean
>                   Device UUID : 236f6c48:2a1bcf6b:a7d7d861:____53950637
>
>                   Update Time : Sat Apr  6 11:46:03 2013
>                      Checksum : 87f31abb - correct
>                        Events : 512993
>
>                        Layout : left-symmetric
>                    Chunk Size : 256K
>
>                  Device Role : Active device 0
>                  Array State : AA.A ('A' == active, '.' == missing)
>              mdadm: No md superblock detected on /dev/sdd.
>              /dev/sdc1:
>                         Magic : a92b4efc
>                       Version : 1.2
>                   Feature Map : 0x0
>                    Array UUID : 2becc012:2d317133:2447784c:____1aab300d
>                          Name : riley:data01  (local to host riley)
>                 Creation Time : Tue Apr 27 18:03:37 2010
>                    Raid Level : raid5
>                  Raid Devices : 4
>
>                Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
>                    Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>                 Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>                   Data Offset : 272 sectors
>                  Super Offset : 8 sectors
>                         State : active
>                   Device UUID : 3ce8e262:ad864aee:9055af9b:____6cbfd47f
>
>                   Update Time : Sat Mar 16 20:20:47 2013
>                      Checksum : a7686a57 - correct
>                        Events : 180132
>
>                        Layout : left-symmetric
>                    Chunk Size : 256K
>
>                  Device Role : Active device 2
>                  Array State : AAAA ('A' == active, '.' == missing)
>              mdadm: No md superblock detected on /dev/sdc.
>
>
>              Before I assembled the array for the first time (mdadm -A
>         /dev/md101
>              /dev/sdd1 /dev/sde1 /dev/sdf1), this is how it looked like:
>              So identical to the above, wtih the exception of the number
>         of events.
>
>              riley tmp # mdadm --examine /dev/sde1
>              /dev/sde1:
>                         Magic : a92b4efc
>                       Version : 1.2
>                   Feature Map : 0x0
>                    Array UUID : 2becc012:2d317133:2447784c:____1aab300d
>                          Name : riley:data01  (local to host riley)
>                 Creation Time : Tue Apr 27 18:03:37 2010
>                    Raid Level : raid5
>                  Raid Devices : 4
>
>                Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
>                    Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
>                 Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
>                   Data Offset : 776 sectors
>                  Super Offset : 8 sectors
>                         State : clean
>                   Device UUID : 3f48d5a8:e3ee47a1:23c8b895:____addd3dd0
>
>                   Update Time : Sat Apr  6 09:44:30 2013
>                      Checksum : eaebe3ea - correct
>                        Events : 512989
>
>                        Layout : left-symmetric
>                    Chunk Size : 256K
>
>                  Device Role : Active device 3
>                  Array State : AA.A ('A' == active, '.' == missing)
>
>              --
>              To unsubscribe from this list: send the line "unsubscribe
>         linux-raid" in
>              the body of a message to majordomo@vger.kernel.org
>         <mailto:majordomo@vger.kernel.org>
>              <mailto:majordomo@vger.kernel.__org
>         <mailto:majordomo@vger.kernel.org>>
>              More majordomo info at
>         http://vger.kernel.org/____majordomo-info.html
>         <http://vger.kernel.org/__majordomo-info.html>
>              <http://vger.kernel.org/__majordomo-info.html
>         <http://vger.kernel.org/majordomo-info.html>>
>
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

[parent not found: <CACj=ugQR6hjw0qchJiOtgyWd8VRGs_pkZCBXHbQwjrKFz4u=Xg@mail.gmail.com>]

* Re: Help, array corrupted after clean shutdown.
       [not found]         ` <CACj=ugQR6hjw0qchJiOtgyWd8VRGs_pkZCBXHbQwjrKFz4u=Xg@mail.gmail.com>
@ 2013-04-07 15:32           ` Oliver Schinagl
  2013-04-08  8:10             ` Durval Menezes
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Schinagl @ 2013-04-07 15:32 UTC (permalink / raw)
  Cc: Linux RAID

On 06-04-13 20:59, Durval Menezes wrote:
> Hi Oliver,
>
> On Sat, Apr 6, 2013 at 3:01 PM, Oliver Schinagl 
> <oliver+list@schinagl.nl <mailto:oliver+list@schinagl.nl>> wrote:
>
>     On 04/06/13 19:44, Durval Menezes wrote:
>
>         Hi Oliver,
>
>         Seems most of your problems are filesystem corruption (the
>         extN family
>         is well known for lack of robustness).
>
>         I would try to mount the filesystem read-only (without fsck)
>         and copy
>         off as much data as possible... Then fsck and try to copy the
>         rest.
>
>         Good luck.
>
>     It fails to mount ;)
>
>     How can I ensure that the array is not corrupt however (while
>     degraded)? At least that way, I can try my luck with ext4 tools.
>
>
> If the array was not degraded, I would try an array check:
>
> |echo check > /sys/block/md0/md/sync_action|
>
> Then, if you had no (or very little) mismatches, I would consider it 
> OK. But as your array is in degraded mode, you have no redundancy to 
> enable you to check... :-/
I guess the 'order' wouldn't have mattered. I would have expected some 
very basic check was available.

Maybe for raid8 :p; Thinking along the lines, every block has an id, and 
each stripe has maching id's. If the id's no longer match, something is 
wrong. Would probably only waste space in the end.

Anyhow, I may have panicked a little to early. mount did indeed fail to 
mount, checking dmesg revealed a little more:
[  117.665385] EXT4-fs (md102): mounted filesystem with writeback data 
mode. Opts: commit=120,data=writeback
[  126.743000] EXT4-fs (md101): ext4_check_descriptors: Checksum for 
group 0 failed (42475!=15853)
[  126.743003] EXT4-fs (md101): group descriptors corrupted!

I asked on linux-ext4 what could be going wrong, fsck-ing -n does show 
(all?) group-descriptors not matching.
Mounting ro however works and all data appears to be correct from a 
quick investigation (my virtual machines start normally, so if that is 
ok, the rest must be too. I am now in the progress of copying, and rsycn 
-car the data to a temporary spot. Thanks for all the help though, I 
probably would have kept trying to fix the array first.

I'm still wondering why my entire (and only the) partition table was gone.
>
> Cheers,
> -- 
>    Durval.
>
>
>
>
>         --
>             Durval.
>
>         On Apr 6, 2013 12:13 PM, "Oliver Schinagl"
>         <oliver+list@schinagl.nl <mailto:oliver%2Blist@schinagl.nl>
>         <mailto:oliver%2Blist@schinagl.nl
>         <mailto:oliver%252Blist@schinagl.nl>>> wrote:
>
>             On 04/06/13 17:06, Durval Menezes wrote:
>
>                 Oliver,
>
>                 What file system? LVM or direct on the MD device?
>
>             Sorry, should have mentioned this.
>
>             I have 4 1.5 TB sata drives, connected to the onboard sata
>         controller.
>
>             I have made 1 GPT partition ontop of each drive and then
>         made a
>             raid5 array ontop of those devices:
>
>             md101 : active (read-only) raid5 sdd1[0] sde1[4] sdf1[1]
>                    4395413760 blocks super 1.2 level 5, 256k chunk,
>         algorithm 2
>             [4/3] [UU_U]
>
>             I then formatted /dev/md101 with ext4.
>
>             Tune2fs still happily runs on /dev/md101, but of course
>         that doesn't
>             mean anything.
>
>             riley tmp # tune2fs -l /dev/md101
>             tune2fs 1.42 (29-Nov-2011)
>             Filesystem volume name:   data01
>             Last mounted on:          /tank/01
>             Filesystem UUID:  9c812d61-96ce-4b71-9763-__b77e8b9618d1
>
>             Filesystem magic number:  0xEF53
>             Filesystem revision #:    1 (dynamic)
>             Filesystem features:      has_journal ext_attr resize_inode
>             dir_index filetype extent flex_bg sparse_super large_file
>         huge_file
>             uninit_bg dir_nlink extra_isize
>             Filesystem flags:         signed_directory_hash
>             Default mount options:    (none)
>             Filesystem state:         not clean
>             Errors behavior:          Continue
>             Filesystem OS type:       Linux
>             Inode count:              274718720
>             Block count:              1098853440
>             Reserved block count:     0
>             Free blocks:              228693396
>             Free inodes:              274387775
>             First block:              0
>             Block size:               4096
>             Fragment size:            4096
>             Reserved GDT blocks:      762
>             Blocks per group:         32768
>             Fragments per group:      32768
>             Inodes per group:         8192
>             Inode blocks per group:   512
>             RAID stride:              64
>             RAID stripe width:        192
>             Flex block group size:    16
>             Filesystem created:       Wed Apr 28 16:42:58 2010
>             Last mount time:          Tue May  4 17:14:48 2010
>             Last write time:          Sat Apr  6 11:45:57 2013
>             Mount count:              10
>             Maximum mount count:      32
>             Last checked:             Wed Apr 28 16:42:58 2010
>             Check interval:           15552000 (6 months)
>             Next check after:         Mon Oct 25 16:42:58 2010
>             Lifetime writes:          3591 GB
>             Reserved blocks uid:      0 (user root)
>             Reserved blocks gid:      0 (group root)
>             First inode:              11
>             Inode size:               256
>             Required extra isize:     28
>             Desired extra isize:      28
>             Journal inode:            8
>             First orphan inode:       17
>             Default directory hash:   half_md4
>             Directory Hash Seed:  f1248a94-5a6a-4e4a-af8a-__68b019d13ef6
>
>             Journal backup:           inode blocks
>
>
>
>                 --
>                      Durval.
>
>                 On Apr 6, 2013 8:23 AM, "Oliver Schinagl"
>                 <oliver+list@schinagl.nl
>         <mailto:oliver%2Blist@schinagl.nl>
>         <mailto:oliver%2Blist@schinagl.nl
>         <mailto:oliver%252Blist@schinagl.nl>>
>                 <mailto:oliver%2Blist@
>         <mailto:oliver%252Blist@>__schinagl.nl <http://schinagl.nl>
>
>                 <mailto:oliver%252Blist@schinagl.nl
>         <mailto:oliver%25252Blist@schinagl.nl>>>> wrote:
>
>                      Hi,
>
>                      I've had a powerfailure today, to which my UPS
>         responded
>                 nicely and
>                      made my server shutdown normally. One would expect
>                 everything is
>                      well, right? The array, as far as I know, was
>         operating without
>                      problems before the shutdown, all 4 devices where
>         normally
>                 online.
>                      mdadm sends me an e-mail if something is wrong,
>         so does
>                 smartctl.
>
>                      First thing I noticed that I had 2 (S) drives for
>                 /dev/md101. I thus
>                      started examining things. First I thought that it
>         was some
>                 mdadm
>                      weirdness, where it failed to assemble the drive
>         with all
>                 components.
>                      mdadm -A /dev/md101 /dev/sd[cdef]1 failed and
>         gave the same
>                 result.
>                      Something was really wrong.
>
>                      I checked and compared the output of mdadm
>         --examine on all
>                 drives
>                      (like -Evvvs below) and found that /dev/sdc1's
>         events count
>                 was wrong.
>                      /dev/sdf1 and /dev/sdd1 matched (and later sde1
>         too, but
>                 more on
>                      that in a sec). So sdc1 may have been dropped
>         from the
>                 array without
>                      me knowing it, unlikely put possible. The odd
>         thing is the huge
>                      difference in event counts, but all four are
>         marked as ACTIVE.
>
>                      So then onto sde1; why was it failing on that.
>         The gpt
>                 table was
>                      completly gone. 00000. Gone. I used hexdump to
>         examine the
>                 drive
>                      further, and at 0x00041000 there was the mdraid
>         table, as
>                 one would
>                      expect. Good, so it looks like only the gpt has
>         been wiped
>                 for some
>                      misterious reason. Re-creating the gpt quickly
>         revealed mdadm's
>                      information was still correct (as can be seen below).
>
>                      So ignore sdc1 and assemble the drive as is
>         should be fine?
>                 Right? No.
>                      mdadm -A /dev/md101 /dev/sd[def]1 worked without
>         error.
>
>                      I always do a fsck before and after a reboot
>         (unless of
>                 course I
>                      can't do the shutdown fsck) and verify
>         /proc/mdadm after a
>                 boot. So
>                      before mounting, as always, I tried to run fsck
>         /dev/md101
>                 -C -; but
>                      that came up with tons of errors. I didn't fix
>         anything and
>                 aborted.
>
>                      And here we are now. I can't just copy the entire
>         disk
>                 (1.5TB per
>                      disk) and 'experiment', I don't have 4 spare
>         disks. The
>                 first thing
>                      I would want to try is is mdadm -A /dev/sd[cdf]1
>         --force
>                 (leave out
>                      the possibly corrupted sde1) and see what that does.
>
>
>                      All that said when I did the assemble with the
>         'guessed' 3
>                 correct
>                      drives. Did of course increase the events count.
>         sdc1 of course
>                      didn't partake in this. Assuming that it is in
>         sync with
>                 the rest,
>                      what is the worst that can happen? And does the
>         --read-only
>                 flag
>                      protect against it?
>
>
>                      Linux riley 3.7.4-gentoo #2 SMP Tue Feb 5
>         16:20:59 CET 2013
>                 x86_64
>                      AMD Phenom(tm) II X4 905e Processor AuthenticAMD
>         GNU/Linux
>
>                      riley tmp # mdadm --version
>                      mdadm - v3.1.4 - 31st August 2010
>
>
>                      riley tmp # mdadm -Evvvvs
>                      /dev/sdf1:
>                                 Magic : a92b4efc
>                               Version : 1.2
>                           Feature Map : 0x0
>                            Array UUID :
>         2becc012:2d317133:2447784c:____1aab300d
>
>                                  Name : riley:data01  (local to host
>         riley)
>                         Creation Time : Tue Apr 27 18:03:37 2010
>                            Raid Level : raid5
>                          Raid Devices : 4
>
>                        Avail Dev Size : 2930276351 (1397.26 GiB
>         1500.30 GB)
>                            Array Size : 8790827520 (4191.79 GiB
>         4500.90 GB)
>                         Used Dev Size : 2930275840 (1397.26 GiB
>         1500.30 GB)
>                           Data Offset : 272 sectors
>                          Super Offset : 8 sectors
>                                 State : clean
>                           Device UUID :
>         97877935:04c16c5f:0746cb98:____63bffb4c
>
>
>                           Update Time : Sat Apr  6 11:46:03 2013
>                              Checksum : b585717a - correct
>                                Events : 512993
>
>                                Layout : left-symmetric
>                            Chunk Size : 256K
>
>                          Device Role : Active device 1
>                          Array State : AA.A ('A' == active, '.' ==
>         missing)
>                      mdadm: No md superblock detected on /dev/sdf.
>                      /dev/sde1:
>                                 Magic : a92b4efc
>                               Version : 1.2
>                           Feature Map : 0x0
>                            Array UUID :
>         2becc012:2d317133:2447784c:____1aab300d
>
>                                  Name : riley:data01  (local to host
>         riley)
>                         Creation Time : Tue Apr 27 18:03:37 2010
>                            Raid Level : raid5
>                          Raid Devices : 4
>
>                        Avail Dev Size : 2930275847 (1397.26 GiB
>         1500.30 GB)
>                            Array Size : 8790827520 (4191.79 GiB
>         4500.90 GB)
>                         Used Dev Size : 2930275840 (1397.26 GiB
>         1500.30 GB)
>                           Data Offset : 776 sectors
>                          Super Offset : 8 sectors
>                                 State : clean
>                           Device UUID :
>         3f48d5a8:e3ee47a1:23c8b895:____addd3dd0
>
>
>                           Update Time : Sat Apr  6 11:46:03 2013
>                              Checksum : eaec006b - correct
>                                Events : 512993
>
>                                Layout : left-symmetric
>                            Chunk Size : 256K
>
>                          Device Role : Active device 3
>                          Array State : AA.A ('A' == active, '.' ==
>         missing)
>                      mdadm: No md superblock detected on /dev/sde.
>                      /dev/sdd1:
>                                 Magic : a92b4efc
>                               Version : 1.2
>                           Feature Map : 0x0
>                            Array UUID :
>         2becc012:2d317133:2447784c:____1aab300d
>
>                                  Name : riley:data01  (local to host
>         riley)
>                         Creation Time : Tue Apr 27 18:03:37 2010
>                            Raid Level : raid5
>                          Raid Devices : 4
>
>                        Avail Dev Size : 2930276351 (1397.26 GiB
>         1500.30 GB)
>                            Array Size : 8790827520 (4191.79 GiB
>         4500.90 GB)
>                         Used Dev Size : 2930275840 (1397.26 GiB
>         1500.30 GB)
>                           Data Offset : 272 sectors
>                          Super Offset : 8 sectors
>                                 State : clean
>                           Device UUID :
>         236f6c48:2a1bcf6b:a7d7d861:____53950637
>
>
>                           Update Time : Sat Apr  6 11:46:03 2013
>                              Checksum : 87f31abb - correct
>                                Events : 512993
>
>                                Layout : left-symmetric
>                            Chunk Size : 256K
>
>                          Device Role : Active device 0
>                          Array State : AA.A ('A' == active, '.' ==
>         missing)
>                      mdadm: No md superblock detected on /dev/sdd.
>                      /dev/sdc1:
>                                 Magic : a92b4efc
>                               Version : 1.2
>                           Feature Map : 0x0
>                            Array UUID :
>         2becc012:2d317133:2447784c:____1aab300d
>
>                                  Name : riley:data01  (local to host
>         riley)
>                         Creation Time : Tue Apr 27 18:03:37 2010
>                            Raid Level : raid5
>                          Raid Devices : 4
>
>                        Avail Dev Size : 2930276351 (1397.26 GiB
>         1500.30 GB)
>                            Array Size : 8790827520 (4191.79 GiB
>         4500.90 GB)
>                         Used Dev Size : 2930275840 (1397.26 GiB
>         1500.30 GB)
>                           Data Offset : 272 sectors
>                          Super Offset : 8 sectors
>                                 State : active
>                           Device UUID :
>         3ce8e262:ad864aee:9055af9b:____6cbfd47f
>
>
>                           Update Time : Sat Mar 16 20:20:47 2013
>                              Checksum : a7686a57 - correct
>                                Events : 180132
>
>                                Layout : left-symmetric
>                            Chunk Size : 256K
>
>                          Device Role : Active device 2
>                          Array State : AAAA ('A' == active, '.' ==
>         missing)
>                      mdadm: No md superblock detected on /dev/sdc.
>
>
>                      Before I assembled the array for the first time
>         (mdadm -A
>                 /dev/md101
>                      /dev/sdd1 /dev/sde1 /dev/sdf1), this is how it
>         looked like:
>                      So identical to the above, wtih the exception of
>         the number
>                 of events.
>
>                      riley tmp # mdadm --examine /dev/sde1
>                      /dev/sde1:
>                                 Magic : a92b4efc
>                               Version : 1.2
>                           Feature Map : 0x0
>                            Array UUID :
>         2becc012:2d317133:2447784c:____1aab300d
>
>                                  Name : riley:data01  (local to host
>         riley)
>                         Creation Time : Tue Apr 27 18:03:37 2010
>                            Raid Level : raid5
>                          Raid Devices : 4
>
>                        Avail Dev Size : 2930275847 (1397.26 GiB
>         1500.30 GB)
>                            Array Size : 8790827520 (4191.79 GiB
>         4500.90 GB)
>                         Used Dev Size : 2930275840 (1397.26 GiB
>         1500.30 GB)
>                           Data Offset : 776 sectors
>                          Super Offset : 8 sectors
>                                 State : clean
>                           Device UUID :
>         3f48d5a8:e3ee47a1:23c8b895:____addd3dd0
>
>
>                           Update Time : Sat Apr  6 09:44:30 2013
>                              Checksum : eaebe3ea - correct
>                                Events : 512989
>
>                                Layout : left-symmetric
>                            Chunk Size : 256K
>
>                          Device Role : Active device 3
>                          Array State : AA.A ('A' == active, '.' ==
>         missing)
>
>                      --
>                      To unsubscribe from this list: send the line
>         "unsubscribe
>                 linux-raid" in
>                      the body of a message to
>         majordomo@vger.kernel.org <mailto:majordomo@vger.kernel.org>
>                 <mailto:majordomo@vger.kernel.org
>         <mailto:majordomo@vger.kernel.org>>
>                      <mailto:majordomo@vger.kernel.
>         <mailto:majordomo@vger.kernel.>__org
>
>                 <mailto:majordomo@vger.kernel.org
>         <mailto:majordomo@vger.kernel.org>>>
>                      More majordomo info at
>         http://vger.kernel.org/____majordomo-info.html
>                 <http://vger.kernel.org/__majordomo-info.html>
>                      <http://vger.kernel.org/__majordomo-info.html
>                 <http://vger.kernel.org/majordomo-info.html>>
>
>
>
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Help, array corrupted after clean shutdown.
  2013-04-07 15:32           ` Oliver Schinagl
@ 2013-04-08  8:10             ` Durval Menezes
  2013-04-07 17:12               ` Oliver Schinagl
  0 siblings, 1 reply; 9+ messages in thread
From: Durval Menezes @ 2013-04-08  8:10 UTC (permalink / raw)
  To: Oliver Schinagl; +Cc: Linux RAID

Hi Oliver.

On Sun, Apr 7, 2013 at 12:32 PM, Oliver Schinagl
<oliver+list@schinagl.nl> wrote:
>
> On 06-04-13 20:59, Durval Menezes wrote:
>>
>> Hi Oliver,
>>
>>
>> On Sat, Apr 6, 2013 at 3:01 PM, Oliver Schinagl <oliver+list@schinagl.nl
>> <mailto:oliver+list@schinagl.nl>> wrote:
>>
>>     On 04/06/13 19:44, Durval Menezes wrote:
>>
>>         Hi Oliver,
>>
>>         Seems most of your problems are filesystem corruption (the
>>         extN family
>>         is well known for lack of robustness).
>>
>>         I would try to mount the filesystem read-only (without fsck)
>>         and copy
>>         off as much data as possible... Then fsck and try to copy the
>>         rest.
>>
>>         Good luck.
>>
>>     It fails to mount ;)
>>
>>     How can I ensure that the array is not corrupt however (while
>>     degraded)? At least that way, I can try my luck with ext4 tools.
>>
>>
>> If the array was not degraded, I would try an array check:
>>
>> |echo check > /sys/block/md0/md/sync_action|
>>
>> Then, if you had no (or very little) mismatches, I would consider it OK.
>> But as your array is in degraded mode, you have no redundancy to enable you
>> to check... :-/
>
> I guess the 'order' wouldn't have mattered. I would have expected some
> very basic check was available.
>
> Maybe for raid8 :p; Thinking along the lines, every block has an id, and
> each stripe has maching id's. If the id's no longer match, something is
> wrong. Would probably only waste space in the end.

And time ;-)

> Anyhow, I may have panicked a little to early. mount did indeed fail to
> mount, checking dmesg revealed a little more:
> [  117.665385] EXT4-fs (md102): mounted filesystem with writeback data
> mode. Opts: commit=120,data=writeback
> [  126.743000] EXT4-fs (md101): ext4_check_descriptors: Checksum for group
> 0 failed (42475!=15853)
> [  126.743003] EXT4-fs (md101): group descriptors corrupted!
>
> I asked on linux-ext4 what could be going wrong, fsck-ing -n does show
> (all?) group-descriptors not matching.

Ouch :-/

> Mounting ro however works

Glad to hear it. When you said that "it fails to mount", I thought you
had tried mounting read-only as I suggested.

> and all data appears to be correct from a quick
> investigation (my virtual machines start normally, so if that is ok, the
> rest must be too.

So probably only ext4 allocation metadata (which I think is what the
group descriptors are) got corrupted... probably your data survived
OK.

> I am now in the progress of copying, and rsycn -car the
> data to a temporary spot.

After your data is copied, try validating it with whatever tools
available, for example: for compressed files, try checking them (ex:
"tar tvzf" for tar.gz files); if it's your root partition, try
checking your distribution packages (rpm -Va on RPM distros, for
example), etc. If it shows any corrupted data, it might point you
towards things that need restoring, and if it shows nothing wrong, it
will give you confidence that the rest of your (uncheckable) data is
possibly good too.

> Thanks for all the help though, I probably would
> have kept trying to fix the array first.

No prob, and good luck with the rest of your recovery!


> I'm still wondering why my entire (and only the) partition table was gone.

One theory: as your shutdown was clean, then ext4 allocation metadata
has probably been badly mangled in memory before the shutdown, so some
of your data was possibly written over the start of the disk,
clobbering the GPT.

Off (Linux md RAID) topic: If I were in your place, I would start
worrying how the in-memory metadata was SILENTLY mangled in the first
place... do you use ECC memory, for example? Also, I would consider
(now that you will have to mkfs the mangled partition to restore your
data anyway) using a filesystem that has multiple metadata copies and
also the means for not only finding out about silent corruptions but
also for fixing them, to say nothing of a built-in RAID with no
write-hole and that gives your data the same silent-corruption
detection-and-fixing feature: http://zfsonlinux.org/

Cheers,
-- 
   Durval.


Cheers,
-- 
   Durval.

>>
>> Cheers,
>> --
>>    Durval.
>>
>>
>>
>>
>>         --
>>             Durval.
>>
>>         On Apr 6, 2013 12:13 PM, "Oliver Schinagl"
>>         <oliver+list@schinagl.nl <mailto:oliver%2Blist@schinagl.nl>
>>         <mailto:oliver%2Blist@schinagl.nl
>>
>>         <mailto:oliver%252Blist@schinagl.nl>>> wrote:
>>
>>             On 04/06/13 17:06, Durval Menezes wrote:
>>
>>                 Oliver,
>>
>>                 What file system? LVM or direct on the MD device?
>>
>>             Sorry, should have mentioned this.
>>
>>             I have 4 1.5 TB sata drives, connected to the onboard sata
>>         controller.
>>
>>             I have made 1 GPT partition ontop of each drive and then
>>         made a
>>             raid5 array ontop of those devices:
>>
>>             md101 : active (read-only) raid5 sdd1[0] sde1[4] sdf1[1]
>>                    4395413760 blocks super 1.2 level 5, 256k chunk,
>>         algorithm 2
>>             [4/3] [UU_U]
>>
>>             I then formatted /dev/md101 with ext4.
>>
>>             Tune2fs still happily runs on /dev/md101, but of course
>>         that doesn't
>>             mean anything.
>>
>>             riley tmp # tune2fs -l /dev/md101
>>             tune2fs 1.42 (29-Nov-2011)
>>             Filesystem volume name:   data01
>>             Last mounted on:          /tank/01
>>             Filesystem UUID:  9c812d61-96ce-4b71-9763-__b77e8b9618d1
>>
>>             Filesystem magic number:  0xEF53
>>             Filesystem revision #:    1 (dynamic)
>>             Filesystem features:      has_journal ext_attr resize_inode
>>             dir_index filetype extent flex_bg sparse_super large_file
>>         huge_file
>>             uninit_bg dir_nlink extra_isize
>>             Filesystem flags:         signed_directory_hash
>>             Default mount options:    (none)
>>             Filesystem state:         not clean
>>             Errors behavior:          Continue
>>             Filesystem OS type:       Linux
>>             Inode count:              274718720
>>             Block count:              1098853440
>>             Reserved block count:     0
>>             Free blocks:              228693396
>>             Free inodes:              274387775
>>             First block:              0
>>             Block size:               4096
>>             Fragment size:            4096
>>             Reserved GDT blocks:      762
>>             Blocks per group:         32768
>>             Fragments per group:      32768
>>             Inodes per group:         8192
>>             Inode blocks per group:   512
>>             RAID stride:              64
>>             RAID stripe width:        192
>>             Flex block group size:    16
>>             Filesystem created:       Wed Apr 28 16:42:58 2010
>>             Last mount time:          Tue May  4 17:14:48 2010
>>             Last write time:          Sat Apr  6 11:45:57 2013
>>             Mount count:              10
>>             Maximum mount count:      32
>>             Last checked:             Wed Apr 28 16:42:58 2010
>>             Check interval:           15552000 (6 months)
>>             Next check after:         Mon Oct 25 16:42:58 2010
>>             Lifetime writes:          3591 GB
>>             Reserved blocks uid:      0 (user root)
>>             Reserved blocks gid:      0 (group root)
>>             First inode:              11
>>             Inode size:               256
>>             Required extra isize:     28
>>             Desired extra isize:      28
>>             Journal inode:            8
>>             First orphan inode:       17
>>             Default directory hash:   half_md4
>>             Directory Hash Seed:  f1248a94-5a6a-4e4a-af8a-__68b019d13ef6
>>
>>             Journal backup:           inode blocks
>>
>>
>>
>>                 --
>>                      Durval.
>>
>>                 On Apr 6, 2013 8:23 AM, "Oliver Schinagl"
>>                 <oliver+list@schinagl.nl
>>         <mailto:oliver%2Blist@schinagl.nl>
>>         <mailto:oliver%2Blist@schinagl.nl
>>         <mailto:oliver%252Blist@schinagl.nl>>
>>                 <mailto:oliver%2Blist@
>>         <mailto:oliver%252Blist@>__schinagl.nl <http://schinagl.nl>
>>
>>                 <mailto:oliver%252Blist@schinagl.nl
>>
>>         <mailto:oliver%25252Blist@schinagl.nl>>>> wrote:
>>
>>                      Hi,
>>
>>                      I've had a powerfailure today, to which my UPS
>>         responded
>>                 nicely and
>>                      made my server shutdown normally. One would expect
>>                 everything is
>>                      well, right? The array, as far as I know, was
>>         operating without
>>                      problems before the shutdown, all 4 devices where
>>         normally
>>                 online.
>>                      mdadm sends me an e-mail if something is wrong,
>>         so does
>>                 smartctl.
>>
>>                      First thing I noticed that I had 2 (S) drives for
>>                 /dev/md101. I thus
>>                      started examining things. First I thought that it
>>         was some
>>                 mdadm
>>                      weirdness, where it failed to assemble the drive
>>         with all
>>                 components.
>>                      mdadm -A /dev/md101 /dev/sd[cdef]1 failed and
>>         gave the same
>>                 result.
>>                      Something was really wrong.
>>
>>                      I checked and compared the output of mdadm
>>         --examine on all
>>                 drives
>>                      (like -Evvvs below) and found that /dev/sdc1's
>>         events count
>>                 was wrong.
>>                      /dev/sdf1 and /dev/sdd1 matched (and later sde1
>>         too, but
>>                 more on
>>                      that in a sec). So sdc1 may have been dropped
>>         from the
>>                 array without
>>                      me knowing it, unlikely put possible. The odd
>>         thing is the huge
>>                      difference in event counts, but all four are
>>         marked as ACTIVE.
>>
>>                      So then onto sde1; why was it failing on that.
>>         The gpt
>>                 table was
>>                      completly gone. 00000. Gone. I used hexdump to
>>         examine the
>>                 drive
>>                      further, and at 0x00041000 there was the mdraid
>>         table, as
>>                 one would
>>                      expect. Good, so it looks like only the gpt has
>>         been wiped
>>                 for some
>>                      misterious reason. Re-creating the gpt quickly
>>         revealed mdadm's
>>                      information was still correct (as can be seen
>> below).
>>
>>                      So ignore sdc1 and assemble the drive as is
>>         should be fine?
>>                 Right? No.
>>                      mdadm -A /dev/md101 /dev/sd[def]1 worked without
>>         error.
>>
>>                      I always do a fsck before and after a reboot
>>         (unless of
>>                 course I
>>                      can't do the shutdown fsck) and verify
>>         /proc/mdadm after a
>>                 boot. So
>>                      before mounting, as always, I tried to run fsck
>>         /dev/md101
>>                 -C -; but
>>                      that came up with tons of errors. I didn't fix
>>         anything and
>>                 aborted.
>>
>>                      And here we are now. I can't just copy the entire
>>         disk
>>                 (1.5TB per
>>                      disk) and 'experiment', I don't have 4 spare
>>         disks. The
>>                 first thing
>>                      I would want to try is is mdadm -A /dev/sd[cdf]1
>>         --force
>>                 (leave out
>>                      the possibly corrupted sde1) and see what that does.
>>
>>
>>                      All that said when I did the assemble with the
>>         'guessed' 3
>>                 correct
>>                      drives. Did of course increase the events count.
>>         sdc1 of course
>>                      didn't partake in this. Assuming that it is in
>>         sync with
>>                 the rest,
>>                      what is the worst that can happen? And does the
>>         --read-only
>>                 flag
>>                      protect against it?
>>
>>
>>                      Linux riley 3.7.4-gentoo #2 SMP Tue Feb 5
>>         16:20:59 CET 2013
>>                 x86_64
>>                      AMD Phenom(tm) II X4 905e Processor AuthenticAMD
>>         GNU/Linux
>>
>>                      riley tmp # mdadm --version
>>                      mdadm - v3.1.4 - 31st August 2010
>>
>>
>>                      riley tmp # mdadm -Evvvvs
>>                      /dev/sdf1:
>>                                 Magic : a92b4efc
>>                               Version : 1.2
>>                           Feature Map : 0x0
>>                            Array UUID :
>>         2becc012:2d317133:2447784c:____1aab300d
>>
>>                                  Name : riley:data01  (local to host
>>         riley)
>>                         Creation Time : Tue Apr 27 18:03:37 2010
>>                            Raid Level : raid5
>>                          Raid Devices : 4
>>
>>                        Avail Dev Size : 2930276351 (1397.26 GiB
>>         1500.30 GB)
>>                            Array Size : 8790827520 (4191.79 GiB
>>         4500.90 GB)
>>                         Used Dev Size : 2930275840 (1397.26 GiB
>>         1500.30 GB)
>>                           Data Offset : 272 sectors
>>                          Super Offset : 8 sectors
>>                                 State : clean
>>                           Device UUID :
>>         97877935:04c16c5f:0746cb98:____63bffb4c
>>
>>
>>                           Update Time : Sat Apr  6 11:46:03 2013
>>                              Checksum : b585717a - correct
>>                                Events : 512993
>>
>>                                Layout : left-symmetric
>>                            Chunk Size : 256K
>>
>>                          Device Role : Active device 1
>>                          Array State : AA.A ('A' == active, '.' ==
>>         missing)
>>                      mdadm: No md superblock detected on /dev/sdf.
>>                      /dev/sde1:
>>                                 Magic : a92b4efc
>>                               Version : 1.2
>>                           Feature Map : 0x0
>>                            Array UUID :
>>         2becc012:2d317133:2447784c:____1aab300d
>>
>>                                  Name : riley:data01  (local to host
>>         riley)
>>                         Creation Time : Tue Apr 27 18:03:37 2010
>>                            Raid Level : raid5
>>                          Raid Devices : 4
>>
>>                        Avail Dev Size : 2930275847 (1397.26 GiB
>>         1500.30 GB)
>>                            Array Size : 8790827520 (4191.79 GiB
>>         4500.90 GB)
>>                         Used Dev Size : 2930275840 (1397.26 GiB
>>         1500.30 GB)
>>                           Data Offset : 776 sectors
>>                          Super Offset : 8 sectors
>>                                 State : clean
>>                           Device UUID :
>>         3f48d5a8:e3ee47a1:23c8b895:____addd3dd0
>>
>>
>>                           Update Time : Sat Apr  6 11:46:03 2013
>>                              Checksum : eaec006b - correct
>>                                Events : 512993
>>
>>                                Layout : left-symmetric
>>                            Chunk Size : 256K
>>
>>                          Device Role : Active device 3
>>                          Array State : AA.A ('A' == active, '.' ==
>>         missing)
>>                      mdadm: No md superblock detected on /dev/sde.
>>                      /dev/sdd1:
>>                                 Magic : a92b4efc
>>                               Version : 1.2
>>                           Feature Map : 0x0
>>                            Array UUID :
>>         2becc012:2d317133:2447784c:____1aab300d
>>
>>                                  Name : riley:data01  (local to host
>>         riley)
>>                         Creation Time : Tue Apr 27 18:03:37 2010
>>                            Raid Level : raid5
>>                          Raid Devices : 4
>>
>>                        Avail Dev Size : 2930276351 (1397.26 GiB
>>         1500.30 GB)
>>                            Array Size : 8790827520 (4191.79 GiB
>>         4500.90 GB)
>>                         Used Dev Size : 2930275840 (1397.26 GiB
>>         1500.30 GB)
>>                           Data Offset : 272 sectors
>>                          Super Offset : 8 sectors
>>                                 State : clean
>>                           Device UUID :
>>         236f6c48:2a1bcf6b:a7d7d861:____53950637
>>
>>
>>                           Update Time : Sat Apr  6 11:46:03 2013
>>                              Checksum : 87f31abb - correct
>>                                Events : 512993
>>
>>                                Layout : left-symmetric
>>                            Chunk Size : 256K
>>
>>                          Device Role : Active device 0
>>                          Array State : AA.A ('A' == active, '.' ==
>>         missing)
>>                      mdadm: No md superblock detected on /dev/sdd.
>>                      /dev/sdc1:
>>                                 Magic : a92b4efc
>>                               Version : 1.2
>>                           Feature Map : 0x0
>>                            Array UUID :
>>         2becc012:2d317133:2447784c:____1aab300d
>>
>>                                  Name : riley:data01  (local to host
>>         riley)
>>                         Creation Time : Tue Apr 27 18:03:37 2010
>>                            Raid Level : raid5
>>                          Raid Devices : 4
>>
>>                        Avail Dev Size : 2930276351 (1397.26 GiB
>>         1500.30 GB)
>>                            Array Size : 8790827520 (4191.79 GiB
>>         4500.90 GB)
>>                         Used Dev Size : 2930275840 (1397.26 GiB
>>         1500.30 GB)
>>                           Data Offset : 272 sectors
>>                          Super Offset : 8 sectors
>>                                 State : active
>>                           Device UUID :
>>         3ce8e262:ad864aee:9055af9b:____6cbfd47f
>>
>>
>>                           Update Time : Sat Mar 16 20:20:47 2013
>>                              Checksum : a7686a57 - correct
>>                                Events : 180132
>>
>>                                Layout : left-symmetric
>>                            Chunk Size : 256K
>>
>>                          Device Role : Active device 2
>>                          Array State : AAAA ('A' == active, '.' ==
>>         missing)
>>                      mdadm: No md superblock detected on /dev/sdc.
>>
>>
>>                      Before I assembled the array for the first time
>>         (mdadm -A
>>                 /dev/md101
>>                      /dev/sdd1 /dev/sde1 /dev/sdf1), this is how it
>>         looked like:
>>                      So identical to the above, wtih the exception of
>>         the number
>>                 of events.
>>
>>                      riley tmp # mdadm --examine /dev/sde1
>>                      /dev/sde1:
>>                                 Magic : a92b4efc
>>                               Version : 1.2
>>                           Feature Map : 0x0
>>                            Array UUID :
>>         2becc012:2d317133:2447784c:____1aab300d
>>
>>                                  Name : riley:data01  (local to host
>>         riley)
>>                         Creation Time : Tue Apr 27 18:03:37 2010
>>                            Raid Level : raid5
>>                          Raid Devices : 4
>>
>>                        Avail Dev Size : 2930275847 (1397.26 GiB
>>         1500.30 GB)
>>                            Array Size : 8790827520 (4191.79 GiB
>>         4500.90 GB)
>>                         Used Dev Size : 2930275840 (1397.26 GiB
>>         1500.30 GB)
>>                           Data Offset : 776 sectors
>>                          Super Offset : 8 sectors
>>                                 State : clean
>>                           Device UUID :
>>         3f48d5a8:e3ee47a1:23c8b895:____addd3dd0
>>
>>
>>                           Update Time : Sat Apr  6 09:44:30 2013
>>                              Checksum : eaebe3ea - correct
>>                                Events : 512989
>>
>>                                Layout : left-symmetric
>>                            Chunk Size : 256K
>>
>>                          Device Role : Active device 3
>>                          Array State : AA.A ('A' == active, '.' ==
>>         missing)
>>
>>                      --
>>                      To unsubscribe from this list: send the line
>>         "unsubscribe
>>                 linux-raid" in
>>                      the body of a message to
>>         majordomo@vger.kernel.org <mailto:majordomo@vger.kernel.org>
>>                 <mailto:majordomo@vger.kernel.org
>>         <mailto:majordomo@vger.kernel.org>>
>>                      <mailto:majordomo@vger.kernel.
>>         <mailto:majordomo@vger.kernel.>__org
>>
>>                 <mailto:majordomo@vger.kernel.org
>>
>>         <mailto:majordomo@vger.kernel.org>>>
>>                      More majordomo info at
>>         http://vger.kernel.org/____majordomo-info.html
>>                 <http://vger.kernel.org/__majordomo-info.html>
>>                      <http://vger.kernel.org/__majordomo-info.html
>>                 <http://vger.kernel.org/majordomo-info.html>>
>>
>>
>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Help, array corrupted after clean shutdown.
  2013-04-08  8:10             ` Durval Menezes
@ 2013-04-07 17:12               ` Oliver Schinagl
  0 siblings, 0 replies; 9+ messages in thread
From: Oliver Schinagl @ 2013-04-07 17:12 UTC (permalink / raw)
  To: Durval Menezes; +Cc: Linux RAID

On 08-04-13 10:10, Durval Menezes wrote:
> Hi Oliver.
>
> On Sun, Apr 7, 2013 at 12:32 PM, Oliver Schinagl
> <oliver+list@schinagl.nl> wrote:
>> On 06-04-13 20:59, Durval Menezes wrote:
>>> Hi Oliver,
>>>
>>>
>>> On Sat, Apr 6, 2013 at 3:01 PM, Oliver Schinagl <oliver+list@schinagl.nl
>>> <mailto:oliver+list@schinagl.nl>> wrote:
>>>
>>>      On 04/06/13 19:44, Durval Menezes wrote:
>>>
>>>          Hi Oliver,
>>>
>>>          Seems most of your problems are filesystem corruption (the
>>>          extN family
>>>          is well known for lack of robustness).
>>>
>>>          I would try to mount the filesystem read-only (without fsck)
>>>          and copy
>>>          off as much data as possible... Then fsck and try to copy the
>>>          rest.
>>>
>>>          Good luck.
>>>
>>>      It fails to mount ;)
>>>
>>>      How can I ensure that the array is not corrupt however (while
>>>      degraded)? At least that way, I can try my luck with ext4 tools.
>>>
>>>
>>> If the array was not degraded, I would try an array check:
>>>
>>> |echo check > /sys/block/md0/md/sync_action|
>>>
>>> Then, if you had no (or very little) mismatches, I would consider it OK.
>>> But as your array is in degraded mode, you have no redundancy to enable you
>>> to check... :-/
>> I guess the 'order' wouldn't have mattered. I would have expected some
>> very basic check was available.
>>
>> Maybe for raid8 :p; Thinking along the lines, every block has an id, and
>> each stripe has maching id's. If the id's no longer match, something is
>> wrong. Would probably only waste space in the end.
> And time ;-)
>
>> Anyhow, I may have panicked a little to early. mount did indeed fail to
>> mount, checking dmesg revealed a little more:
>> [  117.665385] EXT4-fs (md102): mounted filesystem with writeback data
>> mode. Opts: commit=120,data=writeback
>> [  126.743000] EXT4-fs (md101): ext4_check_descriptors: Checksum for group
>> 0 failed (42475!=15853)
>> [  126.743003] EXT4-fs (md101): group descriptors corrupted!
>>
>> I asked on linux-ext4 what could be going wrong, fsck-ing -n does show
>> (all?) group-descriptors not matching.
> Ouch :-/
>
>> Mounting ro however works
> Glad to hear it. When you said that "it fails to mount", I thought you
> had tried mounting read-only as I suggested.
mount complained, like when you use an invalid filesystem. The error 
could have been more descriptive. I tried mounting RO after you 
mentioned it (and marking the array as read-only).
>
>> and all data appears to be correct from a quick
>> investigation (my virtual machines start normally, so if that is ok, the
>> rest must be too.
> So probably only ext4 allocation metadata (which I think is what the
> group descriptors are) got corrupted... probably your data survived
> OK.
Looks like, the disk reports an unhealthy amount of freespace. But every 
single group descriptor got corrupted. Starting from 0, 1 .. 32k (and 
then I ctrl-c-ed). It's odd to get corrupted in that way. Well the 
checksum didn't match. I'd rather think either the on-disk format 
changed since 2010 somewhat, or usertools work differently.

Side story mode, I have an android tablet with ext4 filesystem for 
/data. The tablet runs a 3.0 kernel. A few weeks ago, the tablet refused 
to boot. I booted from SD card into a stock GNU/Linux 3.4 enviroment and 
ran fdisk. Same thing, all group descriptors where corrupt (didn't 
match). fsck ran for 10 minutes and its still working fine.
>
>> I am now in the progress of copying, and rsycn -car the
>> data to a temporary spot.
> After your data is copied, try validating it with whatever tools
> available, for example: for compressed files, try checking them (ex:
> "tar tvzf" for tar.gz files); if it's your root partition, try
> checking your distribution packages (rpm -Va on RPM distros, for
> example), etc. If it shows any corrupted data, it might point you
> towards things that need restoring, and if it shows nothing wrong, it
> will give you confidence that the rest of your (uncheckable) data is
> possibly good too.
It does look that the data survived just fine. It is a pure data disk, 
but did contain some virtual machines. kvm runs them all fine at the moment.

While I could just fsck the fs and get it all good again, I have now all 
data from the device. I will use that to increase the chunksize from 256 
to 512k, and remake the fs with those new parameters. I'm sure fsck will 
most likly fix it and nothing will be wrong. I'm simply not willing to 
take the risk now that the disks are empty anyway.
>
>> Thanks for all the help though, I probably would
>> have kept trying to fix the array first.
> No prob, and good luck with the rest of your recovery!
Thank you ;)
>
>
>> I'm still wondering why my entire (and only the) partition table was gone.
> One theory: as your shutdown was clean, then ext4 allocation metadata
> has probably been badly mangled in memory before the shutdown, so some
> of your data was possibly written over the start of the disk,
> clobbering the GPT.
>
> Off (Linux md RAID) topic: If I were in your place, I would start
> worrying how the in-memory metadata was SILENTLY mangled in the first
> place... do you use ECC memory, for example? Also, I would consider
> (now that you will have to mkfs the mangled partition to restore your
> data anyway) using a filesystem that has multiple metadata copies and
> also the means for not only finding out about silent corruptions but
> also for fixing them, to say nothing of a built-in RAID with no
> write-hole and that gives your data the same silent-corruption
> detection-and-fixing feature: http://zfsonlinux.org/
>
> Cheers,


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Help, array corrupted after clean shutdown.
@ 2013-04-06 18:34 Oliver Schinagl
  0 siblings, 0 replies; 9+ messages in thread
From: Oliver Schinagl @ 2013-04-06 18:34 UTC (permalink / raw)
  To: linux-raid

Hi,

the following message was initially sent to linux-raid, but there they 
said it was very most likly an ext4 corruption. I left it in tact as it 
explains what happened to as much detail as I could provide. I marked 
the start of that message with =======

Some additional information, I have put the ext4 filesystem on top of a 
md raid5 array of 4 disks and tune2fs reports the following:

riley tmp # tune2fs -l /dev/md101
tune2fs 1.42 (29-Nov-2011)
Filesystem volume name:   data01
Last mounted on:          /tank/01
Filesystem UUID:          9c812d61-96ce-4b71-9763-b77e8b9618d1
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index 
filetype extent flex_bg sparse_super large_file huge_file uninit_bg 
dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         not clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              274718720
Block count:              1098853440
Reserved block count:     0
Free blocks:              228693396
Free inodes:              274387775
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      762
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
RAID stride:              64
RAID stripe width:        192
Flex block group size:    16
Filesystem created:       Wed Apr 28 16:42:58 2010
Last mount time:          Tue May  4 17:14:48 2010
Last write time:          Sat Apr  6 11:45:57 2013
Mount count:              10
Maximum mount count:      32
Last checked:             Wed Apr 28 16:42:58 2010
Check interval:           15552000 (6 months)
Next check after:         Mon Oct 25 16:42:58 2010
Lifetime writes:          3591 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:	          256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
First orphan inode:       17
Default directory hash:   half_md4
Directory Hash Seed:      f1248a94-5a6a-4e4a-af8a-68b019d13ef6
Journal backup:           inode blocks

The strange thing is, 'last checked' and is set to somewhere in 2010. 
While an automatic check never has run, I always check with fsck -f 
/dev/md101 -C - before each mount and before/after each reboot 
religiously. The last write time is most likly when I tried to run fsck 
and it came with tons of errors, well only 1 error repeating after each 
descriptor:

riley tmp # fsck -n /dev/md101 -C -
fsck from util-linux 2.21.2
e2fsck 1.42 (29-Nov-2011)
One or more block group descriptor checksums are invalid.  Fix? no

Group descriptor 0 checksum is invalid.  IGNORED.
Group descriptor 1 checksum is invalid.  IGNORED.
Group descriptor 2 checksum is invalid.  IGNORED.

It hasn't automatically tried the backup blocks yet, but I am very very 
hesitant to let it get fixed, in case I loose everything.

I can mount it in 'ro' only mode, but have no clue on the validity  of 
the data. (I tried mounting normally earlier today, but got errors about 
not being an ext4 fs).

Looking at dmesg after the successful mount I get:
[38006.011956] EXT4-fs (md101): ext4_check_descriptors: Checksum for 
group 33532 failed (23179!=34446)
[38006.011958] EXT4-fs (md101): ext4_check_descriptors: Checksum for 
group 33533 failed (64080!=9813)
[38006.011960] EXT4-fs (md101): ext4_check_descriptors: Checksum for 
group 33534 failed (44694!=46442)
[38006.026547] EXT4-fs (md101): write access unavailable, skipping 
orphan cleanup
[38006.026548] EXT4-fs (md101): recovery complete
[38006.026551] EXT4-fs (md101): mounted filesystem with writeback data 
mode. Opts: commit=120,data=writeback

(I can't scroll up to group 0, 1, 2 to mach the above bit, and I 
ctrl-c'ed fsck -n around there).

Should I be worried? I do not think I changed my kernel in the last two 
weeks (def. not a new version) but it is possible I recompiled and 
installed it without reboot, to enable certain features.

I will try meanwhile to copy 4 TB of data elsewhere (lets pray I have 
enough space) and if so re-create my ext4 FS?

Thanks, and sorry about the confusing with the new/original message bit. 
Corrupted FS (that of course are not properly backed up) have had me on 
edge all day :(

Oliver

=======
original message:

I've had a powerfailure today, to which my UPS responded nicely and made 
my server shutdown normally. One would expect everything is well, right? 
The array, as far as I know, was operating without problems before the 
shutdown, all 4 devices where normally online. mdadm sends me an e-mail 
if something is wrong, so does smartctl.

First thing I noticed that I had 2 (S) drives for /dev/md101. I thus 
started examining things. First I thought that it was some mdadm 
weirdness, where it failed to assemble the drive with all components.
mdadm -A /dev/md101 /dev/sd[cdef]1 failed and gave the same result. 
Something was really wrong.

I checked and compared the output of mdadm --examine on all drives (like 
-Evvvs below) and found that /dev/sdc1's events count was wrong.
/dev/sdf1 and /dev/sdd1 matched (and later sde1 too, but more on that in 
a sec). So sdc1 may have been dropped from the array without me knowing 
it, unlikely put possible. The odd thing is the huge difference in event 
counts, but all four are marked as ACTIVE.

So then onto sde1; why was it failing on that. The gpt table was 
completly gone. 00000. Gone. I used hexdump to examine the drive 
further, and at 0x00041000 there was the mdraid table, as one would 
expect. Good, so it looks like only the gpt has been wiped for some 
misterious reason. Re-creating the gpt quickly revealed mdadm's 
information was still correct (as can be seen below).

So ignore sdc1 and assemble the drive as is should be fine? Right? No.
mdadm -A /dev/md101 /dev/sd[def]1 worked without error.

I always do a fsck before and after a reboot (unless of course I can't 
do the shutdown fsck) and verify /proc/mdadm after a boot. So before 
mounting, as always, I tried to run fsck /dev/md101 -C -; but that came 
up with tons of errors. I didn't fix anything and aborted.

And here we are now. I can't just copy the entire disk (1.5TB per disk) 
and 'experiment', I don't have 4 spare disks. The first thing I would 
want to try is is mdadm -A /dev/sd[cdf]1 --force (leave out the possibly 
corrupted sde1) and see what that does.

All that said when I did the assemble with the 'guessed' 3 correct 
drives. Did of course increase the events count. sdc1 of course didn't 
partake in this. Assuming that it is in sync with the rest, what is the 
worst that can happen? And does the --read-only flag protect against it?

Linux riley 3.7.4-gentoo #2 SMP Tue Feb 5 16:20:59 CET 2013 x86_64 AMD 
Phenom(tm) II X4 905e Processor AuthenticAMD GNU/Linux

riley tmp # mdadm --version
mdadm - v3.1.4 - 31st August 2010

riley tmp # mdadm -Evvvvs
/dev/sdf1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 97877935:04c16c5f:0746cb98:63bffb4c

     Update Time : Sat Apr  6 11:46:03 2013
        Checksum : b585717a - correct
          Events : 512993

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 1
    Array State : AA.A ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdf.
/dev/sde1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 776 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 3f48d5a8:e3ee47a1:23c8b895:addd3dd0

     Update Time : Sat Apr  6 11:46:03 2013
        Checksum : eaec006b - correct
          Events : 512993

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 3
    Array State : AA.A ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sde.
/dev/sdd1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 236f6c48:2a1bcf6b:a7d7d861:53950637

     Update Time : Sat Apr  6 11:46:03 2013
        Checksum : 87f31abb - correct
          Events : 512993

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 0
    Array State : AA.A ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdd.
/dev/sdc1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930276351 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : active
     Device UUID : 3ce8e262:ad864aee:9055af9b:6cbfd47f

     Update Time : Sat Mar 16 20:20:47 2013
        Checksum : a7686a57 - correct
          Events : 180132

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 2
    Array State : AAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdc.

Before I assembled the array for the first time (mdadm -A /dev/md101 
/dev/sdd1 /dev/sde1 /dev/sdf1), this is how it looked like:
So identical to the above, wtih the exception of the number of events.

riley tmp # mdadm --examine /dev/sde1
/dev/sde1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 2becc012:2d317133:2447784c:1aab300d
            Name : riley:data01  (local to host riley)
   Creation Time : Tue Apr 27 18:03:37 2010
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 2930275847 (1397.26 GiB 1500.30 GB)
      Array Size : 8790827520 (4191.79 GiB 4500.90 GB)
   Used Dev Size : 2930275840 (1397.26 GiB 1500.30 GB)
     Data Offset : 776 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 3f48d5a8:e3ee47a1:23c8b895:addd3dd0

     Update Time : Sat Apr  6 09:44:30 2013
        Checksum : eaebe3ea - correct
          Events : 512989

          Layout : left-symmetric
      Chunk Size : 256K

    Device Role : Active device 3
    Array State : AA.A ('A' == active, '.' == missing)

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-04-08  8:10 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-06 11:24 Help, array corrupted after clean shutdown Oliver Schinagl
2013-04-06 11:58 ` Mikael Abrahamsson
2013-04-06 12:04   ` Oliver Schinagl
     [not found] ` <CACj=ugTsNd87z4Uq_KdZa_HJYFNTtxwZJ76bv0GNHUj8D66YTA@mail.gmail.com>
2013-04-06 15:14   ` Oliver Schinagl
     [not found]     ` <CACj=ugSH2YBrePTKy3e36H4fcHpKQ8ywxrJoLJwbqtbvOR+pEQ@mail.gmail.com>
2013-04-06 18:01       ` Oliver Schinagl
     [not found]         ` <CACj=ugQR6hjw0qchJiOtgyWd8VRGs_pkZCBXHbQwjrKFz4u=Xg@mail.gmail.com>
2013-04-07 15:32           ` Oliver Schinagl
2013-04-08  8:10             ` Durval Menezes
2013-04-07 17:12               ` Oliver Schinagl
  -- strict thread matches above, loose matches on Subject: below --
2013-04-06 18:34 Oliver Schinagl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).