linux-raid.vger.kernel.org archive mirror
* mdadm raid5 dropped 2 disks
@ 2016-02-05 10:30 André Teichert
  2016-02-06  9:45 ` Wols Lists
  2016-02-06 10:54 ` André Teichert
  0 siblings, 2 replies; 5+ messages in thread
From: André Teichert @ 2016-02-05 10:30 UTC (permalink / raw)
  To: linux-raid

Hi,

I had a raid5 (mdadm v3.2.5) with 3 disks. Within an hour, 2 disks dropped 
out of the array. Both disks report SMART attribute 184 (End-to-End Error), 
but I can still read them.

First I made a full dd copy of each disk to an image file (image[123]) and 
wrote each image back to a large 4 TB disk with 3 partitions.
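
The copies were plain dd, one per disk, roughly like this (device and image 
names here are placeholders, not the exact ones; conv=noerror,sync keeps dd 
going past unreadable sectors and pads them so offsets stay aligned):

dd if=/dev/sdX of=/backup/image1.dd bs=1M conv=noerror,sync
dd if=/backup/image1.dd of=/dev/sdY1 bs=1M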

mdadm -E /dev/sda1
/dev/sda1:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 8bf0a3b8:a98e95fd:6a0884e6:fbe6ab09
            Name : server:0  (local to host server)
   Creation Time : Sun Nov 24 04:21:09 2013
      Raid Level : raid5
    Raid Devices : 3

  Avail Dev Size : 1953262961 (931.39 GiB 1000.07 GB)
      Array Size : 1953262592 (1862.78 GiB 2000.14 GB)
   Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
           State : active
     Device UUID : 6f793025:415d8c8b:e7d37bbb:19524380

     Update Time : Wed Feb  3 10:16:27 2016
        Checksum : 74a4a730 - correct
          Events : 311

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 2
    Array State : .AA ('A' == active, '.' == missing)


mdadm -E /dev/sda2
/dev/sda2:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 8bf0a3b8:a98e95fd:6a0884e6:fbe6ab09
            Name : server:0  (local to host server)
   Creation Time : Sun Nov 24 04:21:09 2013
      Raid Level : raid5
    Raid Devices : 3

  Avail Dev Size : 1953262961 (931.39 GiB 1000.07 GB)
      Array Size : 1953262592 (1862.78 GiB 2000.14 GB)
   Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : fc963d80:307b6345:c95b6d94:162c7c7c

     Update Time : Wed Feb  3 10:16:40 2016
        Checksum : 5eaf449a - correct
          Events : 314

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 1
    Array State : .A. ('A' == active, '.' == missing)


mdadm -E /dev/sda3
/dev/sda3:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 8bf0a3b8:a98e95fd:6a0884e6:fbe6ab09
            Name : server:0  (local to host server)
   Creation Time : Sun Nov 24 04:21:09 2013
      Raid Level : raid5
    Raid Devices : 3

  Avail Dev Size : 1953262961 (931.39 GiB 1000.07 GB)
      Array Size : 1953262592 (1862.78 GiB 2000.14 GB)
   Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
           State : active
     Device UUID : 73b1275f:8600a6b4:51234150:e035eef3

     Update Time : Wed Feb  3 09:37:09 2016
        Checksum : e024ac15 - correct
          Events : 217

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 0
    Array State : AAA ('A' == active, '.' == missing)



It looks like sda3 dropped first, and there is a big difference in the event 
counts, so I started the array with only sda1+sda2:
mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sda2
That seemed to work and assembled the raid with 2 of 3 disks, clean.
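
The state of the assembled array shows up in the usual places (these only 
read, they don't change anything):

cat /proc/mdstat
mdadm --detail /dev/md0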

The filesystem is ext4.
Runnung "fsck -y /dev/md0" with lots of errors.

mount -t ext4 /dev/md0 /mnt didnt recognize the filesystem
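
Maybe I should have started with a report-only check instead, something 
like this (-n makes e2fsck answer "no" to everything, so it writes nothing):

e2fsck -n /dev/md0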


Should I try --create --assume-clean sda1 sda2 missing?
I'm trying to stay calm and praying for help.
thx a lot


* Re: mdadm raid5 dropped 2 disks
  2016-02-05 10:30 mdadm raid5 dropped 2 disks André Teichert
@ 2016-02-06  9:45 ` Wols Lists
  2016-02-06 10:54 ` André Teichert
  1 sibling, 0 replies; 5+ messages in thread
From: Wols Lists @ 2016-02-06  9:45 UTC (permalink / raw)
  To: André Teichert, linux-raid

On 05/02/16 10:30, André Teichert wrote:
> Should I try --create --assume-clean sda1 sda2 missing?
> I'm trying to stay calm and praying for help.
> thx a lot

Find and download lsdrv by Phil Turmel. You'll need Python 2.7 to run it
iirc.
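
It's a single Python script; once you have it, running it is just this 
(download location is Phil's repo, github.com/pturmel/lsdrv iirc, so 
double-check that):

chmod +x lsdrv
sudo ./lsdrv > lsdrv-output.txt   # root so it can read all the drive details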

DO NOT EVEN CONSIDER running "mdadm --create" until you've got all the
data you can out of the drive headers (lsdrv is intended to get this data).

Post the data from lsdrv to the list, and then someone else will chime
in and help you further.

(If you get --create wrong, you've probably just permanently lost the
contents of your array ...)

Cheers,
Wol



* Re: mdadm raid5 dropped 2 disks
  2016-02-05 10:30 mdadm raid5 dropped 2 disks André Teichert
  2016-02-06  9:45 ` Wols Lists
@ 2016-02-06 10:54 ` André Teichert
  2016-02-06 11:46   ` Wols Lists
  1 sibling, 1 reply; 5+ messages in thread
From: André Teichert @ 2016-02-06 10:54 UTC (permalink / raw)
  To: linux-raid

Hi Wol,
the raid is safe now. I didn't work on the disks themselves, only with images.

dd if=/dev/sd[abc]1 of=/mnt/image[123].dd bs=1

losetup /dev/loop1 /mnt/image1.dd
losetup /dev/loop2 /mnt/image2.dd

mdadm --create /dev/md0 --level=5 --raid-devices=3 --assume-clean \
    --size=976631296 missing /dev/loop2 /dev/loop1

fsck ran without any errors. The data is back!
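
Copying everything off before touching the original disks again only needs 
a read-only mount (the target path below is just an example):

mount -o ro /dev/md0 /mnt
rsync -a /mnt/ /backup/recovered/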

What might I have done wrong in the first run?
1) I wrote the first backups of the disks to partitions that were a 
little bigger than the originals. These were fresh drives with no 
data written on them so far.
2) This time I did the create in the right order (missing, loop2, loop1), 
taken from the "Device Role" lines of mdadm -E.

Glad it worked. I will save lsdrv and hope never to use it, and create a 
new backup plan!

cheers


* Re: mdadm raid5 dropped 2 disks
  2016-02-06 10:54 ` André Teichert
@ 2016-02-06 11:46   ` Wols Lists
  2016-02-06 14:03     ` Phil Turmel
  0 siblings, 1 reply; 5+ messages in thread
From: Wols Lists @ 2016-02-06 11:46 UTC (permalink / raw)
  To: André Teichert, linux-raid

On 06/02/16 10:54, André Teichert wrote:
> Glad it worked. I will save lsdrv and hope never to use it, and create a
> new backup plan!

Please DO use lsdrv. EVERY time you create or change the config of an
array, run it and save the info somewhere safe.
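
Something as simple as this, kept somewhere that is not on the array itself, 
is enough (the file name and device names are only examples):

./lsdrv > /root/array-notes.txt
mdadm --detail /dev/md0 >> /root/array-notes.txt
mdadm --examine /dev/sd[abc]1 >> /root/array-notes.txt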

Neil has just stepped down as maintainer and Phil is now "one of the
team"; he wrote that utility specifically to help in recovering crashed
arrays. If anything goes wrong and you've got that data to hand, it'll
be much easier for them to help you.

Cheers,
Wol


* Re: mdadm raid5 dropped 2 disks
  2016-02-06 11:46   ` Wols Lists
@ 2016-02-06 14:03     ` Phil Turmel
  0 siblings, 0 replies; 5+ messages in thread
From: Phil Turmel @ 2016-02-06 14:03 UTC (permalink / raw)
  To: Wols Lists, André Teichert, linux-raid

Good morning André, Wol,

On 02/06/2016 06:46 AM, Wols Lists wrote:
> On 06/02/16 10:54, André Teichert wrote:
>> Glad it worked. I will save lsdrv and hope never to use it, and create a
>> new backup plan!
> 
> Please DO use lsdrv. EVERY time you create or change the config of an
> array, run it and save the info somewhere safe.

Yes, lsdrv is intended to document a running system so the information is 
available later, when it is not running.  Especially serial numbers, device 
names, UUIDs, and layer relationships.  It is still helpful to run it after 
a failure, but naturally some info can be missed by then.

> Neil has just stepped down as maintainer and Phil is now "one of the
> team";

That's a bit of a stretch.  I'm not maintaining any code and I'm
volunteering on the list as I have in the past.

> he wrote that utility specifically to help in recovering crashed
> arrays. If anything goes wrong and you've got that data to hand, it'll
> be much easier for them to help you.

Yes, array configuration documentation is critical to recovery.

Phil

