linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Tomas Agartz <tlund@nxs.se>
Cc: linux-raid@vger.kernel.org
Subject: Re: How to assemble 4-disk raid5 with one broken disk and one marked as spare by operator error?
Date: Mon, 9 Dec 2013 14:46:45 +1100
Message-ID: <20131209144645.70e01149@notabene.brown>
In-Reply-To: <Pine.LNX.4.61.1312082056110.13787@envy.nxs.se>


On Sun, 8 Dec 2013 21:04:42 +0100 (CET) Tomas Agartz <tlund@nxs.se> wrote:

> After booting a server that had been powered off for some time, the 4-disk 
> raid5 device was up and running in read-only mode with one disk missing. 
> After a, in hindsight, hasty decision, "mdadm --manage --add /dev/md0 
> /dev/sdd" was executed to re-add the missing device to the array.
> 
> At this time, all hell broke loose :) The first thing that happened was 
> that sdd was added as a spare instead of re-added as expected. The second 
> thing was that a different disk, sdb, was kicked from the array because of 
> read/sata-bus errors. The root disk also bailed and the system had to be 
> powercycled.

If you want to re-add, it is safest to ask mdadm to --re-add, not to --add.
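
As a minimal sketch of that advice, the snippet below builds the --re-add form of the command quoted above and only prints it by default. The RUN guard and the variable names are illustrative assumptions, not part of mdadm; the device names are the ones from this thread.

```shell
# Device names taken from the thread; adjust for your own system.
md=/dev/md0
dev=/dev/sdd

# --re-add asks mdadm to put the device back in its old slot; per the
# advice above it is the safer request when a re-add is what you want.
cmd="mdadm --manage $md --re-add $dev"

# Hypothetical safety guard: only execute when RUN=1 is set explicitly.
if [ "${RUN:-0}" = "1" ]; then
    $cmd
else
    echo "would run: $cmd"
fi
```

Printing the command first gives a chance to double-check the device name before anything touches the array.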

> 
> The real problem, from the start, was probably that sdb was bad all along, 
> but for some reason sdd was the device missing from the array after the
> initial boot.
> 
> Trying to read data from sdb gives read errors and timeouts, but I was 
> able to do "mdadm --examine" after resetting the sata port.
> 
> The current state is that, out of 4 disks two are good (sde and sdf), one 
> is (in error) marked as a spare (sdd), and the fourth device is unusable 
> (sdb).
> 
> What is the correct method to change the spare disk back to a data disk
> and try to restart the array with 3 out of 4 devices (sdd, sde and sdf)?
> 

The only real option at this point is to --create the array.  There isn't
enough information for mdadm to be able to do anything clever.

> The device has never had a spare, so I think that sdd used to be "Active 
> device 0" before this happened?
> 
> Possibly relevant data from mdadm --examine on the four devices:
> 
> sdb          State : clean
> sdb         Events : 333560
> sdb   Device Role : Active device 3
> sdb   Array State : .AAA ('A' == active, '.' == missing)
> 
> sdd          State : clean
> sdd         Events : 333562
> sdd   Device Role : spare
> sdd   Array State : .AA. ('A' == active, '.' == missing)
> 
> sde          State : clean
> sde         Events : 333562
> sde   Device Role : Active device 1
> sde   Array State : .AA. ('A' == active, '.' == missing)
> 
> sdf          State : clean
> sdf         Events : 333562
> sdf   Device Role : Active device 2
> sdf   Array State : .AA. ('A' == active, '.' == missing)
> 
> If no one else has any better suggestions, my best guess would be to: 
> "mdadm --create /dev/md0 --level=5 --raid-devices=4 --assume-clean 
> /dev/sdd /dev/sde /dev/sdf missing" (the device was created with default 
> values, metadata 1.2, chunk size 512K, layout left-symmetric).

Check the "Data Offset" of the devices and make sure the newly created array
gets the same "Data Offset" (it can explicitly be set with the latest mdadm).
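
A rough sketch of that check, assuming the usual "Data Offset : N sectors" line in "mdadm --examine" output and an mdadm recent enough to accept --data-offset with --create. The 2048-sector value is a made-up placeholder, since the real offset was not posted in this thread, and the helper name is hypothetical. The command is printed, not executed.

```shell
# Stand-in for `mdadm --examine /dev/sde` output.
# ASSUMPTION: 2048 sectors is a placeholder value, not data from this thread.
examine_sample='    Data Offset : 2048 sectors
   Super Offset : 8 sectors'

# Hypothetical helper: print the Data Offset in 512-byte sectors.
data_offset_sectors() {
    awk -F: '/Data Offset/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

sectors=$(printf '%s\n' "$examine_sample" | data_offset_sectors)

# Two 512-byte sectors make one KiB; pass the value with an explicit K suffix.
offset_kib=$((sectors / 2))

# Same device order as proposed in the quoted mail (sdd sde sdf missing),
# with the defaults named there spelled out explicitly.
create_cmd="mdadm --create /dev/md0 --metadata=1.2 --level=5 --raid-devices=4 --chunk=512 --layout=left-symmetric --assume-clean --data-offset=${offset_kib}K /dev/sdd /dev/sde /dev/sdf missing"

echo "would run: $create_cmd"
```

On a real system, replace the sample text with the actual --examine output of each surviving device and confirm they all report the same offset before running anything.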

NeilBrown


> 
> (Other crazy ideas involve editing the superblock of sdd and making it 
> device 0 and then trying to start the array after that).
> 
> Best regards,
> Tomas

