From: Neil Brown <neilb@suse.de>
To: "Pierre Vignéras" <pierre@vigneras.name>
Cc: Leslie Rhorer <lrhorer@satx.rr.com>, linux-raid@vger.kernel.org
Subject: Re: mdadm: failed devices become spares!
Date: Tue, 18 May 2010 11:30:16 +1000
Message-ID: <20100518113016.1981a08c@notabene.brown>
In-Reply-To: <201005172010.36157.pierre@vigneras.name>

On Mon, 17 May 2010 20:10:36 +0200
Pierre Vignéras <pierre@vigneras.name> wrote:

> Did I miss something, or is there something really strange happening there?

Something strange...
I cannot explain the 'SpareActive' messages.
Most of the rest makes sense.

You had a RAID10 - 4 drives in near=2 mode.  So the first two disks contain
identical data, and the second two are also identical and contain the rest.
The second device failed due to a write error.
Why it seemed to become a spare I'm not sure.  I'm not at all sure it did
become a spare immediately; your logs aren't conclusive on that point.
It did eventually become a spare, but that could be because you "removed and
added the devices", which would have changed them from 'fail' to 'spares'.

Then the first device in the array reported an error and so was failed.
After this you would not be able to read or write to the even chunks of the
array; xfs noticed and complained.
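
To illustrate why losing the first two devices takes out exactly the even
chunks, here is a rough sketch of the near=2 placement for 4 devices.  It is
not md's actual code; the function name near2_slots and the 0-based slot
numbering are made up for this example.

```sh
#!/bin/sh
# Hypothetical helper (not part of mdadm): print the two member slots
# (0-based) that hold the copies of a given chunk in a 4-device RAID10
# array using the near=2 layout.
near2_slots() {
    chunk=$1
    ndev=4      # devices in the array (assumption: matches this thread)
    copies=2    # near=2
    first=$(( (chunk * copies) % ndev ))
    echo "$first $(( first + 1 ))"
}

# Even chunks land on slots 0 and 1, odd chunks on slots 2 and 3.
for c in 0 1 2 3; do
    echo "chunk $c -> slots $(near2_slots "$c")"
done
```

With both slot 0 and slot 1 failed, no copy of any even chunk remains, while
the odd chunks are still fully mirrored on slots 2 and 3.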

By this time sdf1 seemed to be a spare, so md gave recovery a try.  The
recovery process discovered there was nowhere to read good data from and
immediately gave up.

However, if the devices really are OK, then sdf1 and sdc1 should contain
identical data (except that the superblocks would be slightly different).
You could check this with "cmp -l", though that might not be very efficient.
Also sdd1 and sde1 should be identical.
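
One way to make that comparison cheaper is to stop at the first difference and
to leave the superblock out of it.  The sketch below assumes GNU cmp (for the
-n byte limit) and assumes the 0.90 superblock sits in the last 64K-aligned
64K block of each device; the helper names data_bytes and same_prefix are
invented for this example, and the device names need double checking as
always.

```sh
#!/bin/sh
# Hypothetical helpers, not part of mdadm.

# Given a device size in bytes, print how many bytes precede the 0.90
# superblock (assumption: it starts at the last 64K boundary that is at
# least 64K before the end of the device).
data_bytes() {
    echo $(( ($1 / 65536) * 65536 - 65536 ))
}

# Compare only the first $3 bytes of two files or devices.  cmp exits 0
# if they match and stops reading at the first differing byte.
same_prefix() {
    cmp -n "$3" "$1" "$2"
}

# Example use (as root, read-only; left commented out on purpose):
#   size=$(blockdev --getsize64 /dev/sdc1)
#   same_prefix /dev/sdc1 /dev/sdf1 "$(data_bytes "$size")" && echo identical
```

Unlike "cmp -l", this reports only the first mismatch, so a badly diverged
pair is detected quickly instead of producing a listing of every differing
byte.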

I suggest that you try:

 mdadm -S /dev/md2
 mdadm -C /dev/md2 -l 10 -n 4 -c 64 -e 0.90 /dev/sdc1 missing /dev/sdd1 missing  --assume-clean

and then see what the data on md2 looks like.
You could equally try sdf1 in place of sdc1, or sde1 in place of sdd1
(make sure you double check the device names, don't assume I got them right).

Once you have a combination that looks good, you can add the other two devices
and they will recover and you should have your data back.

BUT be warned.  Something caused some errors to be reported.  Unless you find
out what that was and fix it, errors will occur again.  I have no idea what
might have caused those errors.  Bad media? Bad controller? Bad USB
controller? Bad luck?

I wouldn't write new data, or even perform a recovery, until you are quite
confident of the devices.

NeilBrown