From: NeilBrown <neilb@suse.de>
To: "Sergiusz Brzeziński" <Sergiusz.Brzezinski@supersystem.pl>
Cc: linux-raid@vger.kernel.org
Subject: Re: mdadm -add doesn't start rebuilding array
Date: Tue, 21 Aug 2012 08:30:40 +1000
Message-ID: <20120821083040.0d466ceb@notabene.brown>
In-Reply-To: <5032050A.4000801@supersystem.pl>
On Mon, 20 Aug 2012 11:36:10 +0200 Sergiusz Brzeziński
<Sergiusz.Brzezinski@supersystem.pl> wrote:
> Hi,
>
> My system is Ubuntu 12.04, kernel 3.2.0-27. mdadm: 3.2.3
>
> If I do:
>
> # mdadm /dev/md0 -a /dev/sdc3
>
> then nothing happens! No message on the command line, no info in the logs!
>
> Man mdadm says:
>
> -a, --add
> hot-add listed devices. If a device appears to have recently been part of the
> array (possibly it failed or was removed) the device is re-added as described
> in the next point. If that fails or the device was never part of the
> array, the device is added as a hot-spare. If the array is degraded, it will
> immediately start to rebuild data onto that spare.
>
> But the array doesn't want to rebuild.
>
> Well, not exactly. The bad news is that sometimes it works and sometimes it doesn't.
>
> And if I restart the system, the array SOMETIMES starts rebuilding and sometimes
> doesn't!
>
> There is no information about what's going on (command line or logs), so I don't know what to do.
>
>
>
> I describe the whole procedure below:
>
> 1.
> There is a working RAID1 array:
>
> # cat /proc/mdstat
>
> Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
> md0 : active raid1 sda3[2] sdb3[3]
> 115999672 blocks super 1.0 [2/2] [UU]
> bitmap: 0/1 pages [0KB], 65536KB chunk
>
> # mdadm --detail /dev/md0
>
> /dev/md0:
> Version : 1.0
> Creation Time : Wed Aug 1 15:45:56 2012
> Raid Level : raid1
> Array Size : 115999672 (110.63 GiB 118.78 GB)
> Used Dev Size : 115999672 (110.63 GiB 118.78 GB)
> Raid Devices : 2
> Total Devices : 2
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Mon Aug 20 11:10:00 2012
> State : active
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> Name : linux:0
> UUID : b7407176:2e88c73d:2c85e940:05ff7e06
> Events : 5897
>
> Number Major Minor RaidDevice State
> 3 8 19 0 active sync /dev/sdb3
> 2 8 3 1 active sync /dev/sda3
>
>
> 2.
> I physically remove the "/dev/sdb" drive from the box (hot-swap)
>
> # cat /proc/mdstat
>
> Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
> md0 : active raid1 sda3[2] sdb3[3](F)
> 115999672 blocks super 1.0 [2/1] [_U]
> bitmap: 1/1 pages [4KB], 65536KB chunk
>
> # mdadm --detail /dev/md0
> /dev/md0:
> Version : 1.0
> Creation Time : Wed Aug 1 15:45:56 2012
> Raid Level : raid1
> Array Size : 115999672 (110.63 GiB 118.78 GB)
> Used Dev Size : 115999672 (110.63 GiB 118.78 GB)
> Raid Devices : 2
> Total Devices : 2
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Mon Aug 20 11:11:58 2012
> State : active, degraded
> Active Devices : 1
> Working Devices : 1
> Failed Devices : 1
> Spare Devices : 0
>
> Name : linux:0
> UUID : b7407176:2e88c73d:2c85e940:05ff7e06
> Events : 5908
>
> Number Major Minor RaidDevice State
> 0 0 0 0 removed
> 2 8 3 1 active sync /dev/sda3
>
> 3 8 19 - faulty spare
>
>
> 3.
> I insert the drive again - it becomes "/dev/sdc" instead of "/dev/sdb"
>
> 4.
> And now I do the following to rebuild the array:
>
> # mdadm /dev/md0 -a /dev/sdc3
>
> mdadm: /dev/sdc3 reports being an active member for /dev/md0, but a --re-add fails.
> mdadm: not performing --add as that would convert /dev/sdc3 in to a spare.
> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdc3" first.
>
> I have never seen these messages before (on other systems). But this is not a
> problem. I do what they suggest:
>
> # mdadm --zero-superblock /dev/sdc3
>
> # mdadm --examine /dev/sdc3
> mdadm: No md superblock detected on /dev/sdc3.
>
> 5.
> And now I try again to rebuild the array:
>
> # mdadm /dev/md0 -a /dev/sdc3
>
> ... and nothing happens! No message, no info. I repeat the command several times.
>
> I also try this:
>
> # mdadm /dev/md0 -a -vv --force /dev/sdc3
>
> ... also nothing. Sometimes, if I repeat the command several times in a row, the
> following line appears in the logs:
>
> Aug 20 11:30:52 serwer-linmot kernel: [ 1978.111282] md: export_rdev(sdc3)
>
> ... and nothing else
>
>
> # mdadm --examine /dev/sdc3
> /dev/sdc3:
> Magic : a92b4efc
> Version : 1.0
> Feature Map : 0x1
> Array UUID : b7407176:2e88c73d:2c85e940:05ff7e06
> Name : linux:0
> Creation Time : Wed Aug 1 15:45:56 2012
> Raid Level : raid1
> Raid Devices : 2
>
> Avail Dev Size : 231999344 (110.63 GiB 118.78 GB)
> Array Size : 231999344 (110.63 GiB 118.78 GB)
> Super Offset : 231999472 sectors
> State : clean
> Device UUID : f1a1f291:e10d85cc:b574c6d2:45fc0c5d
>
> Internal Bitmap : -8 sectors from superblock
> Update Time : Mon Aug 20 11:17:07 2012
> Checksum : 71bcac04 - correct
> Events : 0
>
>
> Device Role : spare
> Array State : .A ('A' == active, '.' == missing)
>
>
> # mdadm --detail /dev/md0
>
> /dev/md0:
> Version : 1.0
> Creation Time : Wed Aug 1 15:45:56 2012
> Raid Level : raid1
> Array Size : 115999672 (110.63 GiB 118.78 GB)
> Used Dev Size : 115999672 (110.63 GiB 118.78 GB)
> Raid Devices : 2
> Total Devices : 2
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Mon Aug 20 11:18:43 2012
> State : active, degraded
> Active Devices : 1
> Working Devices : 1
> Failed Devices : 1
> Spare Devices : 0
>
> Name : linux:0
> UUID : b7407176:2e88c73d:2c85e940:05ff7e06
> Events : 5946
>
> Number Major Minor RaidDevice State
> 0 0 0 0 removed
> 2 8 3 1 active sync /dev/sda3
>
> 3 8 19 - faulty spare
>
>
> Can anyone help?
>

I cannot see why that would happen.
It seems to be failing somewhere in bind_rdev_to_array, but I don't know
where.

Can you run

   strace -o /tmp/strace mdadm /dev/md0 --add /dev/sdc3

and post the output, or at least the tail end of it? There should
be an ioctl which fails and I need to know what the error was (EEXIST,
EINVAL, ENOSPC are the likely ones). But don't just post that line; include
at least the last 100 lines.

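If it helps to sift the trace before posting, the following is a rough Python sketch, not anything that ships with mdadm or strace. It assumes the usual strace text format, where a failed call ends in "= -1 ERRNO (description)". The helper names failed_ioctls and tail are made up for illustration.

```python
import re

# Assumed strace line shape for a failed call (illustrative, not captured output):
#   ioctl(4, 0x40140921, 0x7fff...) = -1 EINVAL (Invalid argument)
FAILED_CALL = re.compile(
    r"^(?P<call>\w+)\(.*\)\s+=\s+-1\s+(?P<errno>E[A-Z0-9]+)\b"
)


def failed_ioctls(lines):
    """Return (line_number, errno_name, text) for every ioctl that returned -1."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        match = FAILED_CALL.match(text)
        if match and match.group("call") == "ioctl":
            hits.append((number, match.group("errno"), text))
    return hits


def tail(lines, count=100):
    """Return the last `count` items, mirroring the 'last 100 lines' request."""
    return list(lines)[-count:]
```

To use it, read /tmp/strace into a list of lines, pass tail(lines) to failed_ioctls, and print the result. The errno name in each tuple is the value of interest, but the surrounding lines still matter, so post the whole tail rather than only the matches.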
NeilBrown