From: NeilBrown <neilb@suse.de>
To: "David C. Rankin" <drankinatty@suddenlinkmail.com>
Cc: mdraid <linux-raid@vger.kernel.org>
Subject: Re: Raid1 where Event Count off my 1 cannot assemble --force
Date: Mon, 9 Dec 2013 12:00:40 +1100
Message-ID: <20131209120040.6464b91b@notabene.brown>
In-Reply-To: <52A51122.3030604@suddenlinkmail.com>
On Sun, 08 Dec 2013 18:38:58 -0600 "David C. Rankin"
<drankinatty@suddenlinkmail.com> wrote:
> On 12/08/2013 11:57 AM, David C. Rankin wrote:
> > On 12/08/2013 04:57 AM, Mikael Abrahamsson wrote:
> >> On Sun, 8 Dec 2013, David C. Rankin wrote:
> >>
> >>> Guys,
> >>>
> >>> I have an older box that is a fax server where the Event Count for /dev/md1 is
> >>> off by 1, but the array cannot be reassembled with --assemble --force /dev/dm1
> >>> /dev/sda5 /dev/sdb5.
> >>
> >> What are the messages displayed in "dmesg" when you try to use this command?
> >>
> >
> > Mikael,
> >
> > Following the commands:
> >
> > # mdadm --stop /dev/md1
> > # mdadm --assemble --force /dev/dm1 /dev/sd[ab]5
> >
> > The messages captured in the logs are:
> >
> > Rescue Kernel: md: md1: stopped.
> > Rescue Kernel: md: unbind<sda5>
> > Rescue Kernel: md: export_rdev(sda5)
> > Rescue Kernel: md: unbind<sdb5>
> > Rescue Kernel: md: export_rdev(sdb5)
> > Rescue Kernel: md: md1: stopped.
> > Rescue Kernel: md: md1 raid array is not clean -- starting background reconstruction
> > Rescue Kernel: md: raid1: raid set md1 active with 2 out of 2 mirrors
> > Rescue Kernel: md1: bitmap file is out of date (148 < 149) -- forcing full recovery
> > Rescue Kernel: md1: bitmap file is out of date, doing full recovery
> > Rescue Kernel: md1: bitmap initialisation failed: -5
> > Rescue Kernel: md1: failed to create bitmap (-5)
> >
> >
> > That's it for the log, then on the command line I have:
> >
> > mdadm: failed to RUN_ARRAY /dev/md1: Input/Output error
> >
> > What should I try next? Don't hesitate to ask if you need any additional
> > information, I'll provide whatever is necessary. Thanks.
> >
>
> Here is additional information with --verbose given:
>
> nemtemp:~ # cat /proc/mdstat
> Personalities : [raid1]
> md2 : active raid1 sda7[0] sdb7[1]
> 221929772 blocks super 1.0 [2/2] [UU]
> bitmap: 0/424 pages [0KB], 256KB chunk
>
> md1 : inactive sda5[0] sdb5[1]
> 41945504 blocks super 1.0
>
> md0 : active raid1 sda1[0] sdb1[1]
> 104376 blocks super 1.0 [2/2] [UU]
> bitmap: 0/7 pages [0KB], 8KB chunk
>
> unused devices: <none>
>
> nemtemp:~ # mdadm --stop /dev/md1
> mdadm: stopped /dev/md1
>
> nemtemp:~ # cat /proc/mdstat
> Personalities : [raid1]
> md2 : active raid1 sda7[0] sdb7[1]
> 221929772 blocks super 1.0 [2/2] [UU]
> bitmap: 0/424 pages [0KB], 256KB chunk
>
> md0 : active raid1 sda1[0] sdb1[1]
> 104376 blocks super 1.0 [2/2] [UU]
> bitmap: 0/7 pages [0KB], 8KB chunk
>
> unused devices: <none>
>
> nemtemp:~ # mdadm --verbose --assemble --force /dev/md1 /dev/sd[ab]5
> mdadm: looking for devices for /dev/md1
> mdadm: /dev/sda5 is identified as a member of /dev/md1, slot 0.
> mdadm: /dev/sdb5 is identified as a member of /dev/md1, slot 1.
> mdadm: added /dev/sdb5 to /dev/md1 as 1
> mdadm: added /dev/sda5 to /dev/md1 as 0
> mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
>
> The log from the start attempt:
>
> Dec 9 00:16:11 Rescue kernel: md: md1 stopped.
> Dec 9 00:16:11 Rescue kernel: md: bind<sdb5>
> Dec 9 00:16:11 Rescue kernel: md: bind<sda5>
> Dec 9 00:16:11 Rescue kernel: md: md1: raid array is not clean -- starting background reconstruction
> Dec 9 00:16:11 Rescue kernel: raid1: raid set md1 active with 2 out of 2 mirrors
> Dec 9 00:16:11 Rescue kernel: md1: bitmap file is out of date (148 < 149) -- forcing full recovery
> Dec 9 00:16:11 Rescue kernel: md1: bitmap file is out of date, doing full recovery
> Dec 9 00:16:12 Rescue kernel: md1: bitmap initialisation failed: -5
> Dec 9 00:16:12 Rescue kernel: md1: failed to create bitmap (-5)
> Dec 9 00:16:12 Rescue kernel: md: pers->run() failed ...
>
> nemtemp:~ # cat /proc/mdstat
> Personalities : [raid1]
> md2 : active raid1 sda7[0] sdb7[1]
> 221929772 blocks super 1.0 [2/2] [UU]
> bitmap: 0/424 pages [0KB], 256KB chunk
>
> md1 : inactive sda5[0] sdb5[1]
> 41945504 blocks super 1.0
>
> md0 : active raid1 sda1[0] sdb1[1]
> 104376 blocks super 1.0 [2/2] [UU]
> bitmap: 0/7 pages [0KB], 8KB chunk
>
> unused devices: <none>
>
> I'm not sure how to proceed safely from here. Is there anything else I should
> try before attempting to --create the array again? If we do create the array
> with 1 drive and "missing", should I then use --add or --re-add to add the other
> drive? Also, since /dev/sda5 shows Events: 148 and /dev/sdb5 shows Events: 149,
> should I choose /dev/sdb5 as the one to preserve and let "missing" take the
> place of /dev/sda5? If so, then does the following create statement look correct:
>
> mdadm --create --verbose --level=1 --metadata=1.0 --raid-devices=2 \
> /dev/md1 /dev/sdb5 missing
>
> Should I also use --force?
>
> If attempting to assemble with "missing" and the create command gives problems
> due to the unused device still having the same minor-number, is it better to
> --zero-superblock on the device not included as "missing", or is it better to
> just unplug it and preserve the superblock data in case it is needed?
>
> Sorry for all the questions, but I just want to make sure I don't do something
> to compromise the data. With the information for both drives looking good with
> --examine, the (Update Time : Tue Nov 19 15:28:38 2013) being identical, and the
> Events being off by only 1, I can't see a reason the drives should not just
> assemble and run as it is. What say the experts?
>
> Here is the --detail and --examine information for the drives for completeness:
>
> nemtemp:~ # mdadm --detail /dev/md1
> /dev/md1:
> Version : 01.00.03
> Creation Time : Thu Aug 21 06:43:22 2008
> Raid Level : raid1
> Used Dev Size : 20972752 (20.00 GiB 21.48 GB)
> Raid Devices : 2
> Total Devices : 2
> Preferred Minor : 1
> Persistence : Superblock is persistent
>
> Update Time : Tue Nov 19 15:28:38 2013
> State : active, Not Started
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> Name : 1
> UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
> Events : 148
>
> Number Major Minor RaidDevice State
> 0 8 5 0 active sync /dev/sda5
> 1 8 21 1 active sync /dev/sdb5
>
> nemtemp:/ # mdadm -E /dev/sda5
> /dev/sda5:
> Magic : a92b4efc
> Version : 1.0
> Feature Map : 0x1
> Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
> Name : 1
> Creation Time : Thu Aug 21 06:43:22 2008
> Raid Level : raid1
> Raid Devices : 2
>
> Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
> Array Size : 41945504 (20.00 GiB 21.48 GB)
> Super Offset : 41945632 sectors
> State : clean
> Device UUID : e0c1c580:db4d853e:6fac1c8f:fb5399d7
>
> Internal Bitmap : -81 sectors from superblock
> Update Time : Tue Nov 19 15:28:38 2013
> Checksum : d37d1086 - correct
> Events : 148
>
>
> Array Slot : 0 (0, 1)
> Array State : Uu
>
> nemtemp:/ # mdadm -E /dev/sdb5
> /dev/sdb5:
> Magic : a92b4efc
> Version : 1.0
> Feature Map : 0x1
> Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
> Name : 1
> Creation Time : Thu Aug 21 06:43:22 2008
> Raid Level : raid1
> Raid Devices : 2
>
> Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
> Array Size : 41945504 (20.00 GiB 21.48 GB)
> Super Offset : 41945632 sectors
> State : active
> Device UUID : 6edfa3f8:c8c4316d:66c19315:5eda0911
>
> Internal Bitmap : -81 sectors from superblock
> Update Time : Tue Nov 19 15:28:38 2013
> Checksum : 39ef40a5 - correct
> Events : 149
>
>
> Array Slot : 1 (0, 1)
> Array State : uU
>
>
>
What version of mdadm do you have? It looks like it should be cleverer than
it is.
What happens if you add "--update=no-bitmap" to the --assemble line?
Since the bitmap seems to be causing the problem, ignoring it might help.
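A rough sketch of that suggestion, assuming the device names from the
report above (/dev/md1, /dev/sda5, /dev/sdb5). The RUN variable and the
assemble_without_bitmap function are hypothetical helpers added only for
illustration: by default the commands are printed rather than executed,
so the line can be checked before it touches the array.

```shell
#!/bin/sh
# Sketch only: retry the assembly while ignoring the internal bitmap.
# RUN is a hypothetical dry-run switch; when unset it defaults to "echo",
# so the commands are printed instead of executed.
RUN=${RUN-echo}

assemble_without_bitmap() {
    md_dev=$1
    shift
    # Stop the inactive array first so its member devices are released.
    $RUN mdadm --stop "$md_dev"
    # --update=no-bitmap asks mdadm to ignore the internal bitmap
    # while assembling the array.
    $RUN mdadm --assemble --force --update=no-bitmap "$md_dev" "$@"
}

assemble_without_bitmap /dev/md1 /dev/sda5 /dev/sdb5
```

Running it with RUN set to the empty string (RUN= sh script.sh, as root)
would execute the two mdadm commands for real; checking /proc/mdstat
afterwards should show whether md1 came up.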
NeilBrown