* Raid1 where Event Count off my 1 cannot assemble --force
@ 2013-12-08 10:18 David C. Rankin
2013-12-08 10:57 ` Mikael Abrahamsson
0 siblings, 1 reply; 15+ messages in thread
From: David C. Rankin @ 2013-12-08 10:18 UTC (permalink / raw)
To: linux-raid
Guys,
I have an older box that is a fax server where the Event Count for /dev/md1 is
off by 1, but the array cannot be reassembled with --assemble --force /dev/md1
/dev/sda5 /dev/sdb5. Per the warnings in the wiki, I'm asking for help before I
attempt to recreate the array and screw something up. Here is the relevant
information. The box is running openSuSE 11.0 (kernel 2.6.25) with mdadm 2.6.4.
This box has run flawlessly for years.
I have 3 mdraid partitions on this box:
/dev/md0 sda1/sdb1 /boot
/dev/md1 sda5/sdb5 /
/dev/md2 sda7/sdb7 /home
After booting the 11.0 install DVD into the Recovery Console, mdraid found
all of the arrays. md0 and md2 are fine; it is just md1 that is the problem:
# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda7[0] sdb7[1]
221929772 blocks super 1.0 [2/2] [UU]
bitmap: 0/424 pages [0KB], 256KB chunk
md1 : inactive sda5[0] sdb5[1]
41945504 blocks super 1.0
md0 : active raid1 sda1[0] sdb1[1]
104376 blocks super 1.0 [2/2] [UU]
bitmap: 0/7 pages [0KB], 8KB chunk
The array information on disk for both disks (sda5/sdb5) shows the exact same
Update Time (Tue Nov 19 15:28:38 2013). The only differences between the outputs
are the checksums (both shown as correct) and the Events: 148 vs. 149. The full
output of mdadm --examine /dev/sd[ab]5 is here and in a screenshot (1.7 MB) below:
/dev/sda5:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
Name : 1
Creation Time : Thu Aug 21 06:43:22 2008
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
Array Size : 41945504 (20.00 GiB 21.48 GB)
Super Offset : 41945632 sectors
State : clean
Device UUID : e8c1c580:db4d853e:6fac1c8f:fb5399d7
Internal Bitmap : -81 sectors from superblock
Update Time : Tue Nov 19 15:28:38 2013
Checksum : d37d1086 - correct
Events : 148
Array Slot : 0 (0,1)
Array State : Uu
/dev/sdb5:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
Name : 1
Creation Time : Thu Aug 21 06:43:22 2008
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
Array Size : 41945504 (20.00 GiB 21.48 GB)
Super Offset : 41945632 sectors
State : clean
Device UUID : 6edfa3f8:c8c4316d:66c19315:5eda0911
Internal Bitmap : -81 sectors from superblock
Update Time : Tue Nov 19 15:28:38 2013
Checksum : 39ef40a5 - correct
Events : 149
Array Slot : 1 (0,1)
Array State : uU
http://www.3111skyline.com/dl/screenshots/suse/mdadm-examine.jpg (1.7 Meg)
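For anyone comparing members the same way, the Events field can be pulled out of the --examine output with a small helper. This is only a sketch: the function names event_count and event_gap are made up for illustration, and it assumes the "Events : N" line format shown in the output above.

```shell
# Print the event counter found in `mdadm --examine` output read on stdin.
# Assumes a line of the form "Events : 148" (format taken from the output above).
event_count() {
    awk -F: '/^[[:space:]]*Events[[:space:]]*:/ {
        gsub(/[[:space:]]/, "", $2)   # drop the padding around the number
        print $2
        exit
    }'
}

# Print the absolute difference between two event counters.
event_gap() {
    echo $(( $1 > $2 ? $1 - $2 : $2 - $1 ))
}

# Possible use on a live system (needs root and the real devices):
#   a=$(mdadm --examine /dev/sda5 | event_count)
#   b=$(mdadm --examine /dev/sdb5 | event_count)
#   echo "event gap: $(event_gap "$a" "$b")"
```

With the values posted here (148 and 149) the gap comes out as 1.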
I have read through https://raid.wiki.kernel.org/index.php/RAID_Recovery and I
can confirm that mdadm --stop /dev/md1 stops the array and removes the device
from the information shown in cat /proc/mdstat. I have attempted to assemble and
force to get the array running, but I am left with the same Input/Output error.
What is the next proper course of action? I am new to triaging non-working raid
arrays, so all I can do is read. The next step appears to be recreating the
array and hoping it all works. Am I at that "last resort" yet, or are there a
few more tricks to try? Thank you in advance for any help you can give.
--
David C. Rankin, J.D.,P.E.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-08 10:18 Raid1 where Event Count off my 1 cannot assemble --force David C. Rankin
@ 2013-12-08 10:57 ` Mikael Abrahamsson
2013-12-08 17:57 ` David C. Rankin
0 siblings, 1 reply; 15+ messages in thread
From: Mikael Abrahamsson @ 2013-12-08 10:57 UTC (permalink / raw)
To: David C. Rankin; +Cc: linux-raid
On Sun, 8 Dec 2013, David C. Rankin wrote:
> Guys,
>
> I have an older box that is a fax server where the Event Count for /dev/md1 is
> off by 1, but the array cannot be reassembled with --assemble --force /dev/dm1
> /dev/sda5 /dev/sdb5.
What are the messages displayed in "dmesg" when you try to use this
command?
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-08 10:57 ` Mikael Abrahamsson
@ 2013-12-08 17:57 ` David C. Rankin
2013-12-09 0:38 ` David C. Rankin
0 siblings, 1 reply; 15+ messages in thread
From: David C. Rankin @ 2013-12-08 17:57 UTC (permalink / raw)
To: mdraid
On 12/08/2013 04:57 AM, Mikael Abrahamsson wrote:
> On Sun, 8 Dec 2013, David C. Rankin wrote:
>
>> Guys,
>>
>> I have an older box that is a fax server where the Event Count for /dev/md1 is
>> off by 1, but the array cannot be reassembled with --assemble --force /dev/dm1
>> /dev/sda5 /dev/sdb5.
>
> What are the messages displayed in "dmesg" when you try to use this command?
>
Mikael,
Following the commands:
# mdadm --stop /dev/md1
# mdadm --assemble --force /dev/md1 /dev/sd[ab]5
The messages captured in the logs are:
Rescue Kernel: md: md1: stopped.
Rescue Kernel: md: unbind<sda5>
Rescue Kernel: md: export_rdev(sda5)
Rescue Kernel: md: unbind<sdb5>
Rescue Kernel: md: export_rdev(sdb5)
Rescue Kernel: md: md1: stopped.
Rescue Kernel: md: md1 raid array is not clean -- starting background reconstruction
Rescue Kernel: md: raid1: raid set md1 active with 2 out of 2 mirrors
Rescue Kernel: md1: bitmap file is out of date (148 < 149) -- forcing full recovery
Rescue Kernel: md1: bitmap file is out of date, doing full recovery
Rescue Kernel: md1: bitmap initialisation failed: -5
Rescue Kernel: md1: failed to create bitmap (-5)
That's it for the log, then on the command line I have:
mdadm: failed to RUN_ARRAY /dev/md1: Input/Output error
What should I try next? Don't hesitate to ask if you need any additional
information, I'll provide whatever is necessary. Thanks.
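For reference, the -5 in the bitmap messages is a negated kernel errno, and errno 5 on Linux is EIO, whose text is the same "Input/output error" that mdadm prints. A minimal sketch of that mapping follows; kernel_errno_name is a hypothetical helper (not part of mdadm), and it only covers a handful of common codes.

```shell
# Translate a (possibly negative) kernel return code into its errno name.
# Only a few common Linux errno values are listed; anything else is "unknown".
kernel_errno_name() {
    # Strip a leading minus sign so both "-5" and "5" are accepted.
    case "${1#-}" in
        1)  echo "EPERM" ;;
        2)  echo "ENOENT" ;;
        5)  echo "EIO" ;;
        12) echo "ENOMEM" ;;
        16) echo "EBUSY" ;;
        22) echo "EINVAL" ;;
        *)  echo "unknown" ;;
    esac
}

# Example: the code from "bitmap initialisation failed: -5"
#   kernel_errno_name -5
```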
--
David C. Rankin, J.D.,P.E.
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-08 17:57 ` David C. Rankin
@ 2013-12-09 0:38 ` David C. Rankin
2013-12-09 0:52 ` Adam Goryachev
2013-12-09 1:00 ` NeilBrown
0 siblings, 2 replies; 15+ messages in thread
From: David C. Rankin @ 2013-12-09 0:38 UTC (permalink / raw)
To: mdraid
On 12/08/2013 11:57 AM, David C. Rankin wrote:
> On 12/08/2013 04:57 AM, Mikael Abrahamsson wrote:
>> On Sun, 8 Dec 2013, David C. Rankin wrote:
>>
>>> Guys,
>>>
>>> I have an older box that is a fax server where the Event Count for /dev/md1 is
>>> off by 1, but the array cannot be reassembled with --assemble --force /dev/dm1
>>> /dev/sda5 /dev/sdb5.
>>
>> What are the messages displayed in "dmesg" when you try to use this command?
>>
>
> Mikael,
>
> Following the commands:
>
> # mdadm --stop /dev/md1
> # mdadm --assemble --force /dev/dm1 /dev/sd[ab]5
>
> The messages captured in the logs are:
>
> Rescue Kernel: md: md1: stopped.
> Rescue Kernel: md: unbind<sda5>
> Rescue Kernel: md: export_rdev(sda5)
> Rescue Kernel: md: unbind<sdb5>
> Rescue Kernel: md: export_rdev(sdb5)
> Rescue Kernel: md: md1: stopped.
> Rescue Kernel: md: md1 raid array is not clean -- starting background reconstruction
> Rescue Kernel: md: raid1: raid set md1 active with 2 out of 2 mirrors
> Rescue Kernel: md1: bitmap file is out of date (148 < 149) -- forcing full recovery
> Rescue Kernel: md1: bitmap file is out of date, doing full recovery
> Rescue Kernel: md1: bitmap initialisation failed: -5
> Rescue Kernel: md1: failed to create bitmap (-5)
>
>
> That's it for the log, then on the command line I have:
>
> mdadm: failed to RUN_ARRAY /dev/md1: Input/Output error
>
> What should I try next? Don't hesitate to ask if you need any additional
> information, I'll provide whatever is necessary. Thanks.
>
Here is additional information with --verbose given:
nemtemp:~ # cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda7[0] sdb7[1]
221929772 blocks super 1.0 [2/2] [UU]
bitmap: 0/424 pages [0KB], 256KB chunk
md1 : inactive sda5[0] sdb5[1]
41945504 blocks super 1.0
md0 : active raid1 sda1[0] sdb1[1]
104376 blocks super 1.0 [2/2] [UU]
bitmap: 0/7 pages [0KB], 8KB chunk
unused devices: <none>
nemtemp:~ # mdadm --stop /dev/md1
mdadm: stopped /dev/md1
nemtemp:~ # cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda7[0] sdb7[1]
221929772 blocks super 1.0 [2/2] [UU]
bitmap: 0/424 pages [0KB], 256KB chunk
md0 : active raid1 sda1[0] sdb1[1]
104376 blocks super 1.0 [2/2] [UU]
bitmap: 0/7 pages [0KB], 8KB chunk
unused devices: <none>
nemtemp:~ # mdadm --verbose --assemble --force /dev/md1 /dev/sd[ab]5
mdadm: looking for devices for /dev/md1
mdadm: /dev/sda5 is identified as a member of /dev/md1, slot 0.
mdadm: /dev/sdb5 is identified as a member of /dev/md1, slot 1.
mdadm: added /dev/sdb5 to /dev/md1 as 1
mdadm: added /dev/sda5 to /dev/md1 as 0
mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
The log from the start attempt:
Dec 9 00:16:11 Rescue kernel: md: md1 stopped.
Dec 9 00:16:11 Rescue kernel: md: bind<sdb5>
Dec 9 00:16:11 Rescue kernel: md: bind<sda5>
Dec 9 00:16:11 Rescue kernel: md: md1: raid array is not clean -- starting
background reconstruction
Dec 9 00:16:11 Rescue kernel: raid1: raid set md1 active with 2 out of 2 mirrors
Dec 9 00:16:11 Rescue kernel: md1: bitmap file is out of date (148 < 149) --
forcing full recovery
Dec 9 00:16:11 Rescue kernel: md1: bitmap file is out of date, doing full recovery
Dec 9 00:16:12 Rescue kernel: md1: bitmap initialisation failed: -5
Dec 9 00:16:12 Rescue kernel: md1: failed to create bitmap (-5)
Dec 9 00:16:12 Rescue kernel: md: pers->run() failed ...
nemtemp:~ # cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda7[0] sdb7[1]
221929772 blocks super 1.0 [2/2] [UU]
bitmap: 0/424 pages [0KB], 256KB chunk
md1 : inactive sda5[0] sdb5[1]
41945504 blocks super 1.0
md0 : active raid1 sda1[0] sdb1[1]
104376 blocks super 1.0 [2/2] [UU]
bitmap: 0/7 pages [0KB], 8KB chunk
unused devices: <none>
I'm not sure how to proceed safely from here. Is there anything else I should
try before attempting to --create the array again? If we do create the array
with 1 drive and "missing", should I then use --add or --re-add to add the other
drive? Also, since /dev/sda5 shows Events: 148 and /dev/sdb5 shows Events: 149,
should I choose /dev/sdb5 as the one to preserve and let "missing" take the
place of /dev/sda5? If so, then does the following create statement look correct:
mdadm --create --verbose --level=1 --metadata=1.0 --raid-devices=2 \
/dev/md1 /dev/sdb5 missing
Should I also use --force?
If creating with "missing" gives problems because the unused device still has
the same minor number, is it better to --zero-superblock the device that was
left out, or is it better to just unplug it and preserve the superblock data
in case it is needed?
Sorry for all the questions, but I just want to make sure I don't do something
to compromise the data. With the information for both drives looking good with
--examine, the (Update Time : Tue Nov 19 15:28:38 2013) being identical, and the
Events being off by only 1, I can't see a reason the drives should not just
assemble and run as it is. What say the experts?
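As a sketch of the reasoning behind picking /dev/sdb5: the member with the higher event count is normally the one whose superblock was updated most recently. The helper below is hypothetical (freshest_member is not an mdadm feature), and the "device events" input format is an assumption made for the example.

```shell
# Read "device events" pairs on stdin, one per line, and print the device
# with the highest event count.
freshest_member() {
    sort -k2,2n | tail -n 1 | awk '{ print $1 }'
}

# Example with the counts reported in this thread:
#   printf '/dev/sda5 148\n/dev/sdb5 149\n' | freshest_member
```

This only ranks the members; it does not say whether the stale one can be safely re-added.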
Here is the --detail and --examine information for the drives for completeness:
nemtemp:~ # mdadm --detail /dev/md1
/dev/md1:
Version : 01.00.03
Creation Time : Thu Aug 21 06:43:22 2008
Raid Level : raid1
Used Dev Size : 20972752 (20.00 GiB 21.48 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Tue Nov 19 15:28:38 2013
State : active, Not Started
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : 1
UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
Events : 148
Number Major Minor RaidDevice State
0 8 5 0 active sync /dev/sda5
1 8 21 1 active sync /dev/sdb5
nemtemp:/ # mdadm -E /dev/sda5
/dev/sda5:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
Name : 1
Creation Time : Thu Aug 21 06:43:22 2008
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
Array Size : 41945504 (20.00 GiB 21.48 GB)
Super Offset : 41945632 sectors
State : clean
Device UUID : e0c1c580:db4d853e:6fac1c8f:fb5399d7
Internal Bitmap : -81 sectors from superblock
Update Time : Tue Nov 19 15:28:38 2013
Checksum : d37d1086 - correct
Events : 148
Array Slot : 0 (0, 1)
Array State : Uu
nemtemp:/ # mdadm -E /dev/sdb5
/dev/sdb5:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
Name : 1
Creation Time : Thu Aug 21 06:43:22 2008
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
Array Size : 41945504 (20.00 GiB 21.48 GB)
Super Offset : 41945632 sectors
State : active
Device UUID : 6edfa3f8:c8c4316d:66c19315:5eda0911
Internal Bitmap : -81 sectors from superblock
Update Time : Tue Nov 19 15:28:38 2013
Checksum : 39ef40a5 - correct
Events : 149
Array Slot : 1 (0, 1)
Array State : uU
--
David C. Rankin, J.D.,P.E.
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 0:38 ` David C. Rankin
@ 2013-12-09 0:52 ` Adam Goryachev
2013-12-09 2:38 ` David C. Rankin
2013-12-09 1:00 ` NeilBrown
1 sibling, 1 reply; 15+ messages in thread
From: Adam Goryachev @ 2013-12-09 0:52 UTC (permalink / raw)
To: David C. Rankin, mdraid
On 09/12/13 11:38, David C. Rankin wrote:
> On 12/08/2013 11:57 AM, David C. Rankin wrote:
>> On 12/08/2013 04:57 AM, Mikael Abrahamsson wrote:
>>> On Sun, 8 Dec 2013, David C. Rankin wrote:
>>>
>>>> Guys,
>>>>
>>>> I have an older box that is a fax server where the Event Count for /dev/md1 is
>>>> off by 1, but the array cannot be reassembled with --assemble --force /dev/dm1
>>>> /dev/sda5 /dev/sdb5.
> Here is additional information with --verbose given:
Have you tried this:
mdadm --verbose --assemble /dev/md1 /dev/sdb5
mdadm --manage /dev/md1 --run
Being raid1, you should be able to use only a single device....
BTW, chances to recover your data should be exceptional, as long as you
don't do anything too silly. You should even be able to mount the device
directly (read-only):
mount -o ro /dev/sdb5 /mnt
(Depending on whether the content is a filesystem.)
Then you can just backup the data, create a new array, and restore the
data. Depending on data and size this might even be a better option...
> Here is the --detail and --examine information for the drives for completeness:
>
> nemtemp:~ # mdadm --detail /dev/md1
> /dev/md1:
> Version : 01.00.03
> Creation Time : Thu Aug 21 06:43:22 2008
> Raid Level : raid1
> Used Dev Size : 20972752 (20.00 GiB 21.48 GB)
> Raid Devices : 2
> Total Devices : 2
> Preferred Minor : 1
> Persistence : Superblock is persistent
>
> Update Time : Tue Nov 19 15:28:38 2013
> State : active, Not Started
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> Name : 1
> UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
> Events : 148
>
> Number Major Minor RaidDevice State
> 0 8 5 0 active sync /dev/sda5
> 1 8 21 1 active sync /dev/sdb5
>
> nemtemp:/ # mdadm -E /dev/sda5
> /dev/sda5:
> Magic : a92b4efc
> Version : 1.0
> Feature Map : 0x1
> Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
> Name : 1
> Creation Time : Thu Aug 21 06:43:22 2008
> Raid Level : raid1
> Raid Devices : 2
>
> Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
> Array Size : 41945504 (20.00 GiB 21.48 GB)
> Super Offset : 41945632 sectors
> State : clean
> Device UUID : e0c1c580:db4d853e:6fac1c8f:fb5399d7
>
> Internal Bitmap : -81 sectors from superblock
> Update Time : Tue Nov 19 15:28:38 2013
> Checksum : d37d1086 - correct
> Events : 148
>
>
> Array Slot : 0 (0, 1)
> Array State : Uu
>
> nemtemp:/ # mdadm -E /dev/sdb5
> /dev/sdb5:
> Magic : a92b4efc
> Version : 1.0
> Feature Map : 0x1
> Array UUID : e45cfbeb:77c2b93b:43d3d214:390d0f25
> Name : 1
> Creation Time : Thu Aug 21 06:43:22 2008
> Raid Level : raid1
> Raid Devices : 2
>
> Avail Dev Size : 41945504 (20.00 GiB 21.48 GB)
> Array Size : 41945504 (20.00 GiB 21.48 GB)
> Super Offset : 41945632 sectors
> State : active
> Device UUID : 6edfa3f8:c8c4316d:66c19315:5eda0911
>
> Internal Bitmap : -81 sectors from superblock
> Update Time : Tue Nov 19 15:28:38 2013
> Checksum : 39ef40a5 - correct
> Events : 149
>
>
> Array Slot : 1 (0, 1)
> Array State : uU
BTW, the bitmap location looks.... strange...
Regards,
Adam
--
Adam Goryachev Website Managers www.websitemanagers.com.au
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 0:38 ` David C. Rankin
2013-12-09 0:52 ` Adam Goryachev
@ 2013-12-09 1:00 ` NeilBrown
2013-12-09 4:28 ` David C. Rankin
1 sibling, 1 reply; 15+ messages in thread
From: NeilBrown @ 2013-12-09 1:00 UTC (permalink / raw)
To: David C. Rankin; +Cc: mdraid
On Sun, 08 Dec 2013 18:38:58 -0600 "David C. Rankin"
<drankinatty@suddenlinkmail.com> wrote:
> nemtemp:~ # mdadm --verbose --assemble --force /dev/md1 /dev/sd[ab]5
> mdadm: looking for devices for /dev/md1
> mdadm: /dev/sda5 is identified as a member of /dev/md1, slot 0.
> mdadm: /dev/sdb5 is identified as a member of /dev/md1, slot 1.
> mdadm: added /dev/sdb5 to /dev/md1 as 1
> mdadm: added /dev/sda5 to /dev/md1 as 0
> mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
>
> The log from the start attempt:
>
> Dec 9 00:16:11 Rescue kernel: md: md1 stopped.
> Dec 9 00:16:11 Rescue kernel: md: bind<sdb5>
> Dec 9 00:16:11 Rescue kernel: md: bind<sda5>
> Dec 9 00:16:11 Rescue kernel: md: md1: raid array is not clean -- starting
> background reconstruction
> Dec 9 00:16:11 Rescue kernel: raid1: raid set md1 active with 2 out of 2 mirrors
> Dec 9 00:16:11 Rescue kernel: md1: bitmap file is out of date (148 < 149) --
> forcing full recovery
> Dec 9 00:16:11 Rescue kernel: md1: bitmap file is out of date, doing full recovery
> Dec 9 00:16:12 Rescue kernel: md1: bitmap initialisation failed: -5
> Dec 9 00:16:12 Rescue kernel: md1: failed to create bitmap (-5)
> Dec 9 00:16:12 Rescue kernel: md: pers->run() failed ...
What version of mdadm do you have? It looks like it should be cleverer than
it is.
What if you add "--update=no-bitmap" to the --assemble line?
As the bitmap seems to be causing problem, ignoring it might help.
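Since the available --update options depend on the mdadm release, it can help to report the exact version first. The snippet below is a sketch that assumes the usual banner of the form "mdadm - vX.Y.Z - date"; the function name mdadm_version is made up for illustration.

```shell
# Extract the bare version number from the mdadm version banner on stdin.
# Assumes a banner such as "mdadm - v2.6.4 - <release date>".
mdadm_version() {
    sed -n 's/^mdadm - v\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Possible use (2>&1 because the banner may be written to stderr):
#   mdadm --version 2>&1 | mdadm_version
```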
NeilBrown
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 0:52 ` Adam Goryachev
@ 2013-12-09 2:38 ` David C. Rankin
2013-12-09 3:12 ` Adam Goryachev
0 siblings, 1 reply; 15+ messages in thread
From: David C. Rankin @ 2013-12-09 2:38 UTC (permalink / raw)
To: mdraid
On 12/08/2013 06:52 PM, Adam Goryachev wrote:
> Have you tried this:
>
> mdadm --verbose --assemble /dev/md1 /dev/sdb5
> mdadm --manage /dev/md1 --run
>
> Being raid1, you should be able to use only a single device....
>
>
> BTW, chances to recover your data should be exceptional, as long as you don't do
> anything too silly. You should even be able to mount the device directly
> (read-only):
> mount -o ro /dev/sdb5 /mnt
>
> (Depending on the content is a filesystem).
> Then you can just backup the data, create a new array, and restore the data.
> Depending on data and size this might even be a better option...
Adam,
Thank you for your suggestions. Here is the output from attempting what you
suggested. The version is old (2.6.4); it is the one on the openSuSE 11.0 install DVD:
nemtemp:/mnt # mdadm --verbose --assemble /dev/md1 /dev/sdb5
mdadm: looking for devices for /dev/md1
mdadm: /dev/sdb5 is identified as a member of /dev/md1, slot 1.
mdadm: no uptodate device for slot 0 of /dev/md1
mdadm: added /dev/sdb5 to /dev/md1 as 1
mdadm: /dev/md1 assembled from 1 drive - need all 2 to start it (use --run to
insist).
nemtemp:/mnt # cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda7[0] sdb7[1]
221929772 blocks super 1.0 [2/2] [UU]
bitmap: 0/424 pages [0KB], 256KB chunk
md1 : inactive sdb5[1](S)
20972752 blocks super 1.0
md0 : active raid1 sda1[0] sdb1[1]
104376 blocks super 1.0 [2/2] [UU]
bitmap: 0/7 pages [0KB], 8KB chunk
unused devices: <none>
nemtemp:/mnt # mdadm --run --verbose /dev/md1
mdadm: failed to run array /dev/md1: Input/output error
nemtemp:/mnt # tail /var/log/messages
Dec 9 02:30:20 Rescue kernel: raid1: raid set md1 active with 1 out of 2 mirrors
Dec 9 02:30:20 Rescue kernel: md1: bitmap file is out of date (148 < 149) --
forcing full recovery
Dec 9 02:30:20 Rescue kernel: md1: bitmap file is out of date, doing full recovery
Dec 9 02:30:20 Rescue kernel: md1: bitmap initialisation failed: -5
Dec 9 02:30:20 Rescue kernel: md1: failed to create bitmap (-5)
Dec 9 02:30:20 Rescue kernel: md: pers->run() failed ...
I could boot with a newer recovery disk and see if a newer version of mdadm
would do things differently. I was using the original just to make sure I didn't
outsmart myself by using a newer version of mdadm than I would be running
after repairing /dev/md1.
Hooray! I can mount the thing as -t ext3:
mdadm: stopped /dev/md1
nemtemp:/mnt # mount -o ro /dev/sdb5 /mnt/sdb/
mount: unknown filesystem type 'linux_raid_member'
nemtemp:/mnt # mount -t ext3 -o ro /dev/sdb5 /mnt/sdb/
nemtemp:/mnt # l sdb
total 116
drwxr-xr-x 21 root root 4096 2013-01-25 17:06 ./
drwxr-xr-x 7 root root 140 2013-12-08 06:38 ../
drwxr-xr-x 2 root root 4096 2010-12-05 06:43 bin/
drwxr-xr-x 2 root root 4096 2008-08-21 06:48 boot/
drwxr-xr-x 2 root root 4096 2008-08-22 01:54 data/
drwxr-xr-x 5 root root 4096 2008-08-21 06:48 dev/
drwxr-xr-x 129 root root 12288 2013-11-16 13:23 etc/
drwxr-xr-x 2 root root 4096 2008-08-21 06:48 home/
drwxr-xr-x 14 root root 12288 2011-01-14 20:13 lib/
drwx------ 2 root root 16384 2008-08-21 06:43 lost+found/
drwxr-xr-x 2 root root 4096 2009-07-28 22:39 media/
drwxr-xr-x 8 root root 4096 2010-12-21 18:05 mnt/
drwxr-xr-x 5 root root 4096 2008-07-03 21:16 opt/
drwxr-xr-x 3 root root 4096 2008-08-21 06:48 proc/
drwx------ 24 root root 4096 2013-10-01 20:58 root/
drwxr-xr-x 3 root root 12288 2010-12-27 23:15 sbin/
drwxr-xr-x 4 root root 4096 2008-09-11 07:26 srv/
drwxr-xr-x 3 root root 4096 2008-08-21 06:48 sys/
drwxrwxrwt 7 root root 4096 2013-11-19 15:15 tmp/
drwxr-xr-x 12 root root 4096 2010-01-24 01:41 usr/
drwxr-xr-x 15 root root 4096 2009-07-02 06:37 var/
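With the member mounted read-only like this, the backup Adam mentioned could be sketched as a plain tar copy. This is an illustration only: backup_tree is a made-up helper and /mnt/backup/md1-root is a hypothetical destination on some other disk.

```shell
# Copy the directory tree at SRC into DEST, keeping permissions (and
# ownership, when run as root). SRC is only read, never written.
backup_tree() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    (cd "$src" && tar -cf - .) | (cd "$dest" && tar -xpf -)
}

# Possible use, assuming /dev/sdb5 is mounted read-only on /mnt/sdb as above
# and /mnt/backup is a separate disk with enough free space (hypothetical path):
#   backup_tree /mnt/sdb /mnt/backup/md1-root
```

The partition is about 20 GB, so the destination needs at least that much free space in the worst case.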
Now that I can mount it, how in the heck do I get the raid put back together?
It seems really simple, but I'm stuck... Try with a newer mdadm?
> BTW, the bitmap location looks.... strange...
I thought so too, but checking the other arrays, /dev/md2 has a negative
number as well:
nemtemp:/mnt # mdadm -E /dev/sda7
/dev/sda7:
<snip>
Internal Bitmap : -213 sectors from superblock
Update Time : Mon Dec 9 02:14:18 2013
--
David C. Rankin, J.D.,P.E.
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 2:38 ` David C. Rankin
@ 2013-12-09 3:12 ` Adam Goryachev
2013-12-09 3:40 ` David C. Rankin
0 siblings, 1 reply; 15+ messages in thread
From: Adam Goryachev @ 2013-12-09 3:12 UTC (permalink / raw)
To: David C. Rankin, mdraid
On 09/12/13 13:38, David C. Rankin wrote:
> On 12/08/2013 06:52 PM, Adam Goryachev wrote:
>> Have you tried this:
>>
>> mdadm --verbose --assemble /dev/md1 /dev/sdb5
>> mdadm --manage /dev/md1 --run
>>
>> Being raid1, you should be able to use only a single device....
>>
>>
>> BTW, chances to recover your data should be exceptional, as long as you don't do
>> anything too silly. You should even be able to mount the device directly
>> (read-only):
>> mount -o ro /dev/sdb5 /mnt
>>
>> (Depending on the content is a filesystem).
>> Then you can just backup the data, create a new array, and restore the data.
>> Depending on data and size this might even be a better option...
> nemtemp:/mnt # mount -o ro /dev/sdb5 /mnt/sdb/
> mount: unknown filesystem type 'linux_raid_member'
> nemtemp:/mnt # mount -t ext3 -o ro /dev/sdb5 /mnt/sdb/
>
>
> Now since I can mount it, how in the heck do I get the raid put back together.
> Seems really simple, but I'm stuck... Try with a newer mdadm?
>
Probably the best option is to follow Neil's advice to use mdadm from
git....
The alternative, as I mentioned, is to back up the data, re-create the raid
+ filesystem, and then restore the data.
>> BTW, the bitmap location looks.... strange...
> I thought so too, but checking the other arrays, /dev/md2 has a negative
> number as well:
>
> nemtemp:/mnt # mdadm -E /dev/sda7
> /dev/sda7:
> <snip>
> Internal Bitmap : -213 sectors from superblock
> Update Time : Mon Dec 9 02:14:18 2013
It looked strange when I first saw it, but now that I think about it, it
is probably correct: 1.0 metadata is at the very end of the device, so the
bitmap is probably placed before the metadata, hence the negative offset.
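The arithmetic can be checked with the md1 numbers from the --examine output earlier in the thread (Super Offset 41945632 sectors, Internal Bitmap -81 sectors from superblock). This is a sketch that assumes the bitmap offset is measured from the start of the superblock, as the wording of that field suggests; bitmap_start_sector is a made-up name.

```shell
# Print the absolute start sector of the internal bitmap, given the
# superblock offset and the (signed) bitmap offset, both in sectors.
bitmap_start_sector() {
    echo $(( $1 + $2 ))
}

# md1 values from --examine: Super Offset 41945632, Internal Bitmap -81
#   bitmap_start_sector 41945632 -81
```

For md1 this gives sector 41945551, which lies after the 41945504 sectors reported as Avail Dev Size and before the superblock at 41945632, so the bitmap sits in the gap between the data and the metadata.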
--
Adam Goryachev Website Managers www.websitemanagers.com.au
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 3:12 ` Adam Goryachev
@ 2013-12-09 3:40 ` David C. Rankin
0 siblings, 0 replies; 15+ messages in thread
From: David C. Rankin @ 2013-12-09 3:40 UTC (permalink / raw)
To: mdraid
On 12/08/2013 09:12 PM, Adam Goryachev wrote:
> Probably the best option is to follow Neil's advise to use mdadm from git....
>
> The alternative as I mentioned is to backup the data, re-create the raid +
> filesystem, and then restore the data.
>>> BTW, the bitmap location looks.... strange...
>> I thought so too, but checking the other arrays, /dev/md2 has a negative
>> number as well:
>>
>> nemtemp:/mnt # mdadm -E /dev/sda7
>> /dev/sda7:
>> <snip>
>> Internal Bitmap : -213 sectors from superblock
>> Update Time : Mon Dec 9 02:14:18 2013
> It looked strange when I first saw it, but now that I think about it, it is
> probably correct: 1.0 metadata sits at the very end of the device, so the
> bitmap is probably stored before the metadata, hence the negative offset.
>
I have an install CD with mdadm 3.3.2 on it; I'll give that a go and see what it
does with the array. The partition is only 20G, so I can just copy it to a new
drive as a backup. It almost seems like I should be able to change the Events
number with a low-level tool on one drive and see if that would fix the problem.
I suspect the problem is with the older mdadm. Searching, there were several
posts very similar to mine in the 2008/2009 time frame. The older mdadm probably
does not handle recovery very well.
If I do get the drive to assemble/run under mdadm 3.3.2, then is there anything
I need to do before shutting down the box to ensure it will work again under 2.6
so I can at least boot it before updating mdadm? If it assembles under 3.3.2,
then I should be able to assemble/run it under 2.6.4 since the Event count would
match -- right?
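A minimal sketch of how the Event counts on the two members could be compared after a repair. It only parses text in the shape of the "Events : N" line that mdadm --examine prints; the function names events_of and events_match are hypothetical, and the sample strings below just mimic the 148/149 mismatch reported at the start of the thread.

```shell
#!/bin/sh
# Hypothetical helper: print the Events value found in mdadm --examine
# style text read from stdin (first matching line only).
events_of() {
    awk -F: '/^[ \t]*Events[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Hypothetical helper: succeed only if both texts carry the same,
# non-empty Events value.
events_match() {
    a=$(printf '%s\n' "$1" | events_of)
    b=$(printf '%s\n' "$2" | events_of)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Sample lines standing in for real --examine output.
first='         Events : 148'
second='         Events : 149'
if events_match "$first" "$second"; then
    echo "event counts match"
else
    echo "event counts differ"
fi
```

On a live system the two arguments could be captured with something like "$(mdadm --examine /dev/sda5)" and "$(mdadm --examine /dev/sdb5)", run as root.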
--
David C. Rankin, J.D.,P.E.
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 1:00 ` NeilBrown
@ 2013-12-09 4:28 ` David C. Rankin
2013-12-09 4:46 ` NeilBrown
0 siblings, 1 reply; 15+ messages in thread
From: David C. Rankin @ 2013-12-09 4:28 UTC (permalink / raw)
To: mdraid
On 12/08/2013 07:00 PM, NeilBrown wrote:
> What version of mdadm do you have? It looks like it should be cleverer than
> it is.
>
> What if you add "--update=no-bitmap" to the --assemble line? As the bitmap
> seems to be causing problems, ignoring it might help.
>
> NeilBrown
I tried that too Neil, it said:
nemtemp:/mnt # mdadm --verbose --assemble --force --update=no-bitmap /dev/md1 /dev/sd[ab]5
mdadm: '--update=no-bitmap' is invalid. Valid --update options are:
'sparc2.2', 'super-minor', 'uuid', 'name', 'resync',
'summaries', 'homehost', 'byteorder', 'devicesize'.
Like in my reply to Adam, I think the age of mdadm may be the issue. The
openSuSE 11.0 install DVD is pretty old ;-) It just may not be handling the
--assemble correctly given the Event count difference, and there doesn't seem to
be a way to just tell it:
"The data is correct on both disks, just assemble and run it and be quiet!"
I'll try booting an Arch install CD with 3.3.2 on it and report back. Thank you
and Adam for your help.
If it does assemble with mdadm 3.3.2, anything I need to do on the assembled
array to make sure it stays that way? fsck? Any way to tell it to make sure it
is in sync so it will boot under the openSuSE 11.0 version of mdadm?
--
David C. Rankin, J.D.,P.E.
* Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 4:28 ` David C. Rankin
@ 2013-12-09 4:46 ` NeilBrown
2013-12-09 5:20 ` [SOLVED] " David C. Rankin
0 siblings, 1 reply; 15+ messages in thread
From: NeilBrown @ 2013-12-09 4:46 UTC (permalink / raw)
To: David C. Rankin; +Cc: mdraid
On Sun, 08 Dec 2013 22:28:11 -0600 "David C. Rankin"
<drankinatty@suddenlinkmail.com> wrote:
> On 12/08/2013 07:00 PM, NeilBrown wrote:
> > What version of mdadm do you have? It looks like it should be cleverer than
> > it is.
> >
> > What if you add "--update=no-bitmap" to the --assemble line? As the bitmap
> > seems to be causing problems, ignoring it might help.
> >
> > NeilBrown
>
> I tried that too Neil, it said:
>
> nemtemp:/mnt # mdadm --verbose --assemble --force --update=no-bitmap /dev/md1
> /dev/sd[ab]5
> mdadm: '--update=no-bitmap' is invalid. Valid --update options are:
> 'sparc2.2', 'super-minor', 'uuid', 'name', 'resync',
> 'summaries', 'homehost', 'byteorder', 'devicesize'.
>
> Like in my reply to Adam, I think the age of mdadm may be the issue. The
> openSuSE 11.0 install DVD is pretty old ;-) It just may not be handling the
> --assemble correctly given the Event count difference, and there doesn't seem to
> be a way to just tell it:
>
> "The data is correct on both disks, just assemble and run it and be quiet!"
>
> I'll try booting an Arch install CD with 3.3.2 on it and report back. Thank you
> and Adam for your help.
>
> If it does assemble with mdadm 3.3.2, anything I need to do on the assembled
> array to make sure it stays that way? fsck? Any way to tell it to make sure it
> is in sync so it will boot under the openSuSE 11.0 version of mdadm?
>
Once you get any kernel to successfully assemble the array, any other
kernel/mdadm should be able to as well.
NeilBrown
* [SOLVED] Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 4:46 ` NeilBrown
@ 2013-12-09 5:20 ` David C. Rankin
2013-12-09 5:40 ` NeilBrown
0 siblings, 1 reply; 15+ messages in thread
From: David C. Rankin @ 2013-12-09 5:20 UTC (permalink / raw)
To: mdraid
On 12/08/2013 10:46 PM, NeilBrown wrote:
> Once you get any kernel to successfully assemble the array, any other
> kernel/mdadm should be able to as well.
>
> NeilBrown
After 2 days of pulling my hair out, the answer was as simple as popping in the
Arch install CD and rebooting!!
After it booted, I checked cat /proc/mdstat and /dev/md1 (md125) on Arch is
active and re-syncing. Perfect.
That was a whole lot of work to find out that it was an old mdadm causing all
the problems -- live and learn...
Perhaps we should add "get the latest release of mdadm" to the Raid Recovery
page with the explanation that sometimes older versions of mdadm just do not
know how to handle simple issues that arise, such as differing Event counts.
Using the latest mdadm can solve those problems immediately.
--
David C. Rankin, J.D.,P.E.
* Re: [SOLVED] Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 5:20 ` [SOLVED] " David C. Rankin
@ 2013-12-09 5:40 ` NeilBrown
2013-12-09 7:40 ` Mikael Abrahamsson
0 siblings, 1 reply; 15+ messages in thread
From: NeilBrown @ 2013-12-09 5:40 UTC (permalink / raw)
To: David C. Rankin; +Cc: mdraid
On Sun, 08 Dec 2013 23:20:27 -0600 "David C. Rankin"
<drankinatty@suddenlinkmail.com> wrote:
> On 12/08/2013 10:46 PM, NeilBrown wrote:
> > Once you get any kernel to successfully assemble the array, any other
> > kernel/mdadm should be able to as well.
> >
> > NeilBrown
>
> After 2 days of pulling my hair out, the answer was as simple as popping in the
> Arch install CD and rebooting!!
>
> After it booted, I checked cat /proc/mdstat and /dev/md1 (md125) on Arch is
> active and re-syncing. Perfect.
>
> That was a whole lot of work to find out that it was an old mdadm causing all
> the problems -- live and learn...
>
> Perhaps we should add "get the latest release of mdadm" to the Raid Recovery
> page with the explanation that sometimes older versions of mdadm just do not
> know how to handle simple issues that arise, such as differing Event counts.
> Using the latest mdadm can solve those problems immediately.
Glad that it is now working for you!
Isn't "Make sure you have latest release" always the second thing to check
(after checking that you are actually using the command properly:-)??
But yes, it doesn't hurt to add it to a wiki page somewhere.
NeilBrown
* Re: [SOLVED] Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 5:40 ` NeilBrown
@ 2013-12-09 7:40 ` Mikael Abrahamsson
2013-12-09 21:28 ` David C. Rankin
0 siblings, 1 reply; 15+ messages in thread
From: Mikael Abrahamsson @ 2013-12-09 7:40 UTC (permalink / raw)
To: NeilBrown; +Cc: David C. Rankin, mdraid
On Mon, 9 Dec 2013, NeilBrown wrote:
> But yes, it doesn't hurt to add it to a wiki page somewhere.
I added this text to the wiki:
"There are bugs in older versions of mdadm, and a lot of "stable"
operating system releases ship with really old mdadm versions. Recent
versions are 3.2.x and 3.3.x. It's advisable if you run into problems
assembling your raid to upgrade to the latest git version of mdadm. If you
can get the raid to successfully assemble and recover with the git
version, then it's fine to use your old mdadm version again for normal
system operation. Newer mdadm doesn't make any changes to the array that
aren't backwards compatible."
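A small sketch of how one might check whether an installed mdadm counts as "old" in the sense of the wiki text above. The cutoff of 3.2 is only an assumption taken from the "Recent versions are 3.2.x and 3.3.x" wording, the function name version_lt is hypothetical, and the version number is passed in by hand rather than parsed from mdadm itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed if dotted version $1 is older than $2.
# Fields are compared numerically, left to right; missing fields count as 0.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = x[i] + 0; yi = y[i] + 0
            if (xi < yi) exit 0
            if (xi > yi) exit 1
        }
        exit 1   # equal versions are not "older"
    }'
}

# 2.6.4 is the version shipped on the openSuSE 11.0 box in this thread;
# 3.2 is the assumed cutoff described above.
if version_lt 2.6.4 3.2; then
    echo "mdadm 2.6.4 predates 3.2: try a newer mdadm before recreating anything"
fi
```

The comparison is purely numeric per field, so suffixes such as "-rc1" are not handled.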
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: [SOLVED] Re: Raid1 where Event Count off my 1 cannot assemble --force
2013-12-09 7:40 ` Mikael Abrahamsson
@ 2013-12-09 21:28 ` David C. Rankin
0 siblings, 0 replies; 15+ messages in thread
From: David C. Rankin @ 2013-12-09 21:28 UTC (permalink / raw)
To: mdraid
On 12/09/2013 01:40 AM, Mikael Abrahamsson wrote:
>> But yes, it doesn't hurt to add it to a wiki page somewhere.
>
> I added this text to the wiki:
>
> "There are bugs in older versions of mdadm, and a lot of "stable" operating
> system releases ship with really old mdadm versions. Recent versions are 3.2.x
> and 3.3.x. It's advisable if you run into problems assembling your raid to
> upgrade to the latest git version of mdadm. If you can get the raid to
> successfully assemble and recover with the git version, then it's fine to use
> your old mdadm version again for normal system operation. Newer mdadm doesn't
> make any changes to the array that aren't backwards compatible."
That ought to help the dummies out...
As a follow-up and confirmation of the wiki text: after repairing my arrays
under mdadm 3.3.2, I was able to reboot and have the arrays assemble and run
with mdadm 2.6.4 without any issue (a good thing, since the problem was on the
/ partition...).
Now I just have to back-port mdadm-master to build and run on openSuSE 11.0
when the closest thing I can find is an 11.1 src.rpm. (should be close enough)
Thank you again Neil and Mikael for all the help.
--
David C. Rankin, J.D.,P.E.