linux-raid.vger.kernel.org archive mirror
* Homehost suddenly changed on some components
@ 2007-07-27 18:49 Max Amanshauser
  2007-07-30 19:01 ` Max Amanshauser
  0 siblings, 1 reply; 3+ messages in thread
From: Max Amanshauser @ 2007-07-27 18:49 UTC (permalink / raw)
  To: linux-raid

Hello!

After growing my RAID-5 from 8 to 10 devices, everything worked
smoothly for a day. Then I rebooted into another kernel, still with
the old mdadm.conf. The array did not assemble correctly, so I fixed
mdadm.conf by correcting the number of devices. After rebooting once
again, four components had a different UUID: the homehost part
differs. I am not aware that the machine's hostname has ever changed.

Six devices look like this: http://phpfi.com/252263
Four devices look like this: http://phpfi.com/252264

Since then assembling fails. Only six devices are added.
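
For anyone comparing components in a similar situation, here is a minimal
sketch that lists the UUID stored in each superblock so the odd ones stand
out. It assumes the "UUID : ..." line that mdadm --examine prints for
0.90-format superblocks; the helper names uuid_of and list_uuids are made up
for illustration.

```shell
# Read `mdadm --examine` output on stdin and print the array UUID.
# Assumes the 0.90-style "UUID : <value>" line; other metadata formats
# label the field differently and would need a different match.
uuid_of() {
    awk '$1 == "UUID" && $2 == ":" { print $3; exit }'
}

# Print "<device> <uuid>" for every device given on the command line.
# Needs root and real component devices to produce anything useful.
list_uuids() {
    for dev in "$@"; do
        printf '%s %s\n' "$dev" "$(mdadm --examine "$dev" | uuid_of)"
    done
}
```

Sorting the result on the second column (list_uuids /dev/sdb1 /dev/hde1 | sort -k2)
groups the components by UUID.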

It's worth mentioning that the four rogue devices are all on the same
controller, which has worked flawlessly for two years.
lspci output: http://phpfi.com/252755


I am most concerned with repairing the UUID so the array assembles  
again. I tried several tricks with mdadm's --assemble --update flag,  
but to no avail. Everything I tried aborted with:

mdadm: superblock on /dev/hde1 doesn't match others - assembly aborted


I tried both mdadm v2.6.2 and v2.5.6. The system is an Ubuntu Feisty
Server; the grow was done with a custom 2.6.22 kernel, while the stock
kernel is 2.6.20.

Current mdadm.conf: http://phpfi.com/252752
The old one differs only in the number of devices for md0.

-- 
Regards,
Max Amanshauser.




* Re: Homehost suddenly changed on some components
  2007-07-27 18:49 Homehost suddenly changed on some components Max Amanshauser
@ 2007-07-30 19:01 ` Max Amanshauser
  2007-08-01 12:29   ` Bill Davidsen
  0 siblings, 1 reply; 3+ messages in thread
From: Max Amanshauser @ 2007-07-30 19:01 UTC (permalink / raw)
  To: Max Amanshauser; +Cc: linux-raid

For the record:

After reading in the archives about similar problems, which were  
probably caused by something else but still close enough, I recreated  
the array with the exact same parameters from the superblock and one  
missing disk.

 > mdadm -C /dev/md0 -l 5 -n 10 -c 64 -p ls /dev/sdb1 /dev/sdd1 /dev/sde1 \
     /dev/hde1 /dev/hdb1 /dev/hdf1 /dev/hdh1 /dev/hdg1 /dev/sdc1 missing

Seems to have done the trick, fsck is working right now.
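
Recreating this way only preserves the data if level, device count, chunk
size, layout and device order all match the old superblock. As a rough
sketch, the first four can be pulled out of mdadm --examine output before
running the create command. The field names below are the ones printed for
0.90-format superblocks, which is an assumption about the output format;
create_params is a hypothetical helper name.

```shell
# Read `mdadm --examine` output on stdin and print the parameters that
# correspond to the -l, -n, -c and -p options of `mdadm --create`.
# Assumes "Name : value" lines as printed for 0.90 superblocks.
create_params() {
    awk -F ' : ' '
        {
            key = $1
            sub(/^[ \t]+/, "", key)      # strip the leading indentation
        }
        key == "Raid Level"   { level  = $2 }
        key == "Raid Devices" { n      = $2 }
        key == "Chunk Size"   { chunk  = $2 }
        key == "Layout"       { layout = $2 }
        END {
            printf "level=%s devices=%s chunk=%s layout=%s\n", level, n, chunk, layout
        }
    '
}
```

For the array above this would be expected to report raid5, 10 devices, 64K
and left-symmetric, matching -l 5 -n 10 -c 64 -p ls. The device order still
has to be taken from the slot numbers in the same report. Leaving one slot
as "missing" keeps the array degraded, so no resync starts before the
result has been checked.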


Funny things seem to happen to the superblocks more often than I  
thought. Recreating with one missing disk appears more like a hack  
than a solution to me. Maybe mdadm should have some kind of explicit  
superblock manipulation, like copying from other components or  
importing/exporting from/to a file, so such problems can be solved in  
a safe way?

Just a quick thought. :)


--
Regards,
Max Amanshauser



* Re: Homehost suddenly changed on some components
  2007-07-30 19:01 ` Max Amanshauser
@ 2007-08-01 12:29   ` Bill Davidsen
  0 siblings, 0 replies; 3+ messages in thread
From: Bill Davidsen @ 2007-08-01 12:29 UTC (permalink / raw)
  To: Max Amanshauser; +Cc: linux-raid

Max Amanshauser wrote:
> For the record:
>
> After reading in the archives about similar problems, which were 
> probably caused by something else but still close enough, I recreated 
> the array with the exact same parameters from the superblock and one 
> missing disk.
>
> > mdadm -C /dev/md0 -l 5 -n 10 -c 64 -p ls /dev/sdb1 /dev/sdd1 
> /dev/sde1 /dev/hde1 /dev/hdb1 /dev/hdf1 /dev/hdh1 /dev/hdg1 /dev/sdc1 
> missing
>
> Seems to have done the trick, fsck is working right now.
>
>
> Funny things seem to happen to the superblocks more often than I 
> thought. Recreating with one missing disk appears more like a hack 
> than a solution to me. Maybe mdadm should have some kind of explicit 
> superblock manipulation, like copying from other components or 
> importing/exporting from/to a file, so such problems can be solved in 
> a safe way?

I have a feeling that Neil added a UUID reset capability someplace, 
maybe grow, because I was having a similar problem. Data would be on 
another machine I can't check from here, but that's my recollection. 
Could be wrong, obviously.
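
For reference, the mdadm man page documents a "uuid" value for the --update
option in assemble mode, optionally combined with --uuid to choose the new
value. Whether the versions mentioned in this thread support it, and
whether it would have gotten past the "doesn't match others" check, is not
verified here. The sketch below only prints the command instead of running
it; reset_uuid_cmd is a made-up helper, and the UUID and device names are
placeholders.

```shell
# Dry run: print, but do not execute, an assemble command that rewrites
# the array UUID on the listed components.
# $1 = md device, $2 = new UUID, remaining arguments = component devices.
reset_uuid_cmd() {
    md=$1
    uuid=$2
    shift 2
    echo mdadm --assemble "$md" --update=uuid --uuid="$uuid" "$@"
}
```

Printing the command first makes it easy to double-check the device list
before anything touches the superblocks.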

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979

