Re: Crooked raid

linux-raid.vger.kernel.org archive mirror

* Re: Crooked raid
@ 2005-11-16 15:41 Andrew Burgess
  2005-11-16 16:39 ` Guillaume Filion
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Burgess @ 2005-11-16 15:41 UTC (permalink / raw)
  To: linux-raid

>> Look at:
>>   mdadm -E /dev/hdc
>> If it has a superblock, zero it with 'mdadm --zero-superblock /dev/hdc'
> > Same for hdg

>I did this, rebooted and the system wouldn't reboot. Yikes! I was 
>however able to boot with giving root=/dev/hdc2 to the kernel.

I didn't realize we were talking about your root filesystem else I would have
been more cautious (or maybe just kept my mouth shut). Glad you got it to boot!
Do you recall why it didn't want to boot? What are the kernel command line args?

Did you check for a bootup script somewhere screwing things up?
  egrep -i 'raid|mdadm' /etc/rc.d/* /etc/rc.d/init.d/* /etc/*

And did you already say that you don't have a mdadm.conf file?

You might need to pick which mirror (hdc2 or hdg2) you trust more as your root
filesystem (since they may be different now) and then start over and follow the
HowTo for making a normal root filesystem into a raided one.

But if you can't find out why the system is still looking at hdc then it might
all happen again...
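
For picking which mirror to trust, one input is the Events counter that
mdadm -E prints for each half (0.103 on hdc2 versus 0.104 on hdg2 in the output
quoted in this thread). A minimal sketch, assuming the 0.90 superblock
"Events : x.y" line format shown here; md_events is a made-up helper name, not
part of mdadm:

```shell
#!/bin/sh
# md_events: hypothetical helper; reads `mdadm -E` output on stdin and
# prints the value of the "Events :" line (for example 0.103).
md_events() {
    awk -F' : ' '/^[ \t]*Events :/ { print $2; exit }'
}

# Sample lines copied from the mdadm -E output in this thread.
hdc2_events=$(printf '        Checksum : ff92c98b - correct\n          Events : 0.103\n' | md_events)
hdg2_events=$(printf '        Checksum : ff92de3d - correct\n          Events : 0.104\n' | md_events)
echo "hdc2=$hdc2_events hdg2=$hdg2_events"
```

On a live system this would be fed with something like
"mdadm -E /dev/hdc2 | md_events". The counter only reflects md superblock
updates, so a partition that was mounted directly (as hdc2 was here) can hold
newer files despite the lower count.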

>Here's the relevant part of dmesg:
>device-mapper: 4.1.1-ioctl (2004-04-07) initialised: dm-devel@redhat.com
>md: can not import ide/host0/bus1/target0/lun0/part2, has active inodes!

I guess this is because you booted from it as hdc2, so it's busy and mdadm
won't try to use it?

Do you know why the system sometimes says ide/host0/bus1/target0/lun0/part2 and
sometimes says hd[cg]2? It's confusing...

>md0 : active raid1 ide/host2/bus1/target0/lun0/part2[0]
>       77508032 blocks [2/1] [U_]

>/dev/md0:
>    Raid Devices : 2
>   Total Devices : 2
>           State : active, degraded
>  Active Devices : 1
>Working Devices : 1
>  Failed Devices : 1
>        0      34        2        0      active sync   /dev/hdg2
>        1       0        0        1      faulty removed

>mdadm: No super block found on /dev/hdc (Expected magic a92b4efc, got 
>00000000)
>mdadm: No super block found on /dev/hdg (Expected magic a92b4efc, got 
>00000000)

Those two results are good

>/dev/hdc2:
..
>this     1      22        2        1      active sync   /dev/hdc2
>    0     0      34        2        0      active sync   /dev/hdg2
>    1     1      22        2        1      active sync   /dev/hdc2

>/dev/hdg2:
..
>this     0      34        2        0      active sync   /dev/hdg2
>    0     0      34        2        0      active sync   /dev/hdg2
>    1     1       0        0        1      faulty removed

I still don't see where hdc comes from. Sorry!

Any other raid superblocks around?
  for dev in /dev/hd? /dev/hd??; do mdadm -E $dev; done

And any raid autodetect partitions around?
  fdisk -l | grep raid


* Re: Crooked raid
  2005-11-16 15:41 Crooked raid Andrew Burgess
@ 2005-11-16 16:39 ` Guillaume Filion
  2005-11-16 23:55   ` Guillaume Filion
  0 siblings, 1 reply; 10+ messages in thread
From: Guillaume Filion @ 2005-11-16 16:39 UTC (permalink / raw)
  To: linux-raid

Andrew Burgess wrote:
>>I did this, rebooted and the system wouldn't reboot. Yikes! I was 
>>however able to boot with giving root=/dev/hdc2 to the kernel.
> 
> I didn't realize we were talking about your root filesystem else I would have
> been more cautious (or maybe just kept my mouth shut). Glad you got it to boot!

Hey, no problem there. I'd rather have someone try to help me and run 
into problems than get no help at all.

> Do you recall why it didn't want to boot? 

I don't remember the exact errors. If you need them, I can get them 
tonight when I get back home. From memory, I got the first error very 
early in the boot process (before it tried to load any md stuff). The 
error was something like:
Unable to boot from "0900"

Then I tried with an older kernel. This older kernel is bigger and I 
suspect that it has the raid stuff compiled in. With this kernel, md 
loaded and then I got this error:
Can't find Superblock on disk..

I was finally able to boot with the older kernel by specifying root=/dev/hdc2

It looks to me like it's still trying to mount md0 from hdc rather than 
hdc2.

> What are the kernel command line args?

I don't think there are any. append= is commented out in lilo.conf

> Did you check for a bootup script somewhere screwing things up?
>   egrep -i 'raid|mdadm' /etc/rc.d/* /etc/rc.d/init.d/* /etc/*

There's /etc/init.d/mdadm-raid that starts "/sbin/mdadm -A -s -a" if 
/etc/mdadm/mdadm.conf exists -- and it does, see below.

There's /etc/init.d/raid2 which would start raids from /etc/raidtab but 
  there's no raidtab and /etc/default/raid2 says to disable this.

Other than that I have this in /etc/modules.conf :
root@ali:~# egrep -i 'raid|mdadm' /etc/modules.conf
### update-modules: start processing /etc/modutils/raidtools2
alias md-personality-2 raid0
alias md-personality-3 raid1
alias md-personality-4 raid5
### update-modules: end processing /etc/modutils/raidtools2

> And did you already say that you don't have a mdadm.conf file?

I have a /etc/mdadm/mdadm.conf file:
DEVICE /dev/hdc2 /dev/hdg2
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=b013e39b:ec629293:98df4657:97255939

> You might need to pick which mirror (hdc2 or hdg2) you trust more as your root
> filesystem (since they may be different now) and then start over and follow the
> HowTo for making a normal root filesystem into a raided one.
> 
> But if you can't find out why the system is still looking at hdc then it might
> all happen again...
> 
> 
>>Here's the relevant part of dmesg:
>>device-mapper: 4.1.1-ioctl (2004-04-07) initialised: dm-devel@redhat.com
>>md: can not import ide/host0/bus1/target0/lun0/part2, has active inodes!
> 
> I guess this is because you booted from it as hdc2 so its busy and mdadm
> won't try to use it?

Yeah, that makes sense.

> Do you know why the system sometimes says ide/host0/bus1/target0/lun0/part2 and
> sometimes says hd[cg]2 ? Its confusing...

No, that's something that confuses me too. I installed devfsd sometime 
in the past but uninstalled it because I didn't need it.

> Any other raid superblocks around?
>   for dev in /dev/hd? /dev/hd??; do mdadm -E $dev; done

root@ali:~# for dev in /dev/hd? /dev/hd??; do mdadm -E $dev; done 2>&1 | 
egrep -v 'cannot|small'
mdadm: No super block found on /dev/hdc (Expected magic a92b4efc, got 
00000000)
mdadm: No super block found on /dev/hde (Expected magic a92b4efc, got 
69686766)
mdadm: No super block found on /dev/hdg (Expected magic a92b4efc, got 
00000000)
mdadm: No super block found on /dev/hdc1 (Expected magic a92b4efc, got 
ffffffff)
/dev/hdc2:
           Magic : a92b4efc
         Version : 00.90.00
            UUID : b013e39b:ec629293:98df4657:97255939
   Creation Time : Wed Dec 29 21:32:26 2004
      Raid Level : raid1
    Raid Devices : 2
   Total Devices : 3
Preferred Minor : 0

     Update Time : Tue Nov 15 16:01:25 2005
           State : clean
  Active Devices : 2
Working Devices : 2
  Failed Devices : 1
   Spare Devices : 0
        Checksum : ff92c98b - correct
          Events : 0.103


       Number   Major   Minor   RaidDevice State
this     1      22        2        1      active sync   /dev/hdc2

    0     0      34        2        0      active sync   /dev/hdg2
    1     1      22        2        1      active sync   /dev/hdc2
mdadm: No super block found on /dev/hde1 (Expected magic a92b4efc, got 
00000000)
mdadm: No super block found on /dev/hdg1 (Expected magic a92b4efc, got 
c8938b73)
/dev/hdg2:
           Magic : a92b4efc
         Version : 00.90.00
            UUID : b013e39b:ec629293:98df4657:97255939
   Creation Time : Wed Dec 29 21:32:26 2004
      Raid Level : raid1
    Raid Devices : 2
   Total Devices : 2
Preferred Minor : 0

     Update Time : Tue Nov 15 17:29:57 2005
           State : active
  Active Devices : 1
Working Devices : 1
  Failed Devices : 1
   Spare Devices : 0
        Checksum : ff92de3d - correct
          Events : 0.104


       Number   Major   Minor   RaidDevice State
this     0      34        2        0      active sync   /dev/hdg2

    0     0      34        2        0      active sync   /dev/hdg2
    1     1       0        0        1      faulty removed



> And any raid autodetect partitions around?
>   fdisk -l | grep raid

fdisk -l doesn't output anything.

One thing that might be a clue about the problem is the warnings that I 
get when I run lilo -v:
root@ali:~# lilo -v
LILO version 22.6.1, Copyright (C) 1992-1998 Werner Almesberger
Development beyond version 21 Copyright (C) 1999-2004 John Coffman
Released 17-Nov-2004, and compiled at 20:01:15 on Sep 29 2005
Debian GNU/Linux

Reading boot sector from /dev/hde
Warning: '/proc/partitions' does not match '/dev' directory structure.
     Name change: '/dev/ide/host2/bus1/target0/lun0/disc' -> '/dev/hdg'
     Name change: '/dev/ide/host2/bus1/target0/lun0/part1' -> '/dev/hdg1'
Warning: Kernel & BIOS return differing head/sector geometries for 
device 0x81
     Kernel: 23989 cylinders, 16 heads, 63 sectors
       BIOS: 1024 cylinders, 255 heads, 63 sectors
     Name change: '/dev/ide/host2/bus1/target0/lun0/part2' -> '/dev/hdg2'
     Name change: '/dev/ide/host2/bus0/target0/lun0/disc' -> '/dev/hde'
     Name change: '/dev/ide/host2/bus0/target0/lun0/part1' -> '/dev/hde1'
     Name change: '/dev/ide/host0/bus1/target0/lun0/disc' -> '/dev/hdc'
     Name change: '/dev/ide/host0/bus1/target0/lun0/part1' -> '/dev/hdc1'
     Name change: '/dev/ide/host0/bus1/target0/lun0/part2' -> '/dev/hdc2'
     Name change: '/dev/md/0' -> '/dev/md0'
/boot/boot.1600 exists - no master disk volume ID record backup copy made.
Backup copy of master disk volume ID record in /boot/boot.2200
...

In case you need to see it, /proc/partitions looks like this:
root@ali:~# cat /proc/partitions
major minor  #blocks  name     rio rmerge rsect ruse wio wmerge wsect 
wuse running use aveq

   34     0   78150744 ide/host2/bus1/target0/lun0/disc 38 240 688 190 3 
0 12 0 -166 22379007 12182291
   34     1     642568 ide/host2/bus1/target0/lun0/part1 9 12 168 60 0 0 
0 0 0 60 60
   34     2   77508144 ide/host2/bus1/target0/lun0/part2 14 127 288 80 1 
0 8 0 0 80 80
   33     0     251392 ide/host2/bus0/target0/lun0/disc 400 713 2226 760 
4330 3059 14778 146790 0 128510 147550
   33     1     251632 ide/host2/bus0/target0/lun0/part1 387 626 2026 
720 4328 3059 14774 146710 0 128390 147430
   22     0   78150744 ide/host0/bus1/target0/lun0/disc 183212 108424 
2332636 2782770 503378 1212603 13756060 27924377 -167 22375017 20476641
   22     1     642568 ide/host0/bus1/target0/lun0/part1 62 104 1328 400 
287 2195 21104 13580 0 2430 13980
   22     2   77508144 ide/host0/bus1/target0/lun0/part2 183135 108219 
2331076 2782170 503089 1210408 13734952 27910797 0 2151990 30704027
    9     0   77508032 md/0 0 0 0 0 0 0 0 0 0 0 0

Thanks,
GFK's
-- 
Guillaume Filion, ing. jr
Logidac Tech., Beaumont, Québec, Canada - http://logidac.com/
PGP Key and more: http://guillaume.filion.org/
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Crooked raid
  2005-11-16 16:39 ` Guillaume Filion
@ 2005-11-16 23:55   ` Guillaume Filion
  2005-11-21  3:50     ` Crooked raid [solved] Guillaume Filion
  0 siblings, 1 reply; 10+ messages in thread
From: Guillaume Filion @ 2005-11-16 23:55 UTC (permalink / raw)
  To: linux-raid

Guillaume Filion wrote:
> Andrew Burgess wrote:
 >> Do you recall why it didn't want to boot?

I tried booting again and here's the error that I'm getting:
EXT3-fs: unable to read superblock
cramfs: wrong magic

>> Do you know why the system sometimes says 
>> ide/host0/bus1/target0/lun0/part2 and
>> sometimes says hd[cg]2 ? Its confusing...
> 
> No, that's something that confuses me too. I installed devfsd sometime 
> in the past but deinstalled it because I didn't need it.

I decided to reinstall devfsd to correct that problem. I think it 
helped; at least I'm seeing ide/host0/... everywhere now.

One thing that's strange is that mdadm doesn't seem to recognise 
/dev/md0 (or /dev/md/0) but it's clearly mounted and working...

gfk@ali:~$ sudo mdadm -D /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
gfk@ali:~$ mount
/dev/md/0 on / type ext3 (rw,errors=remount-ro)
/dev/ide/host2/bus0/target0/lun0/part1 on /boot type ext2 
(rw,errors=remount-ro)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
gfk@ali:~$ cat /proc/mdstat
Personalities : [raid1]
read_ahead not set
md0 : inactive ide/host2/bus1/target0/lun0/part2[0]
       0 blocks
unused devices: <none>
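
The array state in /proc/mdstat can also be pulled out mechanically, which
helps when comparing boots. A rough sketch, assuming the "mdN : state ..."
line layout shown above; md_states is a hypothetical helper name:

```shell
#!/bin/sh
# md_states: hypothetical helper; reads /proc/mdstat text on stdin and prints
# "<array> <state>" for each md line, e.g. "md0 inactive".
md_states() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" { print $1, $3 }'
}

# Sample copied from the /proc/mdstat output above.
printf '%s\n' \
    'Personalities : [raid1]' \
    'read_ahead not set' \
    'md0 : inactive ide/host2/bus1/target0/lun0/part2[0]' \
    '       0 blocks' \
    'unused devices: <none>' | md_states
```

On a live system the input would be "md_states < /proc/mdstat".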

>> And any raid autodetect partitions around?
>>   fdisk -l | grep raid
> 
> fdisk -l doesn't output anything.

After reinstalling devfsd, fdisk -l is working:
gfk@ali:~$ sudo fdisk -l | fgrep raid
/dev/ide/host2/bus1/target0/lun0/part2   *        1276      155061 
77508144   fd  Linux raid autodetect
/dev/ide/host0/bus1/target0/lun0/part2   *        1276      155061 
77508144   fd  Linux raid autodetect

-- 
Guillaume Filion, ing. jr
Logidac Tech., Beaumont, Québec, Canada - http://logidac.com/
PGP Key and more: http://guillaume.filion.org/

* Re: Crooked raid [solved]
  2005-11-16 23:55   ` Guillaume Filion
@ 2005-11-21  3:50     ` Guillaume Filion
  0 siblings, 0 replies; 10+ messages in thread
From: Guillaume Filion @ 2005-11-21  3:50 UTC (permalink / raw)
  To: linux-raid

Hi all,

I finally solved my problem by booting with knoppix and recreating the 
raid from scratch.

From memory, the process looked roughly like this:
boot with knoppix as single user
mdadm --zero-superblock /dev/hdg2
mdadm --zero-superblock /dev/hdg
mdadm --zero-superblock /dev/hdc2
mdadm --zero-superblock /dev/hdc
mdadm --create /dev/md0 --level 1 --raid-devices=2 missing /dev/hdg2
reboot in single user mode with /dev/hdc2 as root
mdadm --assemble /dev/md0 /dev/hdg2
mount /dev/md0 /mnt/md0 [mkinitrd needs /dev/md0 to be mounted to 
create a initrd.img that will load the raid correctly]
mkinitrd -o /boot/initrd-raid.img
edit lilo.conf to add an entry that has /dev/md0 as root and uses 
initrd-raid.img
lilo -v
reboot and test this new lilo entry.
if it works, reboot with knoppix as single user
mount /dev/hdc2 /mnt/hdc2
mount /dev/md0 /mnt/md0
rsync -auHx --progress --exclude='/proc/*' /mnt/hdc2/ /mnt/md0/
[wait about 5 hours....]
edit /mnt/md0/etc/fstab to make /dev/md0 as root
reboot with the lilo entry that mounts /dev/md0 as root
mdadm /dev/md0 -a /dev/hdc2
[wait a while until the raid is in sync]
mkinitrd -o /boot/initrd.img
reboot, to be sure that it works.

After playing a lot with initrd and lilo, I think that the reason that 
md0 was trying to boot from hdc was that it was configured that way in 
initrd.img. I've since overwritten that file, but if I had to redo it, 
I would take a look at it:
mount /boot/initrd.img /mnt/initrd -o loop
cat /mnt/initrd/script
There should be a line in there that starts with:
mdadm -A /dev/md/0 ...
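
A small sketch of how that line could be located once the image is mounted,
assuming the script is plain text as described. find_assemble_lines is a
hypothetical helper, and the device list in the demo file is invented for
illustration (the real line was not preserved):

```shell
#!/bin/sh
# find_assemble_lines: hypothetical helper; prints every line of the given
# script that calls mdadm in assemble mode (-A or --assemble).
find_assemble_lines() {
    grep -E 'mdadm[[:space:]]+(-A|--assemble)' "$1"
}

# Demo on a throwaway file standing in for /mnt/initrd/script.
# The device names below are invented for the demo, not taken from the thread.
demo=$(mktemp)
printf '%s\n' 'modprobe raid1' 'mdadm -A /dev/md/0 /dev/hdc /dev/hdg' > "$demo"
find_assemble_lines "$demo"
rm -f "$demo"
```

With the image mounted, the call would be
"find_assemble_lines /mnt/initrd/script"; whole-disk names on that line would
support the theory above.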

Anyway, I certainly learned a lot from that experience! :-)

Cheers,
GFK's
-- 
Guillaume Filion, ing. jr
Logidac Tech., Beaumont, Québec, Canada - http://logidac.com/
PGP Key and more: http://guillaume.filion.org/

* Re: Crooked raid
@ 2005-11-16 16:07 Andrew Burgess
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Burgess @ 2005-11-16 16:07 UTC (permalink / raw)
  To: linux-raid

>I still don't see where hdc comes from. Sorry!

Maybe it's in the kernel boot args? Look in /boot/grub/grub.conf
if you use grub. Or /etc/lilo.conf or...



* Re: Crooked raid
@ 2005-11-15 18:44 Andrew Burgess
  2005-11-15 23:24 ` Guillaume Filion
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Burgess @ 2005-11-15 18:44 UTC (permalink / raw)
  To: gfk; +Cc: linux-raid

>Some time ago, I wanted to setup a software RAID-1 between hdc2 and 
>hdg2. However, not being familiar with mdadm and software raid, I made a 
>couple of bad commands. I don't remember the exact commands, but it's in 
>  the order of setting /dev/hdc, /dev/hdc2, /dev/hdg and /dev/hdg2 in 
>the same RAID. 

Oops

>Because of this, I was getting all sorts of errors in my 
>dmesg (see dmesg-broken below), but otherwise the RAID would work fine.

>After setting my /etc/mdadm/mdadm.conf to:
>DEVICE /dev/hdc2 /dev/hdg2
>ARRAY /dev/md0 level=raid1 num-devices=2 
>UUID=b013e39b:ec629293:98df4657:97255939

>I tried to
>mdadm /dev/md0 --remove /dev/hdc
>but without success:
>mdadm: hot remove failed for /dev/hdc: No such device or address

Look at:
  mdadm -E /dev/hdc
If it has a superblock, zero it with 'mdadm --zero-superblock /dev/hdc'

Same for hdg

Just as a failsafe test, before you zero, compare mdadm -E for hdc2 and hdc;
they should print different things. You want to ensure you don't zero hd[cg]2
accidentally.
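
The same caution can be wrapped in a small guard. This is only a sketch,
assuming IDE-style names (/dev/hdX for disks, /dev/hdXN for partitions);
is_whole_disk and safe_zero are hypothetical names, and safe_zero prints the
command instead of running it:

```shell
#!/bin/sh
# is_whole_disk: hypothetical guard; succeeds only for names like /dev/hdc,
# and fails for partitions such as /dev/hdc2.
is_whole_disk() {
    case "$1" in
        /dev/hd[a-z]) return 0 ;;
        *)            return 1 ;;
    esac
}

# safe_zero: prints the command it would run instead of running it, so the
# result can be reviewed first. Nothing is written to disk by this sketch.
safe_zero() {
    if is_whole_disk "$1"; then
        echo "mdadm --zero-superblock $1"
    else
        echo "refusing: $1 is not a whole disk" >&2
        return 1
    fi
}

safe_zero /dev/hdc
safe_zero /dev/hdc2 || true
```

The name check does not replace looking at the mdadm -E output first; it only
catches the typo of naming a partition.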



* Re: Crooked raid
  2005-11-15 18:44 Andrew Burgess
@ 2005-11-15 23:24 ` Guillaume Filion
  0 siblings, 0 replies; 10+ messages in thread
From: Guillaume Filion @ 2005-11-15 23:24 UTC (permalink / raw)
  To: linux-raid

Thanks for your help Andrew. I'm not sure if I did something wrong, but 
I'm having some problems...

On 05-11-15, at 13:44, Andrew Burgess wrote:
> Look at:
>   mdadm -E /dev/hdc
> If it has a superblock, zero it with 'mdadm --zero-superblock /dev/hdc'
 > Same for hdg

I did this, rebooted, and the system wouldn't boot. Yikes! I was, 
however, able to boot by giving root=/dev/hdc2 to the kernel.
Do you have any idea what's going on? Thanks.

BTW, I don't have a /etc/raidtab file.

Here's the relevant part of dmesg:
device-mapper: 4.1.1-ioctl (2004-04-07) initialised: dm-devel@redhat.com
md: can not import ide/host0/bus1/target0/lun0/part2, has active inodes!
md: md_import_device returned -16
  [events: 00000067]
md: bind<ide/host2/bus1/target0/lun0/part2,1>
md: ide/host2/bus1/target0/lun0/part2's event counter: 00000067
md0: former device ide/host0/bus1/target0/lun0/part2 is unavailable, 
removing from array!
md0: max total readahead window set to 124k
md0: 1 data-disks, max readahead per data-disk: 124k
raid1: device ide/host2/bus1/target0/lun0/part2 operational as mirror 0
raid1: md0, not all disks are operational -- trying to recover array
raid1: raid set md0 active with 1 out of 2 mirrors
md: updating md0 RAID superblock on device
md: ide/host2/bus1/target0/lun0/part2 [events: 00000068]<6>(write) 
ide/host2/bus1/target0/lun0/part2's sb offset: 77508032
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...


Here's the state of my system:
gfk@ali:~$ cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 ide/host2/bus1/target0/lun0/part2[0]
       77508032 blocks [2/1] [U_]

unused devices: <none>
gfk@ali:~$ sudo mdadm -D /dev/md0
Password:
/dev/md0:
         Version : 00.90.00
   Creation Time : Wed Dec 29 21:32:26 2004
      Raid Level : raid1
      Array Size : 77508032 (73.92 GiB 79.37 GB)
     Device Size : 77508032 (73.92 GiB 79.37 GB)
    Raid Devices : 2
   Total Devices : 2
Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Tue Nov 15 17:29:57 2005
           State : active, degraded
  Active Devices : 1
Working Devices : 1
  Failed Devices : 1
   Spare Devices : 0

            UUID : b013e39b:ec629293:98df4657:97255939
          Events : 0.104

     Number   Major   Minor   RaidDevice State
        0      34        2        0      active sync   /dev/hdg2
        1       0        0        1      faulty removed
gfk@ali:~$ sudo mdadm -E /dev/hdc
mdadm: No super block found on /dev/hdc (Expected magic a92b4efc, got 
00000000)
gfk@ali:~$ sudo mdadm -E /dev/hdg
mdadm: No super block found on /dev/hdg (Expected magic a92b4efc, got 
00000000)
gfk@ali:~$ sudo mdadm -E /dev/hdc2
/dev/hdc2:
           Magic : a92b4efc
         Version : 00.90.00
            UUID : b013e39b:ec629293:98df4657:97255939
   Creation Time : Wed Dec 29 21:32:26 2004
      Raid Level : raid1
    Raid Devices : 2
   Total Devices : 3
Preferred Minor : 0

     Update Time : Tue Nov 15 16:01:25 2005
           State : clean
  Active Devices : 2
Working Devices : 2
  Failed Devices : 1
   Spare Devices : 0
        Checksum : ff92c98b - correct
          Events : 0.103


       Number   Major   Minor   RaidDevice State
this     1      22        2        1      active sync   /dev/hdc2

    0     0      34        2        0      active sync   /dev/hdg2
    1     1      22        2        1      active sync   /dev/hdc2
gfk@ali:~$ sudo mdadm -E /dev/hdg2
/dev/hdg2:
           Magic : a92b4efc
         Version : 00.90.00
            UUID : b013e39b:ec629293:98df4657:97255939
   Creation Time : Wed Dec 29 21:32:26 2004
      Raid Level : raid1
    Raid Devices : 2
   Total Devices : 2
Preferred Minor : 0

     Update Time : Tue Nov 15 17:29:57 2005
           State : active
  Active Devices : 1
Working Devices : 1
  Failed Devices : 1
   Spare Devices : 0
        Checksum : ff92de3d - correct
          Events : 0.104


       Number   Major   Minor   RaidDevice State
this     0      34        2        0      active sync   /dev/hdg2

    0     0      34        2        0      active sync   /dev/hdg2
    1     1       0        0        1      faulty removed

-- 
Guillaume Filion, ing. jr
Logidac Tech., Beaumont, Québec, Canada - http://logidac.com/
PGP Key and more: http://guillaume.filion.org/


* Crooked raid
@ 2005-11-15 15:20 Guillaume Filion
  2005-11-15 23:07 ` Neil Brown
  0 siblings, 1 reply; 10+ messages in thread
From: Guillaume Filion @ 2005-11-15 15:20 UTC (permalink / raw)
  To: linux-raid

Hi all,

Some time ago, I wanted to set up a software RAID-1 between hdc2 and 
hdg2. However, not being familiar with mdadm and software RAID, I ran a 
couple of bad commands. I don't remember the exact commands, but it was 
something along the lines of putting /dev/hdc, /dev/hdc2, /dev/hdg and 
/dev/hdg2 in the same RAID. Because of this, I was getting all sorts of 
errors in my dmesg (see dmesg-broken below), but otherwise the RAID would 
work fine.

After setting my /etc/mdadm/mdadm.conf to:
DEVICE /dev/hdc2 /dev/hdg2
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=b013e39b:ec629293:98df4657:97255939

I don't get these errors in my dmesg anymore (see dmesg-now), but I can 
still see inconsistent data when looking at md0: see the Total Devices 
and Failed Devices lines in the mdadm output below.

I tried to
mdadm /dev/md0 --remove /dev/hdc
but without success:
mdadm: hot remove failed for /dev/hdc: No such device or address

I'm wondering what would be the easiest way to correct this. If 
possible, I'd prefer not having to start from scratch.

I'm running kernel 2.4.27-2-k7 on debian testing and using mdadm v1.12.0.

Thanks in advance,
GFK's

========= dmesg-broken ===========
device-mapper: 4.1.1-ioctl (2004-04-07) initialised: dm-devel@redhat.com
VFS: Disk change detected on device 21:00
  /dev/ide/host2/bus0/target0/lun0: p1
VFS: Disk change detected on device 21:00
  /dev/ide/host2/bus0/target0/lun0: p1
  /dev/ide/host2/bus0/target0/lun0: p1
  /dev/ide/host2/bus0/target0/lun0: p1
VFS: Disk change detected on device 21:00
  /dev/ide/host2/bus0/target0/lun0: p1
VFS: Disk change detected on device 21:00
  /dev/ide/host2/bus0/target0/lun0: p1
  [events: 0000005d]
md: bind<ide/host0/bus1/target0/lun0/part2,1>
  [events: 00000027]
md0: WARNING: ide/host0/bus1/target0/lun0/disc appears to be on the same 
physical disk as ide/host0/bus1/target0/lun0/part2. True
      protection against single-disk failure might be compromised.
md: bind<ide/host0/bus1/target0/lun0/disc,2>
  [events: 0000002a]
md: bind<ide/host2/bus1/target0/lun0/disc,3>
  [events: 0000005d]
md0: WARNING: ide/host2/bus1/target0/lun0/part2 appears to be on the 
same physical disk as ide/host2/bus1/target0/lun0/disc. True
      protection against single-disk failure might be compromised.
md: bind<ide/host2/bus1/target0/lun0/part2,4>
md: ide/host2/bus1/target0/lun0/part2's event counter: 0000005d
md: ide/host2/bus1/target0/lun0/disc's event counter: 0000002a
md: ide/host0/bus1/target0/lun0/disc's event counter: 00000027
md: ide/host0/bus1/target0/lun0/part2's event counter: 0000005d
md: superblock update time inconsistency -- using the most recent one
md: freshest: ide/host2/bus1/target0/lun0/part2
md: kicking non-fresh ide/host2/bus1/target0/lun0/disc from array!
md: unbind<ide/host2/bus1/target0/lun0/disc,3>
md: export_rdev(ide/host2/bus1/target0/lun0/disc)
md: kicking non-fresh ide/host0/bus1/target0/lun0/disc from array!
md: unbind<ide/host0/bus1/target0/lun0/disc,2>
md: export_rdev(ide/host0/bus1/target0/lun0/disc)
md0: max total readahead window set to 124k
raid1: device ide/host2/bus1/target0/lun0/part2 operational as mirror 0
raid1: device ide/host0/bus1/target0/lun0/part2 operational as mirror 1
raid1: raid set md0 active with 2 out of 2 mirrors
md: updating md0 RAID superblock on device
md: ide/host2/bus1/target0/lun0/part2 [events: 0000005e]<6>(write) 
ide/host2/bus1/target0/lun0/part2's sb offset: 77508032
md: ide/host0/bus1/target0/lun0/part2 [events: 0000005e]<6>(write) 
ide/host0/bus1/target0/lun0/part2's sb offset: 77508032
========== dmesg-now =============
Partition check:
  /dev/ide/host0/bus1/target0/lun0: p1 p2
  /dev/ide/host2/bus0/target0/lun0: p1
  /dev/ide/host2/bus1/target0/lun0: p1 p2
  [events: 00000065]
md: bind<ide/host0/bus1/target0/lun0/part2,1>
  [events: 00000065]
md: bind<ide/host2/bus1/target0/lun0/part2,2>
md: ide/host2/bus1/target0/lun0/part2's event counter: 00000065
md: ide/host0/bus1/target0/lun0/part2's event counter: 00000065
md0: max total readahead window set to 124k
md0: 1 data-disks, max readahead per data-disk: 124k
raid1: device ide/host2/bus1/target0/lun0/part2 operational as mirror 0
raid1: device ide/host0/bus1/target0/lun0/part2 operational as mirror 1
raid1: raid set md0 active with 2 out of 2 mirrors
md: updating md0 RAID superblock on device
md: ide/host2/bus1/target0/lun0/part2 [events: 00000066]<6>(write) 
ide/host2/bus1/target0/lun0/part2's sb offset: 77508032
md: ide/host0/bus1/target0/lun0/part2 [events: 00000066]<6>(write) 
ide/host0/bus1/target0/lun0/part2's sb offset: 77508032
========== mdadm ==============
gfk@ali:~$ sudo mdadm --detail /dev/md0
/dev/md0:
         Version : 00.90.00
   Creation Time : Wed Dec 29 21:32:26 2004
      Raid Level : raid1
      Array Size : 77508032 (73.92 GiB 79.37 GB)
     Device Size : 77508032 (73.92 GiB 79.37 GB)
    Raid Devices : 2
   Total Devices : 3
Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Mon Nov 14 19:17:30 2005
           State : active
  Active Devices : 2
Working Devices : 2
  Failed Devices : 1
   Spare Devices : 0

            UUID : b013e39b:ec629293:98df4657:97255939
          Events : 0.102

     Number   Major   Minor   RaidDevice State
        0      34        2        0      active sync   /dev/hdg2
        1      22        2        1      active sync   /dev/hdc2
gfk@ali:~$
=============================

-- 
Guillaume Filion, ing. jr
Logidac Tech., Beaumont, Québec, Canada - http://logidac.com/
PGP Key and more: http://guillaume.filion.org/

* Re: Crooked raid
  2005-11-15 15:20 Guillaume Filion
@ 2005-11-15 23:07 ` Neil Brown
  2005-11-16 13:47   ` Guillaume Filion
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Brown @ 2005-11-15 23:07 UTC (permalink / raw)
  To: Guillaume Filion; +Cc: linux-raid

On Tuesday November 15, gfk@logidac.com wrote:
> 
> I'm wondering what would be the easiest way to correct this. If 
> possible, I'd prefer not having to start from scratch.
> 

It's not entirely clear to me what is happening. In particular, why
md is trying to bind '..../disc,*' to an array.  Maybe something is
running 'raidstart'.  Do you have an '/etc/raidtab'? If so, remove
it.

If md0 doesn't hold your root filesystem, try unmounting it, stopping
it with
  mdadm --stop /dev/md0
and restarting with
  mdadm --assemble /dev/md0 --update=summaries /dev/hd[gc]2
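
Whether md0 holds the root filesystem can be checked from the mount table
before trying the stop. A minimal sketch, assuming the usual
"device on mountpoint type ..." format of mount output; root_on_md is a
hypothetical helper name:

```shell
#!/bin/sh
# root_on_md: hypothetical helper; reads `mount` output on stdin and succeeds
# if the filesystem mounted on / comes from an md device (/dev/md0, /dev/md/0).
root_on_md() {
    root_dev=$(awk '$2 == "on" && $3 == "/" { print $1; exit }')
    case "$root_dev" in
        /dev/md*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Sample line in the format `mount` prints on this kind of system.
if printf '/dev/md/0 on / type ext3 (rw,errors=remount-ro)\n' | root_on_md; then
    echo "root is on md: do not stop the array from the running system"
else
    echo "root is not on md: mdadm --stop is possible after unmounting"
fi
```

On a live system the check would be "mount | root_on_md". If root is on the
array, the stop and reassemble steps have to be done from a rescue
environment instead.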

NeilBrown


* Re: Crooked raid
  2005-11-15 23:07 ` Neil Brown
@ 2005-11-16 13:47   ` Guillaume Filion
  0 siblings, 0 replies; 10+ messages in thread
From: Guillaume Filion @ 2005-11-16 13:47 UTC (permalink / raw)
  To: linux-raid

Neil Brown wrote:
> It's not entirely clear to me what is happening. In particular, why
> md is trying to bind '..../disc,*' to an array.  

It's not clear to me either! :-)

> Maybe something is
> running 'raidstart'.  Do you have an '/etc/raidtab'?? If so, remove
> it.

No I don't have a /etc/raidtab

> If md0 doesn't hold your root filesystem, try unmounting it, stopping
> it with
>   mdadm --stop /dev/md0
> and restarting with
>   mdadm --assemble /dev/md0 --update=summaries /dev/hd[gc]2

Unfortunately, it does hold my root filesystem. Maybe I could boot from 
a rescue floppy/CD with mdadm?

Thanks,
GFK's
-- 
Guillaume Filion, ing. jr
Logidac Tech., Beaumont, Québec, Canada - http://logidac.com/
PGP Key and more: http://guillaume.filion.org/

end of thread, other threads:[~2005-11-21  3:50 UTC | newest]

Thread overview: 10+ messages
2005-11-16 15:41 Crooked raid Andrew Burgess
2005-11-16 16:39 ` Guillaume Filion
2005-11-16 23:55   ` Guillaume Filion
2005-11-21  3:50     ` Crooked raid [solved] Guillaume Filion
  -- strict thread matches above, loose matches on Subject: below --
2005-11-16 16:07 Crooked raid Andrew Burgess
2005-11-15 18:44 Andrew Burgess
2005-11-15 23:24 ` Guillaume Filion
2005-11-15 15:20 Guillaume Filion
2005-11-15 23:07 ` Neil Brown
2005-11-16 13:47   ` Guillaume Filion
