* Corrupted ext4 filesystem after mdadm manipulation error
From: L.M.J @ 2014-04-24  5:05 UTC
To: linux-raid

Hi,

For the third time I had to replace a failed drive in my home Linux RAID5 box. The previous
replacement went fine, but this time I don't know what I did wrong and I broke my RAID5; at
least, it wouldn't start. /dev/sdb was the failed drive; /dev/sdc and /dev/sdd are OK.

After replacing sdb and creating a new partition on it, I tried to bring the RAID back with
this command:

~# mdadm -Cv /dev/md0 --assume-clean --level=5 --raid-devices=3 /dev/sdc1 /dev/sdd1 /dev/sdb1
-> '-C' was not a good idea here

I guess I made another mistake there; I should have done this instead:

~# mdadm -Av /dev/md0 --assume-clean --level=5 --raid-devices=3 /dev/sdc1 /dev/sdd1 missing

Maybe this wiped out my data... Going further: pvdisplay, pvscan and vgdisplay all return
empty output.

Google helped me, and I did this:

~# dd if=/dev/md0 bs=512 count=255 skip=1 of=/tmp/md0.txt
[..]
physical_volumes {
        pv0 {
                id = "5DZit9-6o5V-a1vu-1D1q-fnc0-syEj-kVwAnW"
                device = "/dev/md0"
                status = ["ALLOCATABLE"]
                flags = []
                dev_size = 7814047360
                pe_start = 384
                pe_count = 953863
        }
}
logical_volumes {

        lvdata {
                id = "JiwAjc-qkvI-58Ru-RO8n-r63Z-ll3E-SJazO7"
                status = ["READ", "WRITE", "VISIBLE"]
                flags = []
                segment_count = 1
[..]

Since I can still see LVM metadata, I guess I haven't lost everything yet...

I tried a last-chance command:

~# pvcreate --uuid "5DZit9-6o5V-a1vu-1D1q-fnc0-syEj-kVwAnW" --restorefile /etc/lvm/archive/lvm-raid_00302.vg /dev/md0

Then:

~# vgcfgrestore lvm-raid

~# lvs -a -o +devices
  LV     VG       Attr   LSize   Origin Snap%  Move Log Copy%  Convert Devices
  lvdata lvm-raid -wi-a- 450,00g                                       /dev/md0(148480)
  lvmp   lvm-raid -wi-a-  80,00g                                       /dev/md0(263680)

Then:

~# lvchange -ay /dev/lvm-raid/lv*

I was quite happy up to this point. The problem appears when I try to mount those two LVs
(lvdata & lvmp) as ext4 partitions:

~# mount /home/foo/RAID_mp/

~# mount | grep -i mp
/dev/mapper/lvm--raid-lvmp on /home/foo/RAID_mp type ext4 (rw)

~# df -h /home/foo/RAID_mp
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/lvm--raid-lvmp   79G   61G   19G  77% /home/foo/RAID_mp

Here is the big problem:

~# ls -la /home/foo/RAID_mp
total 0

I took an LVM R/W snapshot of the /dev/mapper/lvm--raid-lvmp LV and ran fsck on it. It
recovered only 50% of the files, all of them placed in the lost+found/ directory with names
starting with #xxxxx.

Is there any last chance to recover my data?

Thanks
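Before any --create is attempted on an array that refuses to assemble, it is worth capturing
what the surviving superblocks still say. A minimal sketch of the read-only checks that would
have helped here, with the member names taken from the message above; none of these commands
write to the disks:

~# mdadm --examine /dev/sdb1 /dev/sdc1 /dev/sdd1 > /root/md0-examine.txt
~# cat /proc/mdstat                      # what the kernel currently sees
~# blkid /dev/sd[bcd]1 /dev/md0          # signature probe only, nothing is modified

The --examine output records each member's role, event count and data offset, which is
exactly the information needed later in this thread to put the drives back in the right
order.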
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: L.M.J @ 2014-04-24 17:48 UTC
To: linux-raid

Up please :-(

On Thu, 24 Apr 2014 07:05:48 +0200, "L.M.J" <linuxmasterjedi@free.fr> wrote:

> [original message quoted in full, trimmed here]
[parent not found: <CAK_KU4a+Ep7=F=NSbb-hqN6Rvayx4QPWm-M2403OHn5-LVaNZw@mail.gmail.com>]
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: L.M.J @ 2014-04-24 18:35 UTC
To: Scott D'Vileskis; +Cc: linux-raid

Hello Scott,

Do you think my data is 100% lost for sure? fsck recovered 50% of the files; don't you think
there is still something to save?

Thanks

On Thu, 24 Apr 2014 14:13:05 -0400, "Scott D'Vileskis" <sdvileskis@gmail.com> wrote:

> NEVER USE "CREATE" ON FILESYSTEMS OR RAID ARRAYS UNLESS YOU KNOW WHAT YOU ARE DOING!
> CREATE destroys things in the creation process, especially with the --force option.
>
> The create argument is only meant to create a new array: it will start with two drives as
> 'good' drives and the last will likely be the degraded drive, so it will start resyncing
> and blowing away data on the last drive. If you used the --assume-clean argument, and it
> DID NOT resync the drives, you might be able to recreate the array with the two good disks,
> provided you know the original order.
>
> If you used the --create option, and didn't have your disks in the same order they were
> originally in, you probably lost your data.
>
> Since you replaced a disk, with no data (or worse, with bad data), you should have
> assembled the array in degraded mode WITHOUT the --assume-clean argument.
>
> If C & D contain your data, and B used to:
> mdadm --assemble /dev/md0 missing /dev/sdc1 /dev/sdd1
> You might have to --force the assembly. If it works, and it runs in degraded mode, mount
> your filesystem and take a backup.
>
> Next, add your replacement drive back in:
> mdadm --add /dev/md0 /dev/sdb1
> (Note: if sdb1 has some superblock data, you might have to --zero-superblock first)
>
> Good luck.
>
> On Thu, Apr 24, 2014 at 1:48 PM, L.M.J <linuxmasterjedi@free.fr> wrote:
> > [earlier messages quoted in full, trimmed here]
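A consolidated sketch of the sequence Scott outlines, assuming /dev/sdc1 and /dev/sdd1 still
hold the original data, that the LVM volume group activates cleanly, and that /mnt and
/backup are hypothetical paths with enough space:

~# mdadm --stop /dev/md0
~# mdadm --assemble --force /dev/md0 /dev/sdc1 /dev/sdd1   # degraded, two members only
~# vgchange -ay lvm-raid
~# mount -o ro /dev/lvm-raid/lvmp /mnt                     # read-only until backed up
~# rsync -a /mnt/ /backup/lvmp/
~# umount /mnt
~# # Only after a verified backup: wipe the stale superblock on the new disk and add it.
~# mdadm --zero-superblock /dev/sdb1
~# mdadm --add /dev/md0 /dev/sdb1

Note that 'missing' is a --create placeholder; with --assemble one simply lists the members
that are actually present, and md runs the array degraded.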
[parent not found: <CAK_KU4Zh-azXEEzW4f1m=boCZDKevqaSHxW0XoAgRdrCbm2PkA@mail.gmail.com>]
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: L.M.J @ 2014-04-24 19:53 UTC
Cc: Scott D'Vileskis, linux-raid

On Thu, 24 Apr 2014 15:39:11 -0400, "Scott D'Vileskis" <sdvileskis@gmail.com> wrote:

> Your data is split 3 ways: 50% on one disk, 50% on another disk, and one disk's worth of
> parity.
>
> Now, it's not that simple, because the data is not continuous. It is written across the
> three drives in chunks, with the parity alternating between the three drives.
>
> If you were able to recover 50%, it probably means one disk contains valid data.
>
> Were you able to recover anything larger than your chunk size? Are larger files (MP3s
> and/or movies) actually playable? Likely not.

I ran fsck on a snapshot of the LVM partition. It recovered about 50% of the files, all of
them located in /lost+found/. Here are the sizes:

5,5M 2013-04-24 17:53 #4456582
5,7M 2013-04-24 17:53 #4456589
 16M 2013-04-24 17:53 #4456590
 25M 2013-04-24 17:53 #4456594
 17M 2013-04-24 17:53 #4456578
 18M 2013-04-24 17:53 #4456580
1,3M 2013-04-24 17:54 #4456597
1,1M 2013-04-24 17:54 #4456596
 17M 2013-04-24 17:54 #4456595
2,1M 2013-04-24 17:54 #4456599
932K 2013-04-24 17:54 #4456598

> You might get lucky trying to assemble the array in degraded mode with the 2 good disks,
> as long as the array didn't resync your new disk + good disk to the other good disk...

I already tried that: re-assembling the array with the good disks and then adding the new
one. It didn't work as expected.

> If added properly, it would have resynced the two good disks with the blank disk. Try
> doing a 'hd /dev/sdb1' to see if there is data on the new disk

~# hd /dev/sdb1
00000000 37 53 2f 78 4b 00 13 6f 41 43 55 5b 45 14 08 16 |7S/xK..oACU[E...|
00000010 01 03 7e 2a 11 63 13 6f 6b 01 64 6b 03 07 1a 06 |..~*.c.ok.dk....|
00000020 04 56 44 00 46 2a 32 6e 02 4d 56 12 6d 54 6d 66 |.VD.F*2n.MV.mTmf|
00000030 4b 06 18 00 41 49 28 27 4c 38 30 6b 27 2d 1f 25 |K...AI('L80k'-.%|
00000040 07 59 22 0c 19 5e 4c 39 25 2f 27 59 2f 7c 79 10 |.Y"..^L9%/'Y/|y.|
00000050 31 7a 4b 6e 53 49 41 56 13 39 15 4b 58 29 0f 15 |1zKnSIAV.9.KX)..|
00000060 0b 18 09 0f 6b 68 48 0e 7f 03 24 17 66 01 45 12 |....khH...$.f.E.|
00000070 31 1b 7e 1d 14 3c 10 0f 19 70 2d 05 10 2e 51 2a |1.~..<...p-...Q*|
00000080 4e 54 3a 29 7f 00 45 5a 4d 3e 4c 26 1a 22 2b 57 |NT:)..EZM>L&."+W|
00000090 33 7e 46 51 41 56 79 2a 4e 45 3c 30 6f 1d 11 56 |3~FQAVy*NE<0o..V|
000000a0 4d 1e 64 07 2b 02 1d 01 31 11 58 49 45 5f 7e 2a |M.d.+...1.XIE_~*|
000000b0 4e 45 57 67 00 16 00 54 4e 0f 55 10 1b 14 1c 00 |NEWg...TN.U.....|
000000c0 7f 58 58 45 54 5b 46 10 0d 2a 3a 7e 1c 08 11 45 |.XXET[F..*:~...E|
000000d0 53 54 7d 10 01 14 1e 07 48 52 54 10 3f 55 58 45 |ST}.....HRT.?UXE|
000000e0 64 61 2b 0a 19 1f 45 1d 1d 02 4b 7e 1d 1b 19 02 |da+...E...K~....|
000000f0 0d 4c 2a 4e 54 50 05 06 01 3e 17 0e 57 64 17 4f |.L*NTP...>..Wd.O|
00000100 4a 7f 42 7d 4c 52 09 49 53 45 43 1e 7c 6e 12 00 |J.B}LR.ISEC.|n..|
00000110 13 36 03 0b 12 50 4e 48 34 7e 7d 3a 45 12 28 51 |.6...PNH4~}:E.(Q|
00000120 2a 48 3e 3a 42 58 51 7a 2e 62 12 7e 4e 32 2a 17 |*H>:BXQz.b.~N2*.|
[...]

PS: Why does 'reply' on this list answer the previous sender instead of the mailing-list
address?

> > [remainder of the earlier exchange quoted in full, trimmed here]
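A quick, read-only way to answer the "is there anything recognisable on this disk?"
question, assuming the device names used above; none of these commands write anything:

~# blkid -p /dev/sdb1         # low-level probe for known signatures (md, LVM, ext4, ...)
~# file -s /dev/sdb1          # same idea with a different tool
~# mdadm --examine /dev/sdb1  # is there an md superblock, and with which role and events?

A freshly partitioned replacement disk should show little more than the md superblock
written when it was added; the random-looking hd output above is consistent with RAID5
data/parity chunks that no longer line up with the other members, rather than with an
intact filesystem.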
[parent not found: <CAK_KU4aDDaUSGgcGBwCeO+yE0Qa_pUmMdAHMu7pqO7dqEEC71g@mail.gmail.com>]
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: L.M.J @ 2014-04-24 19:56 UTC
To: linux-raid; +Cc: Scott D'Vileskis

On Thu, 24 Apr 2014 15:43:33 -0400, "Scott D'Vileskis" <sdvileskis@gmail.com> wrote:

> Note, if you dare to --create the array again with your two previous disks, you'll want to
> create the array with the 'missing' disk in the right place.
>
> mdadm --create /dev/md0 missing /dev/sdc1 /dev/sdd1
> (Assuming your original array was sdb, sdc, sdd)
>
> You'll probably need a force and maybe a start-degraded.
>
> Then, I would try the recovery on the resulting drive.

I think I have messed up my drives enough already; it might be dangerous to recreate the
array again and again :-(
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: Scott D'Vileskis @ 2014-04-24 20:31 UTC
To: L.M.J; +Cc: linux-raid@vger.kernel.org

I have been replying directly to you, not to the mailing list, since your case seems to be a
case of user-screwed-up-his-own-data, and not a problem with mdadm/Linux RAID, nor a problem
whose answer will necessarily help someone else (it is not likely someone will create a mess
in exactly the same manner you have). Also, it is easier to click reply than reply-all and
have to worry about the top-posting police getting on my case.

To summarize:
1) You lost a disk. Even down a disk, you should have been able to run/start the array (in
degraded mode) with only 2 disks, mounted the filesystem, backed up data, etc.
2) You then should have simply partitioned and --add'ed the new disk. mdadm would have
written a superblock to the new disk and resynced the data.

Would have, could have, should have... Hindsight is 20/20. A mistake was made; it happens to
all of us at some point or another. (I've lost arrays and filesystems with careless use of
'dd' once upon a time. Once, I was giving a RAID demo to a friend with loop devices, mistyped
something, and blew something else away.)

Unfortunately, you might have clobbered your drives by recreating the array. I assume your
original disks were in the order sdb, sdc, sdd. If so, you certainly clobbered your
superblocks and changed the order when you did this:

> ~# mdadm -Cv /dev/md0 --assume-clean --level=5 --raid-devices=3 /dev/sdc1 /dev/sdd1 /dev/sdb1

You changed the order, but because of the --assume-clean, it shouldn't have started a resync
of the data. Your filesystem probably had a fit though, since your data was effectively put
through a 3-piece strip-type paper shredder. You should be able to reorder things, though.

IMPORTANT: At any point did your drives do a resync?

Assuming no, and assuming you haven't done any other writing to your disks (besides
rewriting the superblocks), you can probably correct the order of your drives by reissuing
the --create command with the two original drives, in the proper order, and the missing
drive as the placeholder. (This will rewrite the superblocks again, but hopefully in the
right order.)

mdadm -Cv /dev/md0 --level=5 --raid-devices=3 missing /dev/sdc1 /dev/sdd1

Note, you need the 'missing' drive, so the RAID calculates the missing data instead of
reading chunks from a blank drive.

If you can start that array with 2 devices (it will be degraded with only 2/3 drives) you
should be able to mount and recover your data. You may need to run a full fsck again, since
your last fsck probably made a mess.

Assuming you can mount and copy your data, you can then add your 'new' drive to the array
with the --add argument. (Note, you'll have to clear its superblock or mdadm will object.)
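A sketch of the re-create Scott describes, followed by read-only checks before anything else
is touched; /mnt is a placeholder mount point and the VG/LV names come from earlier in the
thread:

~# mdadm --stop /dev/md0
~# mdadm --create /dev/md0 --level=5 --raid-devices=3 missing /dev/sdc1 /dev/sdd1
~# blkid /dev/md0                        # should report an LVM2 PV if the order is right
~# pvscan
~# vgchange -ay lvm-raid
~# mount -o ro /dev/lvm-raid/lvmp /mnt   # look, don't write

If blkid and pvscan see nothing, stop there rather than running a read/write fsck; as long
as nothing is written to the data area, a wrong guess at the order is still reversible.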
* Why would a recreation cause a different number of blocks??
From: Jeff Wiegley @ 2014-04-24 22:25 UTC
Cc: linux-raid@vger.kernel.org

Still trying to restore my large storage system without totally screwing it up.

There are two different md RAID devices. Both had their superblocks wiped, and one of the
six drives is screwed (the other 5 are fine).

Before the human failure (an OS reinstall; I only deleted the MD devices in the Ubuntu
installer, which I think just zeros the md superblocks of the affected partitions):

Personalities : [raid6] [raid5] [raid4] [raid1] [linear] [multipath] [raid0] [raid10]
md3 : active raid6 sda1[0] sdc1[2] sde1[4] sdb1[1] sdd1[6]
      1073735680 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUUUU_]

I recreated the device with:

mdadm --create --assume-clean --level=6 --raid-devices=6 /dev/md0 /dev/sdd1 /dev/sdb1 /dev/sde1 /dev/sdc1 /dev/sda1 missing

and now it reports:

root@nas:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sda1[4] sdc1[3] sde1[2] sdb1[1] sdd1[0]
      1073215488 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUUUU_]

Why did my block count change? The disk partitions weren't touched or changed at any point.
Shouldn't I have gotten the same size?

The created device isn't working. There is supposed to be a LUKS-encrypted volume there, but
luksOpen reports there is no LUKS header (and there used to be). Would the odd change in size
indicate total corruption?

- Jeff
* Re: Why would a recreation cause a different number of blocks??
From: Mikael Abrahamsson @ 2014-04-25  3:34 UTC
To: Jeff Wiegley; +Cc: linux-raid@vger.kernel.org

On Thu, 24 Apr 2014, Jeff Wiegley wrote:

> Why did my block count change? The disk partitions weren't touched or changed at any
> point. Shouldn't I have gotten the same size?

Defaults in mdadm have changed over time, so data offsets might be different. In order to
get the exact same data offset you need to use the same mdadm version as was originally
used, or at least know the values it used and use mdadm 3.3, which allows you to specify
them at creation time.

> The created device isn't working. There is supposed to be a LUKS-encrypted volume there,
> but luksOpen reports there is no LUKS header (and there used to be). Would the odd change
> in size indicate total corruption?

No, the change in size indicates that the data offsets are not the same, so the beginning of
your volume is now in the wrong place.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
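A sketch of how the offset difference shows up and how a new enough mdadm can pin it. The
2048-sector figure is only an illustrative value (it requires a surviving superblock to read,
which Jeff no longer has), and the unit mdadm expects for --data-offset should be verified in
mdadm(8) for the version in use:

~# mdadm --examine /dev/sda1 | grep -i 'data offset'
    Data Offset : 2048 sectors           # example output, not real data
~# # mdadm >= 3.3 lets the old offset be forced when recreating:
~# mdadm --create /dev/md0 --assume-clean --level=6 --raid-devices=6 \
       --data-offset=1024 /dev/sdd1 /dev/sdb1 /dev/sde1 /dev/sdc1 /dev/sda1 missing
~# # (1024 assumes the option is taken in KiB, i.e. 2048 sectors; check before running.)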
* Re: Why would a recreation cause a different number of blocks??
From: Jeff Wiegley @ 2014-04-25  5:02 UTC
To: Mikael Abrahamsson; +Cc: linux-raid@vger.kernel.org

I'll look into this. I had thought about that sort of thing, so I went and installed Ubuntu
12.04, which I thought was what I started all this with, but I might have set it up earlier
than 12.04 and I might have used Gentoo.

Are the mdadm defaults specific to the mdadm version, or would Ubuntu and Gentoo have
specified different defaults in something like an /etc/defaults/ourmdadm.cfg?

If it's mdadm, could I just grab old copies of the mdadm sources, compile them one version
after the other, and try each one?

Thanks,

- Jeff

On 4/24/2014 8:34 PM, Mikael Abrahamsson wrote:
> [previous message quoted in full, trimmed here]
* Re: Why would a recreation cause a different number of blocks??
From: Mikael Abrahamsson @ 2014-04-25  6:01 UTC
To: Jeff Wiegley; +Cc: linux-raid@vger.kernel.org

On Thu, 24 Apr 2014, Jeff Wiegley wrote:

> If it's mdadm, could I just grab old copies of the mdadm sources, compile them one version
> after the other, and try each one?

As far as I know, it's mdadm-version specific. If you look in the archives I'm sure you'll
be able to find the old offsets, and you can use the latest mdadm with those offsets and
hopefully things will work.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
* Re: Why would a recreation cause a different number of blocks??
From: Jeff Wiegley @ 2014-04-25  6:45 UTC
To: Mikael Abrahamsson; +Cc: linux-raid@vger.kernel.org

Ooooh... making progress. I downloaded and compiled mdadm-3.1.4 and used that to create the
array. The size is the same, and luksOpen recognizes it as LUKS, asks for the passphrase and
accepts it.

However, mount says it needs to be told the filesystem type, and if I add -t xfs it still
fails to mount the filesystem.

Any thoughts on why the recreated array would satisfy and pass cryptsetup's sanity checks,
but the resulting decrypted data is not recognizable as XFS?

- Jeff

On 4/24/2014 11:01 PM, Mikael Abrahamsson wrote:
> [previous message quoted in full, trimmed here]
* Re: Why would a recreation cause a different number of blocks??
From: Mikael Abrahamsson @ 2014-04-25  7:25 UTC
To: Jeff Wiegley; +Cc: linux-raid@vger.kernel.org

On Thu, 24 Apr 2014, Jeff Wiegley wrote:

> Any thoughts on why the recreated array would satisfy and pass cryptsetup's sanity checks,
> but the resulting decrypted data is not recognizable as XFS?

Cryptsetup probably has a very small superblock which fits in one chunk, so if you got the
order of the other drives wrong, the fs will still be garbled while cryptsetup thinks
everything is fine.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
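A way to test one candidate member order without writing to the data, assuming the device
list from the earlier message; 'probe' is just a hypothetical mapping name, and the only
writes are the md superblocks that each --create rewrites:

~# mdadm --stop /dev/md0
~# mdadm --create /dev/md0 --assume-clean --level=6 --raid-devices=6 \
       /dev/sdd1 /dev/sdb1 /dev/sde1 /dev/sdc1 /dev/sda1 missing
~# cryptsetup luksOpen --readonly /dev/md0 probe
~# xfs_repair -n /dev/mapper/probe      # -n: check only, never modifies anything
~# cryptsetup luksClose probe

Because the LUKS header fits within the first chunk, which lives on a single member,
luksOpen succeeding only proves that member (and the data offset) is right; a read-only
xfs_repair -n exercises data spread across all the members, so it is the check worth
repeating as the order is permuted.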
* Re: Why would a recreation cause a different number of blocks??
From: Jeff Wiegley @ 2014-04-25  7:05 UTC
To: Mikael Abrahamsson; +Cc: linux-raid@vger.kernel.org

Here's something about my closer call... In the original, mdstat lists:

md3 : active raid6 sda1[0] sdc1[2] sde1[4] sdb1[1] sdd1[6]
      1073735680 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUUUU_]

What do the numbers after the drives mean? They appear to match the number in the Device
Role line of an --examine.

When I recreated the array with 3.1.4 I used:

mdadm --create --assume-clean --level=6 --raid-devices=6 /dev/md0 /dev/sdd1 /dev/sdb1 /dev/sde1 /dev/sdc1 /dev/sda1 missing

and now mdstat (which LUKS is happy with) reports:

Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdd1[0] sda1[4] sdc1[3] sde1[2] sdb1[1]
      1073735680 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUUUU_]

which are not the same numbers. Now I figure I can reorder my command-line arguments, but
(due to prior drive failures) I think the mapping (from the original) should be:

0: /dev/sda1
1: /dev/sdb1
2: /dev/sdc1
3: ????
4: /dev/sde1
5: ?????
6: /dev/sdd1
7: /dev/sdf1 (this is my current dead drive that I have to leave out, bringing the array up
   degraded, because this drive is probably very out of sync)

Though I know to substitute "missing" for /dev/sdf1 to leave it out, my question is: what do
I do about the active device numbers 3 and 5 on the command line? Also put "missing" for
those? I don't think that will work, because wouldn't it think I have 3 drives dead and
refuse to start the array?

But it does seem like I'm getting closer... and if I can get this partition up, I have a
high probability of recovering the larger, important array that is in /dev/sd[a-o]2.

Thanks again,

- Jeff

On 4/24/2014 11:01 PM, Mikael Abrahamsson wrote:
> [previous message quoted in full, trimmed here]
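For what it is worth, the bracketed numbers in /proc/mdstat are md's internal device
numbers, which can grow past raid-devices as failed members are replaced; the slot a member
actually fills is what --examine reports as the device role. A sketch of that check, with
invented output values, which of course only works while the superblocks are still intact:

~# mdadm --examine /dev/sda1 | grep -E 'Raid Devices|Events|Device Role'
   Raid Devices : 6
         Events : 123456                  # illustrative values only
    Device Role : Active device 0
~# # ...repeat for each member and compare roles and event counts.

At --create time the slots are simply filled in command-line order, with each 'missing'
holding one slot, so what has to be preserved is which surviving member goes in which
position; the old bracket numbers themselves neither can nor need to be reproduced.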
[parent not found: <CAK_KU4YUejncX9yQk4HM5HE=1-qPPxOibuRauFheo3jaBc8SaQ@mail.gmail.com>]
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: L.M.J @ 2014-04-25  5:13 UTC
To: Scott D'Vileskis; +Cc: linux-raid@vger.kernel.org

On Thu, 24 Apr 2014 16:22:49 -0400, "Scott D'Vileskis" <sdvileskis@gmail.com> wrote:

> I have been replying directly to you, not to the mailing list, since your case seems to be
> a case of user-screwed-up-his-own-data, and not a problem with mdadm/linux raid, nor a
> problem that will necessarily help someone else (since it is not likely someone will
> create a mess in exactly the same manner you have).

Ah, OK.

> To summarize:
> 1) You lost a disk. Even down a disk, you should have been able to run/start the array (in
> degraded mode) with only 2 disks, mounted the filesystem, etc.

Yes, of course; it worked with only 2 disks for the last 3 weeks.

> 2) You then should have simply partitioned and then --add'ed the new disk. mdadm would
> have written a superblock to the new disk, and resynced the data.
>
> I assume your original disks were in the order sdb, sdc, sdd.

Exactly.

> Unfortunately, you might have clobbered your drives by recreating the array. You certainly
> clobbered your superblocks and changed the order when you did this:
> ~# mdadm -Cv /dev/md0 --assume-clean --level=5 --raid-devices=3 /dev/sdc1 /dev/sdd1 /dev/sdb1
>
> You changed the order, but because of the assume-clean, it shouldn't have started a resync
> of the data. Your file system probably had a fit though.
>
> Hindsight is 20/20, a mistake was made, it happens to all of us at some point or another.
>
> IMPORTANT: At any point did your drives do a resync?

Unfortunately: yes, a resync occurred when I...

> Assuming no, and assuming you haven't done any other writing to your disks (besides
> rewriting the superblocks), you can probably correct the order of your drives by reissuing
> the --create command with the two original drives, in the proper order, and the missing
> drive as the placeholder. (This will rewrite the superblocks again, but hopefully in the
> right order.)
> mdadm -Cv /dev/md0 --level=5 --raid-devices=3 missing /dev/sdc1 /dev/sdd1
>
> If you can start that array (it will be degraded with only 2/3 drives) you should be able
> to mount and recover your data. You may need to run a full fsck again since your last fsck
> probably made a mess.

I shut down the computer, removed the old disk and added the new one. Maybe I messed up the
SATA cables too. Unfortunately, I tried to start the degraded array like this:

~# mdadm --assemble --force /dev/sdc1 /dev/sdd1

which didn't work. I created a partition on sdb, and then came the mistake:

~# mdadm --stop /dev/md0
~# mdadm -Cv /dev/md0 --assume-clean --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1

That didn't work any better, so then:

~# mdadm --stop /dev/md0
~# mdadm --create /dev/md0 --level=5 --assume-clean --raid-devices=3 /dev/sdc1 /dev/sdd1 missing
~# mdadm --manage /dev/md0 --add /dev/sdb1

Looks even worse, doesn't it?

> Assuming you can mount and copy your data, you can then --add your 'new' drive to the
> array with the --add argument. (Note, you'll have to clear its superblock or mdadm will
> object.)

And what do you think of the files fsck recovered:

5,5M 2013-04-24 17:53 #4456582
5,7M 2013-04-24 17:53 #4456589
 16M 2013-04-24 17:53 #4456590
 25M 2013-04-24 17:53 #4456594
 17M 2013-04-24 17:53 #4456578
 18M 2013-04-24 17:53 #4456580
1,3M 2013-04-24 17:54 #4456597
1,1M 2013-04-24 17:54 #4456596
 17M 2013-04-24 17:54 #4456595
2,1M 2013-04-24 17:54 #4456599
932K 2013-04-24 17:54 #4456598

Well, what should I do now? mkfs everything and restart from scratch?
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: Mikael Abrahamsson @ 2014-04-25  6:04 UTC
To: L.M.J; +Cc: Scott D'Vileskis, linux-raid@vger.kernel.org

On Fri, 25 Apr 2014, L.M.J wrote:

> Well, what should I do now? mkfs everything and restart from scratch?

Most likely this is your only option. First you overwrote the superblocks so the drives came
up in the wrong order (most likely), and then you ran a read/write fsck on it.

The thing to do is to make sure md is read-only and use fsck in read-only mode as a
diagnostic tool to see if everything is right. If you get it wrong and run fsck in
read/write mode, it is going to change things destructively.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
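A sketch of the read-only diagnostic pass Mikael describes, reusing the VG/LV names from
earlier in the thread and /mnt as a placeholder mount point:

~# mdadm --readonly /dev/md0                     # array refuses all writes from now on
~# vgchange -ay lvm-raid
~# fsck.ext4 -n /dev/lvm-raid/lvmp               # -n: report problems, fix nothing
~# mount -o ro,noload /dev/lvm-raid/lvmp /mnt    # noload skips ext4 journal replay

If the read-only fsck sees a sane superblock and a mostly intact directory tree, the member
order is probably right and a backup can be taken from the read-only mount; if it sees
garbage, at least nothing has been made worse.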
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: L. M. J @ 2014-04-25 11:43 UTC
To: Mikael Abrahamsson; +Cc: Scott D'Vileskis, linux-raid@vger.kernel.org

On 25 April 2014 08:04:13 CEST, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> [previous message quoted in full, trimmed here]

I haven't done a R/W fsck yet, only on an LVM snapshot to test, never on the real data. Does
that change anything?

-- 
May the open source be with you, my young padawan.
Sent from my phone; please excuse the brevity.
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: Scott D'Vileskis @ 2014-04-25 13:36 UTC
To: L. M. J; +Cc: linux-raid@vger.kernel.org

Drive B has bogus data on it, since it was resync'd with C & D in the wrong order.
Fortunately, your --add should only have changed B, not C & D.

As a last-ditch effort, try the --create again but with the two potentially good disks in
the right order:

mdadm --create /dev/md0 --level=5 --raid-devices=3 missing /dev/sdc1 /dev/sdd1

Note: the following is where I have reproduced your problem with loop devices.

#Create 3 200MB files
root@Breadman:/home/scott# mkdir raidtesting
root@Breadman:/home/scott# cd raidtesting/
root@Breadman:/home/scott/raidtesting# fallocate -l200000000 sdb
root@Breadman:/home/scott/raidtesting# fallocate -l200000000 sdc
root@Breadman:/home/scott/raidtesting# fallocate -l200000000 sdd
root@Breadman:/home/scott/raidtesting# losetup /dev/loop2 sdb
root@Breadman:/home/scott/raidtesting# losetup /dev/loop3 sdc
root@Breadman:/home/scott/raidtesting# losetup /dev/loop4 sdd
root@Breadman:/home/scott/raidtesting# mdadm --create /dev/md0 -n3 -l5 /dev/loop2 /dev/loop3 /dev/loop4
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

root@Breadman:/home/scott/raidtesting# cat /proc/mdstat
md0 : active raid5 loop4[3] loop3[1] loop2[0]
      388096 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

root@Breadman:/home/scott/raidtesting# mkfs.reiserfs /dev/md0
mkfs.reiserfs 3.6.21 (2009 www.namesys.com)
<SNIP>
ReiserFS is successfully created on /dev/md0.

root@Breadman:/home/scott/raidtesting# mkdir temp
root@Breadman:/home/scott/raidtesting# mount /dev/md0 temp/

#Then I copied a file to it:
root@Breadman:/home/scott/raidtesting# md5sum temp/systemrescuecd-x86-0.4.3.iso
b88ce25b156619a9a344889bc92b1833  temp/systemrescuecd-x86-0.4.3.iso

#And failed a disk
root@Breadman:/home/scott/raidtesting# umount temp/
root@Breadman:/home/scott/raidtesting# mdadm --fail /dev/md0 /dev/loop2
mdadm: set /dev/loop2 faulty in /dev/md0
root@Breadman:/home/scott/raidtesting# cat /proc/mdstat
md0 : active raid5 loop4[3] loop3[1] loop2[0](F)
      388096 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]

#Stopped array, removed disk, replaced disk by creating a new file
root@Breadman:/home/scott/raidtesting# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@Breadman:/home/scott/raidtesting# losetup -d /dev/loop2
root@Breadman:/home/scott/raidtesting# rm sdb
root@Breadman:/home/scott/raidtesting# fallocate -l200000000 sdb-new
root@Breadman:/home/scott/raidtesting# losetup /dev/loop2 sdb-new

#WRONG: Create array in wrong order
root@Breadman:/home/scott/raidtesting# mdadm --create /dev/md0 --assume-clean -l5 -n3 /dev/loop3 /dev/loop4 /dev/loop2
mdadm: /dev/loop3 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Fri Apr 25 09:10:31 2014
mdadm: /dev/loop4 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Fri Apr 25 09:10:31 2014
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
root@Breadman:/home/scott/raidtesting# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 loop2[2] loop4[1] loop3[0]
      388096 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

root@Breadman:/home/scott/raidtesting# mount /dev/md0 temp/
mount: you must specify the filesystem type

#Nope, doesn't mount, filesystem clobbered, or not?

root@Breadman:/home/scott/raidtesting# mdadm --stop /dev/md0
mdadm: stopped /dev/md0

#Recreate the array, with missing disk in the right place
root@Breadman:/home/scott/raidtesting# mdadm --create /dev/md0 -l5 -n3 missing /dev/loop3 /dev/loop4
mdadm: /dev/loop3 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Fri Apr 25 09:17:38 2014
mdadm: /dev/loop4 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Fri Apr 25 09:17:38 2014
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
root@Breadman:/home/scott/raidtesting# mount /dev/md0 temp/
root@Breadman:/home/scott/raidtesting# ls temp/
systemrescuecd-x86-0.4.3.iso
root@Breadman:/home/scott/raidtesting# md5sum temp/systemrescuecd-x86-0.4.3.iso
b88ce25b156619a9a344889bc92b1833  temp/systemrescuecd-x86-0.4.3.iso

#Notice we are in degraded mode
root@Breadman:/home/scott/raidtesting# cat /proc/mdstat
md0 : active raid5 loop4[2] loop3[1]
      388096 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]

#Add our replacement disk:
root@Breadman:/home/scott/raidtesting# mdadm --add /dev/md0 /dev/loop2
mdadm: added /dev/loop2

root@Breadman:/home/scott/raidtesting# cat /proc/mdstat
md0 : active raid5 loop2[3] loop4[2] loop3[1]
      388096 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [_UU]
      [============>........]  recovery = 62.1% (121316/194048) finish=0.0min speed=12132K/sec

#After a while (a short while with 200MB loop devices):
root@Breadman:/home/scott/raidtesting# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : active raid5 loop2[3] loop4[2] loop3[1]
      388096 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
* Re: Corrupted ext4 filesystem after mdadm manipulation error
From: L.M.J @ 2014-04-25 14:43 UTC
To: Scott D'Vileskis; +Cc: linux-raid@vger.kernel.org

On Fri, 25 Apr 2014 09:36:12 -0400, "Scott D'Vileskis" <sdvileskis@gmail.com> wrote:

> As a last-ditch effort, try the --create again but with the two potentially good disks in
> the right order:
>
> mdadm --create /dev/md0 --level=5 --raid-devices=3 missing /dev/sdc1 /dev/sdd1

root@gateway:~# mdadm --create /dev/md0 --level=5 --raid-devices=3 missing /dev/sdc1 /dev/sdd1
mdadm: /dev/sdc1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Fri Apr 25 16:20:32 2014
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Fri Apr 25 16:20:32 2014
Continue creating array? y
mdadm: array /dev/md0 started.

root@gateway:~# ls -l /dev/md*
brw-rw---- 1 root disk   9, 0 2014-04-25 16:34 /dev/md0
brw-rw---- 1 root disk 254, 0 2014-04-25 16:19 /dev/md_d0
lrwxrwxrwx 1 root root      7 2014-04-25 16:04 /dev/md_d0p1 -> md/d0p1
lrwxrwxrwx 1 root root      7 2014-04-25 16:04 /dev/md_d0p2 -> md/d0p2
lrwxrwxrwx 1 root root      7 2014-04-25 16:04 /dev/md_d0p3 -> md/d0p3
lrwxrwxrwx 1 root root      7 2014-04-25 16:04 /dev/md_d0p4 -> md/d0p4

/dev/md:
total 0
brw------- 1 root root 254, 0 2014-04-25 16:04 d0
brw------- 1 root root 254, 1 2014-04-25 16:04 d0p1
brw------- 1 root root 254, 2 2014-04-25 16:04 d0p2
brw------- 1 root root 254, 3 2014-04-25 16:04 d0p3
brw------- 1 root root 254, 4 2014-04-25 16:04 d0p4

root@gateway:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdd1[2] sdc1[1]
      3907023872 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]

unused devices: <none>

root@gateway:~# pvscan
  No matching physical volumes found
root@gateway:~# pvdisplay
root@gateway:~# dd if=/dev/md0 of=/tmp/md0.dd count=10 bs=1M
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0,271947 s, 38,6 MB/s

In /tmp/md0.dd I can see a lot of binary data and, here and there, some text:

physical_volumes {
        pv0 {
                id = "5DZit9-6o5V-a1vu-1D1q-fnc0-syEj-kVwAnW"
                device = "/dev/md0"
                status = ["ALLOCATABLE"]
                flags = []
                dev_size = 7814047360
                pe_start = 384
                pe_count = 953863
        }
}
logical_volumes {
        lvdata {
                id = "JiwAjc-qkvI-58Ru-RO8n-r63Z-ll3E-SJazO7"
                status = ["READ", "WRITE", "VISIBLE"]
                flags = []
                segment_count = 1
                segment1 {
                        start_extent = 0
                        extent_count = 115200
                        type = "striped"
                        stripe_count = 1        # linear
                        stripes = [
[...]
        lvdata_snapshot_J5 {
                id = "Mcvgul-Qo2L-1sPB-LvtI-KuME-fiiM-6DXeph"
                status = ["READ"]
                flags = []
                segment_count = 1
                segment1 {
                        start_extent = 0
                        extent_count = 25600
                        type = "striped"
                        stripe_count = 1        # linear
                        stripes = [
                                "pv0", 284160
                        ]
                }
        }
[...]

lvdata_snapshot_J5 is a snapshot I created a few days before my mdadm chaos, so I'm pretty
sure some data is still on the drives... Am I wrong?

Thanks
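pvscan finding nothing while the dump plainly still contains LVM metadata usually means only
the PV label at the start of /dev/md0 is gone. A hedged sketch of the checks and the same
restore already used in the first message (the UUID and archive file name are the ones
quoted there); only pvcreate and vgcfgrestore write, and only to the LVM label/metadata area
at the start of the PV, not to the filesystem data:

~# pvck /dev/md0                       # read-only check of the LVM label and metadata areas
~# pvcreate --uuid "5DZit9-6o5V-a1vu-1D1q-fnc0-syEj-kVwAnW" \
       --restorefile /etc/lvm/archive/lvm-raid_00302.vg /dev/md0
~# vgcfgrestore lvm-raid
~# vgchange -ay lvm-raid
~# fsck.ext4 -n /dev/lvm-raid/lvmp     # read-only: does this member order yield a filesystem?

Note that the recreated array also has to match the original chunk size and metadata version
(the mdstat above shows a 64k chunk and no 1.2 superblock line, so it is worth checking what
the old array used); a wrong chunk size or data offset scrambles the data just as a wrong
member order does.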
* Is disk order relative or are the numbers absolute?
From: Jeff Wiegley @ 2014-04-25 18:37 UTC
To: linux-raid@vger.kernel.org

I'm still trying to recover my array, and I'm getting close; I think I just have to get the
disk order correct now.

Over the past year I've had a couple of failures and replaced disks. This changed the drive
numbers. mdstat before everything went to hell was:

md4 : active raid6 sdf2[7](F) sda2[0] sdc2[2] sde2[4] sdb2[1] sdd2[6]
      10647314432 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUUUU_]

Does this indicate an order of:

0: sda2
1: sdb2
2: sdc2
3: ???? (previous dead drive I replaced, I'm guessing)
4: sde2
5: ???? (second previously dead/replaced drive)
6: sdd2
7: sdf2 (which is currently dead/failed)

I have to recreate the array due to zeroing the superblocks during the install (though I
have not changed partition tables or ever caused a resync of any drives).

My question is: I know I can get the five good drives recreated into an array, but I don't
know how to give them specific numbers. I can get their relative order correct with:

--create --assume-clean ... /dev/sd{a,b,c,e,d,f}2

but sde2 will be numbered 3, not 4, and sdd2 will be 4, not 5. Will these number changes not
make a difference because only relative order is important? Or do I have to figure out some
way to force absolute positions/numbers onto the drives?

Thank you,

- Jeff

On 4/25/2014 6:36 AM, Scott D'Vileskis wrote:
> [previous message quoted in full, trimmed here]
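As far as I understand md's behaviour, only the slot order matters when recreating: --create
fills the six slots strictly in command-line order, each 'missing' holds one slot, and the
old bracket numbers from mdstat (internal device numbers, not slots) neither can nor need to
be reproduced. A sketch under that assumption, with one guessed placement of the five
survivors followed by a generic read-only check; the order shown is a hypothesis to be
permuted, not the known answer:

~# mdadm --stop /dev/md4 2>/dev/null
~# # Six slots, five survivors, exactly one 'missing'; its position is part of the guess.
~# mdadm --create /dev/md4 --assume-clean --level=6 --raid-devices=6 \
       /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sde2 /dev/sdd2 missing
~# blkid /dev/md4       # then run whatever read-only check matches what sat on top
~# mdadm --stop /dev/md4     # stop before trying the next permutation

The array name /dev/md4 is arbitrary here; what matters is that a 6-device RAID6 with five
working members has exactly one empty slot, so only one 'missing' goes on the command line,
and its position has to be guessed along with the order of the survivors.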
Thread overview: 19+ messages
2014-04-24  5:05 Corrupted ext4 filesystem after mdadm manipulation error L.M.J
2014-04-24 17:48 ` L.M.J
[not found] ` <CAK_KU4a+Ep7=F=NSbb-hqN6Rvayx4QPWm-M2403OHn5-LVaNZw@mail.gmail.com>
2014-04-24 18:35 ` L.M.J
[not found] ` <CAK_KU4Zh-azXEEzW4f1m=boCZDKevqaSHxW0XoAgRdrCbm2PkA@mail.gmail.com>
2014-04-24 19:53 ` L.M.J
[not found] ` <CAK_KU4aDDaUSGgcGBwCeO+yE0Qa_pUmMdAHMu7pqO7dqEEC71g@mail.gmail.com>
2014-04-24 19:56 ` L.M.J
2014-04-24 20:31 ` Scott D'Vileskis
2014-04-24 22:25 ` Why would a recreation cause a different number of blocks?? Jeff Wiegley
2014-04-25  3:34 ` Mikael Abrahamsson
2014-04-25  5:02 ` Jeff Wiegley
2014-04-25  6:01 ` Mikael Abrahamsson
2014-04-25  6:45 ` Jeff Wiegley
2014-04-25  7:25 ` Mikael Abrahamsson
2014-04-25  7:05 ` Jeff Wiegley
[not found] ` <CAK_KU4YUejncX9yQk4HM5HE=1-qPPxOibuRauFheo3jaBc8SaQ@mail.gmail.com>
2014-04-25  5:13 ` Corrupted ext4 filesystem after mdadm manipulation error L.M.J
2014-04-25  6:04 ` Mikael Abrahamsson
2014-04-25 11:43 ` L. M. J
2014-04-25 13:36 ` Scott D'Vileskis
2014-04-25 14:43 ` L.M.J
2014-04-25 18:37 ` Is disk order relative or are the numbers absolute? Jeff Wiegley