* [linux-lvm] Raid 10 - recovery after a disk failure
From: Pavlik Kirilov @ 2016-01-31 23:00 UTC
To: linux-lvm@redhat.com

Hi,

I am encountering strange behaviour when trying to recover a RAID 10 LV created with the following command:

lvcreate --type raid10 -L3G -i 2 -I 256 -n lv_r10 vg_data /dev/vdb1:1-500 /dev/vdc1:1-500 /dev/vdd1:1-500 /dev/vde1:1-500

As can be seen, I have 4 PVs and give the first 500 PEs of each of them to the RAID 10 logical volume. I see the following PE layout:

lvs -o seg_pe_ranges,lv_name,stripes -a
PE Ranges LV #Str
lv_r10_rimage_0:0-767 lv_r10_rimage_1:0-767 lv_r10_rimage_2:0-767 lv_r10_rimage_3:0-767 lv_r10 4
/dev/vdb1:2-385 [lv_r10_rimage_0] 1
/dev/vdc1:2-385 [lv_r10_rimage_1] 1
/dev/vdd1:2-385 [lv_r10_rimage_2] 1
/dev/vde1:2-385 [lv_r10_rimage_3] 1
/dev/vdb1:1-1 [lv_r10_rmeta_0] 1
/dev/vdc1:1-1 [lv_r10_rmeta_1] 1
/dev/vdd1:1-1 [lv_r10_rmeta_2] 1
/dev/vde1:1-1 [lv_r10_rmeta_3] 1

So far everything is OK, and the number of PEs is automatically reduced to 385 per PV so that the size equals 3 gigabytes.

The problem comes when I shut down the system, replace one disk (vdc), boot again and try to recover the array. Here are the commands I execute:

pvs
Couldn't find device with uuid 2hU2pD-xNDa-yi1J-OkkP-NjGq-hIxo-Q5AgQC.
PV VG Fmt Attr PSize PFree
/dev/vdb1 vg_data lvm2 a-- 8.00g 6.49g
/dev/vdd1 vg_data lvm2 a-- 8.00g 6.49g
/dev/vde1 vg_data lvm2 a-- 8.00g 6.49g
unknown device vg_data lvm2 a-m 8.00g 6.49g

pvcreate /dev/vdc1
Physical volume "/dev/vdc1" successfully created

vgextend vg_data /dev/vdc1
Couldn't find device with uuid 2hU2pD-xNDa-yi1J-OkkP-NjGq-hIxo-Q5AgQC
Volume group "vg_data" successfully extended

lvs -o seg_pe_ranges,lv_name,stripes -a
Couldn't find device with uuid 2hU2pD-xNDa-yi1J-OkkP-NjGq-hIxo-Q5AgQC.
PE Ranges LV #Str
lv_r10_rimage_0:0-767 lv_r10_rimage_1:0-767 lv_r10_rimage_2:0-767 lv_r10_rimage_3:0-767 lv_r10 4
/dev/vdb1:2-385 [lv_r10_rimage_0] 1
unknown device:2-385 [lv_r10_rimage_1] 1
/dev/vdd1:2-385 [lv_r10_rimage_2] 1
/dev/vde1:2-385 [lv_r10_rimage_3] 1
/dev/vdb1:1-1 [lv_r10_rmeta_0] 1
unknown device:1-1 [lv_r10_rmeta_1] 1
/dev/vdd1:1-1 [lv_r10_rmeta_2] 1
/dev/vde1:1-1 [lv_r10_rmeta_3] 1

lvchange -ay --partial /dev/vg_data/lv_r10
PARTIAL MODE. Incomplete logical volumes will be processed.
Couldn't find device with uuid 2hU2pD-xNDa-yi1J-OkkP-NjGq-hIxo-Q5AgQC

lvconvert --repair vg_data/lv_r10 /dev/vdc1:1-385
Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
Insufficient free space: 770 extents needed, but only 345 available
Failed to allocate replacement images for vg_data/lv_r10

lvconvert --repair vg_data/lv_r10
Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
Faulty devices in vg_data/lv_r10 successfully replaced.

lvs -o seg_pe_ranges,lv_name,stripes -a
Couldn't find device with uuid 2hU2pD-xNDa-yi1J-OkkP-NjGq-hIxo-Q5AgQC.
PE Ranges LV #Str
lv_r10_rimage_0:0-767 lv_r10_rimage_1:0-767 lv_r10_rimage_2:0-767 lv_r10_rimage_3:0-767 lv_r10 4
/dev/vdb1:2-385 [lv_r10_rimage_0] 1
/dev/vdc1:1-768 [lv_r10_rimage_1] 1
/dev/vdd1:2-385 [lv_r10_rimage_2] 1
/dev/vde1:2-385 [lv_r10_rimage_3] 1
/dev/vdb1:1-1 [lv_r10_rmeta_0] 1
/dev/vdc1:0-0 [lv_r10_rmeta_1] 1
/dev/vdd1:1-1 [lv_r10_rmeta_2] 1
/dev/vde1:1-1 [lv_r10_rmeta_3] 1

The array was recovered, but it is definitely not what I expected: /dev/vdc1 now uses 768 PEs instead of 385 like the other PVs. In this case I had some extra free space on /dev/vdc1, but what if I did not? Please suggest what should be done.

Linux ubuntu1 3.13.0-32-generic, x86_64
LVM version: 2.02.98(2) (2012-10-15)
Library version: 1.02.77 (2012-10-15)
Driver version: 4.27.0

Pavlik Petrov
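A quick way to confirm the doubled allocation on the replacement PV is to compare the allocated extents per PV with the segment ranges. This is only a sketch, assuming the VG and device names used above; the pvdisplay fields are the standard "Total PE" / "Allocated PE" lines:

pvdisplay /dev/vdc1 | grep -E 'PV Name|Total PE|Free PE|Allocated PE'   # extents consumed on the new PV
lvs -o seg_pe_ranges,lv_name,stripes -a vg_data                         # per-image extent ranges, as above

With the lvm2 release used here (2.02.98), the rebuilt image ends up roughly twice its original size, which is what the output above shows.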
* Re: [linux-lvm] Raid 10 - recovery after a disk failure
From: emmanuel segura @ 2016-01-31 23:23 UTC
To: Pavlik Kirilov, LVM general discussion and development

You used pvcreate without the options required to recreate the physical volume on your new disk and give it the old UUID, for example:

pvcreate --uuid "FmGRh3-zhok-iVI8-7qTD-S5BI-MAEN-NYM5Sk" --restorefile /etc/lvm/archive/VG_00050.vg /dev/sdh1

https://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/mdatarecover.html
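Condensed, the metadata-restore approach suggested above looks like the sketch below. It is only a sketch: the UUID must be the one lvm reports as missing, the restore file is whichever /etc/lvm/archive or /etc/lvm/backup copy still describes the old PV, and the device name is illustrative. As the rest of the thread shows, how the array is resynchronised afterwards is the critical part.

MISSING_UUID="2hU2pD-xNDa-yi1J-OkkP-NjGq-hIxo-Q5AgQC"   # the UUID reported as missing above
pvcreate --uuid "$MISSING_UUID" --restorefile /etc/lvm/backup/vg_data /dev/vdc1   # recreate the PV with its old identity
vgcfgrestore vg_data                                    # restore the saved VG metadata
lvchange -ay vg_data/lv_r10                             # reactivate the RAID 10 LV before resyncing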
* Re: [linux-lvm] Raid 10 - recovery after a disk failure
From: Pavlik Kirilov @ 2016-02-01 17:06 UTC
To: LVM general discussion and development

The method for restoring RAID 10 which I posted in my previous email works very well for RAID 5 on 4 PVs. I have tried the "--uuid" method from the link you sent many times, and I always end up with destroyed data. Here is the output of the tests I performed:

## Ubuntu VM with 4 new disks (qcow files) vda, vdb, vdc, vdd, one physical partition per disk.

vgcreate vg_data /dev/vda1 /dev/vdb1 /dev/vdc1 /dev/vdd1
Volume group "vg_data" successfully created

lvcreate --type raid10 -L3g -i 2 -I 256 -n lv_r10 vg_data /dev/vda1:1-900 /dev/vdb1:1-900 /dev/vdc1:1-900 /dev/vdd1:1-900
Logical volume "lv_r10" created

mkfs.ext4 /dev/vg_data/lv_r10

mount /dev/vg_data/lv_r10 /mnt/r10/

mount | grep vg_data
/dev/mapper/vg_data-lv_r10 on /mnt/r10 type ext4 (rw)

echo "some data" > /mnt/r10/testr10.txt

dmesg -T | tail -n 70
------------------
[ 3822.367551] EXT4-fs (dm-8): mounted filesystem with ordered data mode. Opts: (null)
[ 3851.317428] md: mdX: resync done.
[ 3851.440927] RAID10 conf printout:
[ 3851.440935] --- wd:4 rd:4
[ 3851.440941] disk 0, wo:0, o:1, dev:dm-1
[ 3851.440945] disk 1, wo:0, o:1, dev:dm-3
[ 3851.440949] disk 2, wo:0, o:1, dev:dm-5
[ 3851.440953] disk 3, wo:0, o:1, dev:dm-7

lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi-aor-- 3.00g 100.00 lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data iwi-aor-- 1.50g /dev/vda1(2)
[lv_r10_rimage_1] vg_data iwi-aor-- 1.50g /dev/vdb1(2)
[lv_r10_rimage_2] vg_data iwi-aor-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data iwi-aor-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi-aor-- 4.00m /dev/vda1(1)
[lv_r10_rmeta_1] vg_data ewi-aor-- 4.00m /dev/vdb1(1)
[lv_r10_rmeta_2] vg_data ewi-aor-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi-aor-- 4.00m /dev/vdd1(1)

### Shutting down, replacing vdb with a new disk, starting the system ###

lvs -a -o +devices
Couldn't find device with uuid GjkgzF-18Ls-321G-SaDW-4vp0-d04y-Gd4xRp.
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi---r-p 3.00g lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data Iwi---r-- 1.50g /dev/vda1(2)
[lv_r10_rimage_1] vg_data Iwi---r-p 1.50g unknown device(2)
[lv_r10_rimage_2] vg_data Iwi---r-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data Iwi---r-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi---r-- 4.00m /dev/vda1(1)
[lv_r10_rmeta_1] vg_data ewi---r-p 4.00m unknown device(1)
[lv_r10_rmeta_2] vg_data ewi---r-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi---r-- 4.00m /dev/vdd1(1)

grep description /etc/lvm/backup/vg_data
description = "Created *after* executing 'lvcreate --type raid10 -L3g -i 2 -I 256 -n lv_r10 vg_data /dev/vda1:1-900 /dev/vdb1:1-900 /dev/vdc1:1-900 /dev/vdd1:1-900'"

pvcreate --uuid GjkgzF-18Ls-321G-SaDW-4vp0-d04y-Gd4xRp --restorefile /etc/lvm/backup/vg_data /dev/vdb1
Couldn't find device with uuid GjkgzF-18Ls-321G-SaDW-4vp0-d04y-Gd4xRp.
Physical volume "/dev/vdb1" successfully created

vgcfgrestore vg_data
Restored volume group vg_data

lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi-d-r-- 3.00g 0.00 lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data iwi-a-r-- 1.50g /dev/vda1(2)
[lv_r10_rimage_1] vg_data iwi-a-r-- 1.50g /dev/vdb1(2)
[lv_r10_rimage_2] vg_data iwi-a-r-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data iwi-a-r-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi-a-r-- 4.00m /dev/vda1(1)
[lv_r10_rmeta_1] vg_data ewi-a-r-- 4.00m /dev/vdb1(1)
[lv_r10_rmeta_2] vg_data ewi-a-r-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi-a-r-- 4.00m /dev/vdd1(1)

lvchange --resync vg_data/lv_r10
Do you really want to deactivate logical volume lv_r10 to resync it? [y/n]: y

lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi-a-r-- 3.00g 100.00
---------

dmesg | tail
------------
[ 708.691297] md: mdX: resync done.
[ 708.765376] RAID10 conf printout:
[ 708.765379] --- wd:4 rd:4
[ 708.765381] disk 0, wo:0, o:1, dev:dm-1
[ 708.765382] disk 1, wo:0, o:1, dev:dm-3
[ 708.765383] disk 2, wo:0, o:1, dev:dm-5
[ 708.765384] disk 3, wo:0, o:1, dev:dm-7

mount /dev/vg_data/lv_r10 /mnt/r10/
cat /mnt/r10/testr10.txt
some data

### Suppose now that vda must be replaced too.
### Shutting down again, replacing vda with a new disk, starting the system ###

lvs -a -o +devices
Couldn't find device with uuid KGf6QK-1LrJ-JDaA-bJJY-pmLb-l9eV-LEXgT2.
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi---r-p 3.00g lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data Iwi---r-p 1.50g unknown device(2)
[lv_r10_rimage_1] vg_data Iwi---r-- 1.50g /dev/vdb1(2)
[lv_r10_rimage_2] vg_data Iwi---r-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data Iwi---r-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi---r-p 4.00m unknown device(1)
[lv_r10_rmeta_1] vg_data ewi---r-- 4.00m /dev/vdb1(1)
[lv_r10_rmeta_2] vg_data ewi---r-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi---r-- 4.00m /dev/vdd1(1)

grep description /etc/lvm/backup/vg_data
description = "Created *after* executing 'vgscan'"

pvcreate --uuid KGf6QK-1LrJ-JDaA-bJJY-pmLb-l9eV-LEXgT2 --restorefile /etc/lvm/backup/vg_data /dev/vda1
Couldn't find device with uuid KGf6QK-1LrJ-JDaA-bJJY-pmLb-l9eV-LEXgT2.
Physical volume "/dev/vda1" successfully created

vgcfgrestore vg_data
Restored volume group vg_data

lvchange --resync vg_data/lv_r10
Do you really want to deactivate logical volume lv_r10 to resync it? [y/n]: y

lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert Devices
lv_r10 vg_data rwi-a-r-- 3.00g 100.00 lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0)
[lv_r10_rimage_0] vg_data iwi-aor-- 1.50g /dev/vda1(2)
[lv_r10_rimage_1] vg_data iwi-aor-- 1.50g /dev/vdb1(2)
[lv_r10_rimage_2] vg_data iwi-aor-- 1.50g /dev/vdc1(2)
[lv_r10_rimage_3] vg_data iwi-aor-- 1.50g /dev/vdd1(2)
[lv_r10_rmeta_0] vg_data ewi-aor-- 4.00m /dev/vda1(1)
[lv_r10_rmeta_1] vg_data ewi-aor-- 4.00m /dev/vdb1(1)
[lv_r10_rmeta_2] vg_data ewi-aor-- 4.00m /dev/vdc1(1)
[lv_r10_rmeta_3] vg_data ewi-aor-- 4.00m /dev/vdd1(1)

mount -t ext4 /dev/vg_data/lv_r10 /mnt/r10/
mount: wrong fs type, bad option, bad superblock on /dev/mapper/vg_data-lv_r10,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so

dmesg | tail
-------------
[ 715.361985] EXT4-fs (dm-8): VFS: Can't find ext4 filesystem
[ 715.362248] EXT4-fs (dm-8): VFS: Can't find ext4 filesystem
[ 715.362548] EXT4-fs (dm-8): VFS: Can't find ext4 filesystem
[ 715.362846] FAT-fs (dm-8): bogus number of reserved sectors
[ 715.362933] FAT-fs (dm-8): Can't find a valid FAT filesystem
[ 729.843473] EXT4-fs (dm-8): VFS: Can't find ext4 filesystem

As you can see, after more than one disk failure and RAID repair, I lost the file system on the RAID 10 volume. Please suggest what I am doing wrong. Thanks.

Pavlik
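One check that is not in the transcript above but is cheap to add when reproducing it: verify the filesystem read-only after each recovery step, so a resync that has overwritten good data is noticed before any read-write mount. A minimal sketch, assuming the LV and mount point used above:

lvchange -ay vg_data/lv_r10
fsck.ext4 -n /dev/vg_data/lv_r10          # -n: check only, never modify the filesystem
mount -o ro /dev/vg_data/lv_r10 /mnt/r10
cat /mnt/r10/testr10.txt                  # expect "some data"
umount /mnt/r10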
* Re: [linux-lvm] Raid 10 - recovery after a disk failure
From: emmanuel segura @ 2016-02-01 18:22 UTC
To: Pavlik Kirilov, LVM general discussion and development

Please can you retry using:

--[raid]syncaction {check|repair}
    This argument is used to initiate various RAID synchronization operations.
    The check and repair options provide a way to check the integrity of a
    RAID logical volume (often referred to as "scrubbing"). These options
    cause the RAID logical volume to read all of the data and parity blocks in
    the array and check for any discrepancies (e.g. mismatches between mirrors
    or incorrect parity values). If check is used, the discrepancies will be
    counted but not repaired. If repair is used, the discrepancies will be
    corrected as they are encountered. The 'lvs' command can be used to show
    the number of discrepancies found or repaired.

Maybe --resync is for mirror LVs, while you are using raid:

--resync
    Forces the complete resynchronization of a mirror. In normal circumstances
    you should not need this option because synchronization happens
    automatically. Data is read from the primary mirror device and copied to
    the others, so this can take a considerable amount of time - and during
    this time you are without a complete redundant copy of your data.
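For reference, the scrubbing workflow described in the man page excerpt above comes down to something like the following sketch (VG/LV names as in this thread; raid_sync_action and raid_mismatch_count are the reporting fields used in the next message):

lvchange --syncaction check vg_data/lv_r10    # count mismatches, change nothing
lvs -a -o lv_name,raid_sync_action,raid_mismatch_count vg_data
lvchange --syncaction repair vg_data/lv_r10   # correct any mismatches found

As the follow-up quoted below suggests, this scrubs an array whose images are all present; on its own it does not rebuild a blank replacement image.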
[parent not found: <1744577429.81266.1454367573843.JavaMail.yahoo@mail.yahoo.com>]
* Re: [linux-lvm] Raid 10 - recovery after a disk failure
From: emmanuel segura @ 2016-02-02 16:50 UTC
To: LVM general discussion and development

With LVM version 2.02.141(2)-git (2016-01-25), the space is no longer allocated twice when you try to recover or replace the failed device with a new one:

https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=b33d7586e7f629818e881e26677f4431a47d50b5

Anyway, if you have a failing disk, you can recover in this way:

lvs -o seg_pe_ranges,lv_name,stripes -a vgraid10
PE Ranges LV #Str
lv_r10_rimage_0:0-255 lv_r10_rimage_1:0-255 lv_r10_rimage_2:0-255 lv_r10_rimage_3:0-255 lv_r10 4
/dev/sdb:1-256 [lv_r10_rimage_0] 1
/dev/sdc:1-256 [lv_r10_rimage_1] 1
/dev/sdd:1-256 [lv_r10_rimage_2] 1
/dev/sde:1-256 [lv_r10_rimage_3] 1
/dev/sdb:0-0 [lv_r10_rmeta_0] 1
/dev/sdc:0-0 [lv_r10_rmeta_1] 1
/dev/sdd:0-0 [lv_r10_rmeta_2] 1
/dev/sde:0-0 [lv_r10_rmeta_3] 1

echo 1 > /sys/block/sdb/device/delete
vgextend vgraid10 /dev/sdf
lvconvert --repair vgraid10/lv_r10 /dev/sdf

lvs -o seg_pe_ranges,lv_name,stripes -a vgraid10
Couldn't find device with uuid zEcc1n-172G-lNA9-ucC2-JJRx-kZnX-xx7tAW.
PE Ranges LV #Str
lv_r10_rimage_0:0-255 lv_r10_rimage_1:0-255 lv_r10_rimage_2:0-255 lv_r10_rimage_3:0-255 lv_r10 4
/dev/sdf:1-256 [lv_r10_rimage_0] 1
/dev/sdc:1-256 [lv_r10_rimage_1] 1
/dev/sdd:1-256 [lv_r10_rimage_2] 1
/dev/sde:1-256 [lv_r10_rimage_3] 1
/dev/sdf:0-0 [lv_r10_rmeta_0] 1
/dev/sdc:0-0 [lv_r10_rmeta_1] 1
/dev/sdd:0-0 [lv_r10_rmeta_2] 1
/dev/sde:0-0 [lv_r10_rmeta_3] 1

2016-02-01 23:59 GMT+01:00 Pavlik Kirilov <pavllik@yahoo.ca>:
> I upgraded lvm, because --syncaction was not available in the old one:
> New lvm version 2.02.111(2) (2014-09-01)
> Library version: 1.02.90 (2014-09-01)
> Driver version: 4.27.0
>
> However it seems to me that the command "lvchange --syncaction repair vg_data/lv_r10" does not perform a resync if there are no mismatches.
> Here is the test output:
>
> ## Created the same LV as before, shut down, replace disk, start up again ###
>
> lvs -a -o +devices,raid_mismatch_count,raid_sync_action
> Couldn't find device with uuid f0M1di-y7Fy-TZZg-3RO3-kJsA-fWEi-MoD2mt.
> LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Devices Mismatches SyncAction
> lv_r10 vg_data rwi-a-r-p- 3.00g 100.00 lv_r10_rimage_0(0),lv_r10_rimage_1(0),lv_r10_rimage_2(0),lv_r10_rimage_3(0) 0 idle
> [lv_r10_rimage_0] vg_data iwi-aor--- 1.50g /dev/vda1(2)
> [lv_r10_rimage_1] vg_data iwi-a-r-p- 1.50g unknown device(2)
> [lv_r10_rimage_2] vg_data iwi-aor--- 1.50g /dev/vdc1(2)
> [lv_r10_rimage_3] vg_data iwi-aor--- 1.50g /dev/vdd1(2)
> [lv_r10_rmeta_0] vg_data ewi-aor--- 4.00m /dev/vda1(1)
> [lv_r10_rmeta_1] vg_data ewi-a-r-p- 4.00m unknown device(1)
> [lv_r10_rmeta_2] vg_data ewi-aor--- 4.00m /dev/vdc1(1)
> [lv_r10_rmeta_3] vg_data ewi-aor--- 4.00m /dev/vdd1(1)
>
> pvs
> Couldn't find device with uuid f0M1di-y7Fy-TZZg-3RO3-kJsA-fWEi-MoD2mt.
> PV VG Fmt Attr PSize PFree
> /dev/vda1 vg_data lvm2 a-- 8.00g 6.49g
> /dev/vdc1 vg_data lvm2 a-- 8.00g 6.49g
> /dev/vdd1 vg_data lvm2 a-- 8.00g 6.49g
> unknown device vg_data lvm2 a-m 8.00g 6.49g
>
> grep description /etc/lvm/backup/vg_data
> description = "Created *after* executing 'lvcreate --type raid10 -L3g -i 2 -I 256 -n lv_r10 vg_data /dev/vda1:1-900 /dev/vdb1:1-900 /dev/vdc1:1-900 /dev/vdd1:1-900'"
>
> pvcreate --uuid f0M1di-y7Fy-TZZg-3RO3-kJsA-fWEi-MoD2mt --restorefile /etc/lvm/backup/vg_data /dev/vdb1
> Couldn't find device with uuid f0M1di-y7Fy-TZZg-3RO3-kJsA-fWEi-MoD2mt.
> Physical volume "/dev/vdb1" successfully created
>
> vgcfgrestore vg_data
> Restored volume group vg_data
>
> lvchange --syncaction repair vg_data/lv_r10
>
> dmesg | tail
> ------------
> [ 324.454722] md: requested-resync of RAID array mdX
> [ 324.454725] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
> [ 324.454727] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for requested-resync.
> [ 324.454729] md: using 128k window, over a total of 3145728k.
> [ 324.454882] md: mdX: requested-resync done.
>
> ### Here I think the new PV did not receive any data
> ### Shut down, replace vda, start the system ###
>
> grep description /etc/lvm/backup/vg_data
> description = "Created *after* executing 'vgscan'"
>
> pvcreate --uuid zjJVEj-VIKe-oe0Z-W1CF-edfj-16n2-oiAyID --restorefile /etc/lvm/backup/vg_data /dev/vda1
> Couldn't find device with uuid zjJVEj-VIKe-oe0Z-W1CF-edfj-16n2-oiAyID.
> Physical volume "/dev/vda1" successfully created
>
> vgcfgrestore vg_data
> Restored volume group vg_data
>
> lvchange --syncaction repair vg_data/lv_r10
> Unable to send message to an inactive logical volume.
>
> dmesg | tail
> -------------
> [ 374.959535] device-mapper: raid: Failed to read superblock of device at position 0
> [ 374.959577] device-mapper: raid: New device injected into existing array without 'rebuild' parameter specified
> [ 374.959621] device-mapper: table: 252:10: raid: Unable to assemble array: Invalid superblocks
> [ 374.959656] device-mapper: ioctl: error adding target to table
>
> ### I gave another try to my previous procedure with "lvchange --resync vg_data/lv_r10", but this again destroyed the file system, even now with the newer LVM version.
> ### Also the test with lvconvert --repair produced the same result as before.
>
> Please advise.
>
> Pavlik
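Condensed, the replacement procedure shown above is the short sequence below (device and VG names are illustrative). The final vgreduce step is not in the transcript above; it is the usual way to drop the stale "unknown device" entry once the repair has succeeded.

vgextend vgraid10 /dev/sdf                         # add the replacement PV to the VG
lvconvert --repair vgraid10/lv_r10 /dev/sdf        # rebuild the failed rimage/rmeta LVs onto it
vgreduce --removemissing vgraid10                  # assumed follow-up: remove the missing PV record
lvs -o seg_pe_ranges,lv_name,stripes -a vgraid10   # confirm the new layout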
* Re: [linux-lvm] Raid 10 - recovery after a disk failure
From: Pavlik Kirilov @ 2016-02-02 23:20 UTC
To: LVM general discussion and development

Thank you for helping me out with this. I am glad the issue was fixed in the newer version of LVM.

Regards,
Pavlik