linux-lvm.redhat.com archive mirror
* [linux-lvm] harddisk dies while pvmove is in progress
@ 2011-09-12 20:30 Robert Schöftner
  2011-09-12 21:08 ` Stuart D. Gathman
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Schöftner @ 2011-09-12 20:30 UTC (permalink / raw)
  To: linux-lvm

hi!

i recently noticed some read errors on one of my "storage" 
harddisks. layout was 1*2tb + 1*1,5tb in a linear 
mapping, 1 VG (storage_1), 2 LVs (shares, homevideo). the 1,5tb 
harddisk is the one that failed.

so i ordered some disks, built a raid5 out of 4 2tb disks, and added it 
to storage_1. I pvmoved the 2tb disk to the new raid. then i tried to 
move the failed 1,5tb disk to the new raid. this process stopped when 
pvmove was at about 50%, when the drive went completely dead. server was 
rebooted.

it seems pvmove was able to move everything of LV shares to the raid.

situation as of now:

* LVs fail to activate (needed lvchange -a y --partial)
* after activation, homevideo is completely dead, shares seems to work 
without any problems (though it generates lots of "device-mapper: raid1: 
Unable to read primary mirror during recovery" messages)
* the pvmove percentage counter keeps counting, but i'm unsure whether any 
actual disk activity happens
* pvmove complains:

# pvmove
   /dev/dm-4: read failed after 0 of 4096 at 0: Input/output error
   /dev/dm-4: read failed after 0 of 4096 at 971224580096: Input/output error
   /dev/dm-4: read failed after 0 of 4096 at 971224637440: Input/output error
   /dev/dm-4: read failed after 0 of 4096 at 4096: Input/output error
   /dev/dm-5: read failed after 0 of 4096 at 1500298280960: Input/output error
   /dev/dm-5: read failed after 0 of 4096 at 1500298338304: Input/output error
   Couldn't find device with uuid u1F6AW-pvCR-gM1c-O1c5-oWa0-s1Nd-oKTkf5.
   Cannot change VG storage_1 while PVs are missing.
   Consider vgreduce --removemissing.


# lvs -a -o +devices storage_1
   /dev/dm-4: read failed after 0 of 4096 at 0: Input/output error
   /dev/dm-4: read failed after 0 of 4096 at 971224580096: Input/output error
   /dev/dm-4: read failed after 0 of 4096 at 971224637440: Input/output error
   /dev/dm-4: read failed after 0 of 4096 at 4096: Input/output error
   /dev/dm-5: read failed after 0 of 4096 at 1500298280960: Input/output error
   /dev/dm-5: read failed after 0 of 4096 at 1500298338304: Input/output error
   Couldn't find device with uuid u1F6AW-pvCR-gM1c-O1c5-oWa0-s1Nd-oKTkf5.
   LV        VG        Attr   LSize Origin Snap%  Move           Log Copy%  Convert Devices
   homevideo storage_1 -wI-a- 1,08t                                                 pvmove0(126141)
   homevideo storage_1 -wI-a- 1,08t                                                 /dev/md127(553732)
   [pvmove0] storage_1 p-C-ao 1,36t               unknown device     38,07          unknown device(0),/dev/md127(604932)
   [pvmove0] storage_1 p-C-ao 1,36t               unknown device     38,07          unknown device(126141),/dev/md127(731073)
   shares    storage_1 -wI-a- 2,59t                                                 /dev/md127(128000)
   shares    storage_1 -wI-a- 2,59t                                                 pvmove0(0)
   shares    storage_1 -wI-a- 2,59t                                                 /dev/md127(485699)
   shares    storage_1 -wI-a- 2,59t                                                 /dev/md127(0)

# dmsetup table
storage_1-pvmove0-missing_1_0: 0 1896923136 error
storage_1-shares: 0 2930270208 linear 9:127 1048579072
storage_1-shares: 2930270208 1033347072 linear 252:5 0
storage_1-shares: 3963617280 557326336 linear 9:127 3978849280
storage_1-shares: 4520943616 1048576000 linear 9:127 3072
storage_1-homevideo: 0 1896923136 linear 252:5 1033347072
storage_1-homevideo: 1896923136 419430400 linear 9:127 4536175616

# dmsetup info -c
storage_1-pvmove0             252   5 L--w    2    2      0 LVM-ONKgAr1yc19fIeDyMClPFjDv92vxDEAFyOZtSj8iJexF2g6PMNRff3rYytIZ5RYJ
storage_1-pvmove0-missing_1_0 252   4 L--w    1    1      0 LVM-ONKgAr1yc19fIeDyMClPFjDv92vxDEAFyOZtSj8iJexF2g6PMNRff3rYytIZ5RYJ-missing_1_0
storage_1-shares              252   6 L--w    0    4      0 LVM-ONKgAr1yc19fIeDyMClPFjDv92vxDEAFvbEPv7JJKoctUb2kNtGqjFrfsrHiS0v3
storage_1-homevideo           252   7 L--w    0    2      0 LVM-ONKgAr1yc19fIeDyMClPFjDv92vxDEAFwFm52QuLABa7OKlHdYp3yEe8wxp6K07I

# dmsetup status
storage_1-pvmove0: 0 1033347072 linear
storage_1-pvmove0: 1033347072 1896923136 mirror 2 252:4 9:127 89145/1852464 1 SA 1 core
storage_1-pvmove0-missing_1_0: 0 1896923136 error
storage_1-shares: 0 2930270208 linear
storage_1-shares: 2930270208 1033347072 linear
storage_1-shares: 3963617280 557326336 linear
storage_1-shares: 4520943616 1048576000 linear
storage_1-homevideo: 0 1896923136 linear
storage_1-homevideo: 1896923136 419430400 linear

[irrelevant VGs removed from output]
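
As a cross-check of the tables above, here is a minimal Python sketch (an
editorial addition, not part of the original post) that works out which
pvmove0 segment each LV depends on. The numbers are copied from the dmsetup
output; the function and variable names are made up for illustration. It only
does interval arithmetic in 512-byte sectors and never talks to LVM or
device-mapper.

```python
# Values copied from the "dmsetup info -c", "dmsetup status" and
# "dmsetup table" output above. Unit: 512-byte sectors.
PVMOVE0_DEV = "252:5"  # major:minor of storage_1-pvmove0

# (start, length, target type) of each pvmove0 segment ("dmsetup status")
PVMOVE0_SEGMENTS = [
    (0, 1033347072, "linear"),
    (1033347072, 1896923136, "mirror"),
]

# (lv, start, length, backing device, offset on that device) ("dmsetup table")
LV_SEGMENTS = [
    ("shares", 0, 2930270208, "9:127", 1048579072),
    ("shares", 2930270208, 1033347072, "252:5", 0),
    ("shares", 3963617280, 557326336, "9:127", 3978849280),
    ("shares", 4520943616, 1048576000, "9:127", 3072),
    ("homevideo", 0, 1896923136, "252:5", 1033347072),
    ("homevideo", 1896923136, 419430400, "9:127", 4536175616),
]


def pvmove_dependencies(lv_segments, pvmove_segments, pvmove_dev):
    """Return (lv, target type) for every pvmove segment an LV overlaps."""
    found = []
    for lv, _start, length, dev, offset in lv_segments:
        if dev != pvmove_dev:
            continue  # segment lives directly on a PV, not on pvmove0
        lo, hi = offset, offset + length
        for seg_start, seg_len, seg_type in pvmove_segments:
            # half-open interval overlap test
            if lo < seg_start + seg_len and hi > seg_start:
                found.append((lv, seg_type))
    return found


deps = pvmove_dependencies(LV_SEGMENTS, PVMOVE0_SEGMENTS, PVMOVE0_DEV)
for lv, seg_type in deps:
    print(f"{lv} depends on a {seg_type} segment of pvmove0")
```

With these inputs, shares only touches the part of pvmove0 that is a plain
linear target (presumably the part of the move that had already finished),
while homevideo sits entirely on the mirror segment whose source leg is the
missing disk.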

So, LV homevideo is definitely lost, but shares seems to be OK, modulo 
the "running"/interrupted pvmove. Is there a way to clean up this mess 
without creating another LV and copying the contents of shares over?

thanx
   Robert


* Re: [linux-lvm] harddisk dies while pvmove is in progress
  2011-09-12 20:30 [linux-lvm] harddisk dies while pvmove is in progress Robert Schöftner
@ 2011-09-12 21:08 ` Stuart D. Gathman
  2011-09-12 22:15   ` Robert Schöftner
  0 siblings, 1 reply; 6+ messages in thread
From: Stuart D. Gathman @ 2011-09-12 21:08 UTC (permalink / raw)
  To: LVM general discussion and development

On Mon, 12 Sep 2011, Robert Schöftner wrote:

> So, LV homevideo is definitely lost, but shares seems to be OK, modulo the 
> "running"/interrupted pvmove. Is there a way to clean up this mess without 
> creating another LV and copying the contents of shares over?

Remove the missing PV as the error message suggested.  Unless you
have some hope of resurrecting it (has had success with connecting
deceased drive via USB and putting in freezer to run it long enough
to recover a little more data), it is gone now.

--
 	      Stuart D. Gathman <stuart@bmsi.com>
     Business Management Systems Inc.  Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.


* Re: [linux-lvm] harddisk dies while pvmove is in progress
  2011-09-12 21:08 ` Stuart D. Gathman
@ 2011-09-12 22:15   ` Robert Schöftner
  2011-09-14  7:23     ` Robert Schöftner
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Schöftner @ 2011-09-12 22:15 UTC (permalink / raw)
  To: linux-lvm

On 2011-09-12 23:08, Stuart D. Gathman wrote:
> On Mon, 12 Sep 2011, Robert Schöftner wrote:
>
>> So, LV homevideo is definitely lost, but shares seems to be OK, 
>> modulo the "running"/interrupted pvmove. Is there a way to clean up 
>> this mess without creating another LV and copying the contents of 
>> shares over?
>
> Remove the missing PV as the error message suggested.  Unless you
> have some hope of resurrecting it (has had success with connecting
> deceased drive via USB and putting in freezer to run it long enough
> to recover a little more data), it is gone now.

there is no hope of resurrecting the drive. it spins up, but doesn't 
show up, neither via direct sata connection nor via a sata-usb bridge. 
and i already sent it away for warranty replacement.

my interpretation of the man page is that vgreduce --removemissing 
--force would remove both LVs, even the "good" one, which is what i want 
to avoid. my idea would be something like manually breaking the mirror 
from the raid to the missing harddisk, at least for the part that 
belongs to the LV "shares".

my plan is to edit the latest archived metadata, remove the 
pvmove mirror, exchange the segment pointing to the missing device with 
the mirrored segment, and vgcfgrestore it, if no better idea comes up. 
the saved metadata confirms that the segment belonging to LV "shares" 
was completely mirrored before the harddisk died, so all the needed data 
is there.
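
For anyone attempting the same metadata edit, the extent arithmetic can be
sanity-checked before restoring anything. The following is a minimal Python
sketch (an editorial addition), assuming an extent size of 8192 sectors
(4 MiB) and a data offset of 3072 sectors on /dev/md127. Both values are
inferred from the lvs and dmsetup output earlier in the thread, not stated
there, and the helper names are made up. It predicts the device-mapper line
that the "shares" segment currently shown as pvmove0(0) should get if it is
pointed at /dev/md127(604932) instead.

```python
# Assumptions inferred from the posted output, not confirmed by the author:
SECTORS_PER_EXTENT = 8192  # 1033347072 sectors / 126141 extents
PE_START = 3072            # /dev/md127(0) appears at sector 3072 of 9:127

# (extent on /dev/md127 from lvs) -> (sector on 9:127 from dmsetup table)
KNOWN_PAIRS = {
    0: 3072,
    128000: 1048579072,
    485699: 3978849280,
    553732: 4536175616,
}


def extent_to_sector(extent):
    """Translate a physical extent number into a 512-byte sector offset."""
    return PE_START + extent * SECTORS_PER_EXTENT


# The inferred constants must reproduce every pair visible in the output.
for extent, sector in KNOWN_PAIRS.items():
    assert extent_to_sector(extent) == sector

# The first pvmove0 segment is 126141 extents long and, per lvs, its
# destination starts at /dev/md127(604932).
seg_extents = 126141
seg_sectors = seg_extents * SECTORS_PER_EXTENT
assert seg_sectors == 1033347072  # matches the length in dmsetup table

lv_start = 2930270208  # where this segment begins inside "shares"
new_line = f"{lv_start} {seg_sectors} linear 9:127 {extent_to_sector(604932)}"
print("storage_1-shares:", new_line)
```

If dmsetup table shows a different offset for that segment after the
vgcfgrestore, the edited metadata does not point where intended and the LV
should not be mounted read-write.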

thx
Robert


* Re: [linux-lvm] harddisk dies while pvmove is in progress
  2011-09-12 22:15   ` Robert Schöftner
@ 2011-09-14  7:23     ` Robert Schöftner
  2011-09-14 16:55       ` Ray Morris
  2011-09-14 17:25       ` Stuart D. Gathman
  0 siblings, 2 replies; 6+ messages in thread
From: Robert Schöftner @ 2011-09-14  7:23 UTC (permalink / raw)
  To: LVM general discussion and development

On 2011-09-13 00:15, Robert Schöftner wrote:
> my plan is to edit the latest archived meta-data, remove the
> pvmove-mirror, exchange the segment pointing to the missing device
> with the mirrored segment, and vgcfgrestore it, if no better idea comes
> up. the saved metadata confirms that the segment belonging to LV
> "shares" was completely mirrored before the harddisk died, so all the
> needed data is there.
For the record: finally, I edited the last metadata-backup, put the
successfully mirrored segment on the raid into the LV (instead of the
pvmovesomething mirror), removed references to pvmove, the damaged PV
and the "homevideo" LV, vgcfgrestored the config, rebooted, and
everything works.

Robert


* Re: [linux-lvm] harddisk dies while pvmove is in progress
  2011-09-14  7:23     ` Robert Schöftner
@ 2011-09-14 16:55       ` Ray Morris
  2011-09-14 17:25       ` Stuart D. Gathman
  1 sibling, 0 replies; 6+ messages in thread
From: Ray Morris @ 2011-09-14 16:55 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: rmu

> For the record: finally, I edited the last metadata-backup, put the
> successfully mirrored segment on the raid into the LV (instead of the
> pvmovesomething mirror), removed references to pvmove, the damaged PV
> and the "homevideo" LV, vgcfgrestored the config, rebooted, and
> everything works.
> 
> Robert


It's good to know what kind of thing can actually work in practice.
I'm absent minded, the kind of guy who says "oops" a lot, so it's 
always good to know ways of recovering.
-- 
Ray Morris
support@bettercgi.com



* Re: [linux-lvm] harddisk dies while pvmove is in progress
  2011-09-14  7:23     ` Robert Schöftner
  2011-09-14 16:55       ` Ray Morris
@ 2011-09-14 17:25       ` Stuart D. Gathman
  1 sibling, 0 replies; 6+ messages in thread
From: Stuart D. Gathman @ 2011-09-14 17:25 UTC (permalink / raw)
  To: LVM general discussion and development

On Wed, 14 Sep 2011, Robert Schöftner wrote:

> On 2011-09-13 00:15, Robert Schöftner wrote:
>> my plan is to edit the latest archived meta-data, remove the
>> pvmove-mirror, exchange the segment pointing to the missing device
>> with the mirrored segment, and vgcfgrestore it, if no better idea comes
>> up. the saved metadata confirms that the segment belonging to LV
>> "shares" was completely mirrored before the harddisk died, so all the
>> needed data is there.
> For the record: finally, I edited the last metadata-backup, put the
> successfully mirrored segment on the raid into the LV (instead of the
> pvmovesomething mirror), removed references to pvmove, the damaged PV
> and the "homevideo" LV, vgcfgrestored the config, rebooted, and
> everything works.

That implies that there could be a pvmove "cleanup" utility that cleans
up a pvmove in progress that references a missing PV.

--
Stuart D. Gathman

