Message-ID: <507FDA2E.8080301@redhat.com>
Date: Thu, 18 Oct 2012 12:30:06 +0200
From: Zdenek Kabelac
To: LVM general discussion and development
Cc: Andres Toomsalu
Subject: Re: [linux-lvm] how to recover after thin pool metadata did fill up?

On 17.10.2012 22:21, Andres Toomsalu wrote:
> Hi,
>
> I'm aware that thin provisioning is not yet production ready (no metadata
> resize) - but is there a way to recover from a thin pool failure when the
> pool metadata has filled up?
>
> I set up a 1.95T thin pool, and after some usage the pool metadata (128MB)
> filled up to 99,08% - so all thin volumes in the pool went into a read-only
> state. The problem is that I cannot find a way to recover from this
> failure - I am also unable to delete/erase the thin volumes and the pool -
> the only option seems to be full disk PV re-creation (i.e. OS re-install).
>
> Is there a way to recover or delete the thin pool/volumes without erasing
> the other (normal) LVs in this Volume Group?
> For example, dmsetup remove didn't help.
>
> Some diagnostic output:
>
> lvs -a -o+metadata_percent
>   dm_report_object: report function failed for field data_percent
>   --- REPEATABLE MSG ---
>   dm_report_object: report function failed for field data_percent
>   LV                        VG         Attr     LSize   Pool Origin       Data%  Move Log Copy% Convert Meta%
>   pool                      VolGroupL0 twi-i-tz   1,95t                   75,28                         99,08
>   [pool_tdata]              VolGroupL0 Twi-aot-   1,95t
>   [pool_tmeta]              VolGroupL0 ewi-aot- 128,00m
>   root                      VolGroupL0 -wi-ao--  10,00g
>   swap                      VolGroupL0 -wi-ao--  16,00g
>   thin_backup               VolGroupL0 Vwi-i-tz 700,00g pool
>   thin_storage              VolGroupL0 Vwi---tz 900,00g pool
>   thin_storage-snapshot1    VolGroupL0 Vwi-i-tz 700,00g pool thin_storage
>   thin_storage-snapshot106  VolGroupL0 Vwi-i-tz 900,00g pool thin_storage
>   thin_storage-snapshot130  VolGroupL0 Vwi-i-tz 900,00g pool thin_storage
>   thin_storage-snapshot154  VolGroupL0 Vwi-i-tz 900,00g pool thin_storage
>   thin_storage-snapshot178  VolGroupL0 Vwi-i-tz 900,00g pool thin_storage
>   thin_storage-snapshot2    VolGroupL0 Vwi-i-tz 700,00g pool thin_storage
>   thin_storage-snapshot202  VolGroupL0 Vwi-i-tz 900,00g pool thin_storage
>
> dmsetup table
> VolGroupL0-thin_storage--snapshot2:
> VolGroupL0-thin_storage--snapshot178:
> VolGroupL0-swap: 0 33554432 linear 8:2 41945088
> VolGroupL0-thin_storage--snapshot1:
> VolGroupL0-root: 0 20971520 linear 8:2 20973568
> VolGroupL0-thin_storage--snapshot130:
> VolGroupL0-pool:
> VolGroupL0-thin_backup:
> VolGroupL0-thin_storage--snapshot106:
> VolGroupL0-thin_storage--snapshot154:
> VolGroupL0-pool-tpool: 0 4194304000 thin-pool 253:2 253:3 1024 0 0
> VolGroupL0-pool_tdata: 0 2097152000 linear 8:2 75499520
> VolGroupL0-pool_tdata: 2097152000 2097152000 linear 8:2 2172913664
> VolGroupL0-pool_tmeta: 0 262144 linear 8:2 2172651520
> VolGroupL0-thin_storage--snapshot202:
>
> lvremove -f VolGroupL0/pool
>   Thin pool transaction_id=640, while expected: 643.
>   Unable to deactivate open VolGroupL0-pool_tdata (253:3)
>   Unable to deactivate open VolGroupL0-pool_tmeta (253:2)
>   Failed to deactivate VolGroupL0-pool-tpool
>   Failed to resume pool.
>   Failed to update thin pool pool.
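The Meta% value above is the key number here - with the 128MB pool_tmeta
device at 99,08% the pool has effectively run out of metadata space, which is
why all the thin volumes were switched to read-only. For watching other pools
before they reach this point, the same fields can also be queried directly;
a small example (field names as understood by a reasonably recent lvm2):

  lvs -o lv_name,lv_size,data_percent,metadata_percent VolGroupL0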
Unfortunately there is no 'easy' advice for this yet - you have hit the
current Achilles heel of thin-provisioning support in lvm2. We are thinking
about how to make recovery usable for the user, but it is not an easy task -
many things make it very complex - so it will still need some months of work.

For now you would need to download the git source and disable a certain
safety check directly in the source to allow activation of the damaged pool.
I could probably provide you with an extra patch that allows activating the
thin pool in 'read-only' mode (it is not yet ready for upstream). With that
you should be able to access 'some' of your data, with one big caveat: since
the devices would be read-only, you cannot even run fsck on such a partition.

Also, since you have a transaction_id mismatch, it will need some analysis to
see what actually happened - are you able to send me the archive files with
the history of your recent lvm commands? The difference should never be able
to grow bigger than 1, so there is probably a bug somewhere (unless you are
using some old version of lvm2 with the initial thinp support).

Zdenek
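PS: the archive files lvm2 keeps under /etc/lvm/archive (one *.vg file per
metadata change, with the command that created it recorded in its
"description" field) are what I am after - assuming a stock configuration,
something along these lines should collect everything needed (the exact file
names will differ on your system):

  grep -H description /etc/lvm/archive/VolGroupL0_*.vg
  tar czf lvm-metadata.tar.gz /etc/lvm/archive /etc/lvm/backup

And if you want to have a first look at the pool metadata yourself, the
thin_check/thin_dump tools from thin-provisioning-tools (packaged as
device-mapper-persistent-data on Fedora/RHEL) can read it - safest is to work
on a copy, so the live pool_tmeta device is never written to (the paths below
are only examples):

  dd if=/dev/mapper/VolGroupL0-pool_tmeta of=/root/pool_tmeta.img bs=1M
  thin_check /root/pool_tmeta.img
  thin_dump /root/pool_tmeta.img > /root/pool_tmeta.xml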