From: Zdenek Kabelac <zkabelac@redhat.com>
To: LVM general discussion and development <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] Testing ThinLVM metadata exhaustion
Date: Fri, 22 Apr 2016 16:04:28 +0200
Message-ID: <571A2F6C.6050006@redhat.com>
In-Reply-To: <dede65b769d68f0988811f89fb10fdcb@assyoma.it>

On 22.4.2016 15:12, Gionatan Danti wrote:
> On 18-04-2016 16:25 Gionatan Danti wrote:
>> Hi all,
>> I'm testing the various metadata exhaustion cases and how to cope with
>> them. Specifically, I would like to fully understand what to expect
>> after metadata space exhaustion and the subsequent check/repair. To
>> that end, metadata autoresize is disabled.
>>
>> I'm using a fully updated CentOS 6.7 x86_64 virtual machine, with a
>> virtual disk (vdb) dedicated to the thin pool / volumes. This is what
>> pvs reports:
>>
>> PV         VG          Fmt  Attr PSize  PFree
>> /dev/vda2  vg_hvmaster lvm2 a--  63.51g    0
>> /dev/vdb   vgtest      lvm2 a--  32.00g    0
>>
>> I did the following operations:
>> vgcreate vgtest /dev/vdb
>> lvcreate --thin vgtest/ThinPool -L 1G     # 4MB tmeta
>> lvchange -Zn vgtest
>> lvcreate --thin vgtest/ThinPool --name ThinVol -V 32G
>> lvresize vgtest/ThinPool -l +100%FREE # 31.99GB, 4MB tmeta, not resized
>>
>> With 64 KB chunks, the 4 MB tmeta volume is good for mapping ~8 GB, so
>> any further write triggers metadata space exhaustion. Then, I did:
>>
>> a) a first 8 GB write to almost fill the entire metadata space:
>> [root@hvmaster ~]# dd if=/dev/zero of=/dev/vgtest/ThinVol bs=1M count=8192
>> 8192+0 records in
>> 8192+0 records out
>> 8589934592 bytes (8.6 GB) copied, 101.059 s, 85.0 MB/s
>> [root@hvmaster ~]# lvs -a
>>   LV               VG          Attr       LSize  Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
>>   lv_root          vg_hvmaster -wi-ao---- 59.57g
>>   lv_swap          vg_hvmaster -wi-ao----  3.94g
>>   ThinPool         vgtest      twi-aot-M- 31.99g                 21.51  92.09
>>   [ThinPool_tdata] vgtest      Twi-ao---- 31.99g
>>   [ThinPool_tmeta] vgtest      ewi-ao----  4.00m
>>   ThinVol          vgtest      Vwi-a-t--- 32.00g ThinPool        23.26
>>   [lvol0_pmspare]  vgtest      ewi-------  4.00m
>> [root@hvmaster ~]# thin_dump /dev/mapper/vgtest-ThinPool_tmeta
>> <superblock uuid="" time="0" transaction="1" data_block_size="128" nr_data_blocks="524096">
>>   <device dev_id="1" mapped_blocks="121968" transaction="0" creation_time="0" snap_time="0">
>>     <range_mapping origin_begin="0" data_begin="0" length="121968" time="0"/>
>>   </device>
>> </superblock>
>>
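The thin_dump figures above can be turned into byte counts as a rough cross-check. The following is only a sketch in Python: it assumes data_block_size is expressed in 512-byte sectors (which agrees with the 64 KB chunk size mentioned in the post), and the `summarize` helper and the `THIN_DUMP` constant are hypothetical names introduced for this illustration, not part of any LVM tool.

```python
import xml.etree.ElementTree as ET

SECTOR_BYTES = 512  # assumption: data_block_size is given in 512-byte sectors

# thin_dump output quoted above, with the wrapped lines joined
THIN_DUMP = """\
<superblock uuid="" time="0" transaction="1" data_block_size="128" nr_data_blocks="524096">
  <device dev_id="1" mapped_blocks="121968" transaction="0" creation_time="0" snap_time="0">
    <range_mapping origin_begin="0" data_begin="0" length="121968" time="0"/>
  </device>
</superblock>
"""


def summarize(xml_text):
    """Return (chunk size, pool data size, mapped bytes per thin device), all in bytes."""
    root = ET.fromstring(xml_text)
    chunk_bytes = int(root.get("data_block_size")) * SECTOR_BYTES
    pool_bytes = int(root.get("nr_data_blocks")) * chunk_bytes
    mapped = {
        dev.get("dev_id"): int(dev.get("mapped_blocks")) * chunk_bytes
        for dev in root.iter("device")
    }
    return chunk_bytes, pool_bytes, mapped


chunk, pool, mapped = summarize(THIN_DUMP)
print("chunk size :", chunk // 1024, "KiB")
print("pool data  : %.2f GiB" % (pool / 2**30))
for dev_id, nbytes in mapped.items():
    print("device %s   : %.2f GiB mapped" % (dev_id, nbytes / 2**30))
```

Under these assumptions the single thin device has roughly 7.44 GiB mapped out of a pool of about 31.99 GiB, which is in line with the 23.26 Data% reported for ThinVol.
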
>> b) a second non-synched 16 GB write to totally trash the tmeta volume:
>> # Second write
>> [root@hvmaster ~]# dd if=/dev/zero of=/dev/vgtest/ThinVol bs=1M count=8192
>> 8192+0 records in
>> 8192+0 records out
>> 8589934592 bytes (8.6 GB) copied, 101.059 s, 85.0 MB/s
>> [root@hvmaster ~]# lvs -a
>>   LV               VG          Attr       LSize  Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
>>   lv_root          vg_hvmaster -wi-ao---- 59.57g
>>   lv_swap          vg_hvmaster -wi-ao----  3.94g
>>   ThinPool         vgtest      twi-aot-M- 31.99g                 21.51  92.09
>>   [ThinPool_tdata] vgtest      Twi-ao---- 31.99g
>>   [ThinPool_tmeta] vgtest      ewi-ao----  4.00m
>>   ThinVol          vgtest      Vwi-a-t--- 32.00g ThinPool        23.26
>>   [lvol0_pmspare]  vgtest      ewi-------  4.00m
>> [root@hvmaster ~]# thin_dump /dev/mapper/vgtest-ThinPool_tmeta
>> <superblock uuid="" time="0" transaction="1" data_block_size="128" nr_data_blocks="524096">
>>   <device dev_id="1" mapped_blocks="121968" transaction="0" creation_time="0" snap_time="0">
>>     <range_mapping origin_begin="0" data_begin="0" length="121968" time="0"/>
>>   </device>
>> </superblock>
>>
>> c) a third, synched 16 GB write to see how the system behaves with
>> fsync-rich filling:
>> [root@hvmaster ~]# dd if=/dev/zero of=/dev/vgtest/ThinVol bs=1M count=16384 oflag=sync
>> dd: writing `/dev/vgtest/ThinVol': Input/output error
>> 7624+0 records in
>> 7623+0 records out
>> 7993294848 bytes (8.0 GB) copied, 215.808 s, 37.0 MB/s
>> [root@hvmaster ~]# lvs -a
>>   Failed to parse thin params: Error.
>>   Failed to parse thin params: Error.
>>   Failed to parse thin params: Error.
>>   Failed to parse thin params: Error.
>>   LV               VG          Attr       LSize  Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
>>   lv_root          vg_hvmaster -wi-ao---- 59.57g
>>   lv_swap          vg_hvmaster -wi-ao----  3.94g
>>   ThinPool         vgtest      twi-aot-M- 31.99g                 21.51  92.09
>>   [ThinPool_tdata] vgtest      Twi-ao---- 31.99g
>>   [ThinPool_tmeta] vgtest      ewi-ao----  4.00m
>>   ThinVol          vgtest      Vwi-a-t--- 32.00g ThinPool
>>   [lvol0_pmspare]  vgtest      ewi-------  4.00m
>> [root@hvmaster ~]# thin_dump /dev/mapper/vgtest-ThinPool_tmeta
>> <superblock uuid="" time="0" transaction="1" data_block_size="128" nr_data_blocks="524096">
>> metadata contains errors (run thin_check for details).
>> perhaps you wanted to run with --repair
>>
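The byte count dd prints before the I/O error in (c) can be compared with the mapped_blocks value from the thin_dump in (a). A small Python sketch of that arithmetic follows; the constant names are made up for this example, and the 512-byte sector unit for data_block_size is an assumption.

```python
MIB = 1024 * 1024
CHUNK_BYTES = 128 * 512   # data_block_size="128", assumed to be 512-byte sectors
MAPPED_BLOCKS = 121968    # mapped_blocks from the thin_dump in step (a)

records_out = 7623        # complete 1 MiB records dd reported in step (c)
bytes_written = records_out * MIB
chunks_written, remainder = divmod(bytes_written, CHUNK_BYTES)

print("bytes written :", bytes_written)
print("chunks written:", chunks_written, "(remainder %d)" % remainder)
print("equals mapped_blocks:", chunks_written == MAPPED_BLOCKS)
```

The product matches the 7993294848 bytes dd reports, and it corresponds to exactly 121968 chunks. One possible reading (not confirmed anywhere in this thread) is that the synced write succeeded while it was overwriting chunks that were already mapped, and failed on the first chunk that needed a new mapping.
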
>> It is the last scenario (c) that puzzles me: rebooting the machine left
>> the thin pool inactive and impossible to activate (as expected), but after
>> executing lvconvert --repair I can see that _all_ the metadata is gone (the
>> pool seems empty). Is that the expected behavior?
>>
>> Even more puzzling (for me) is that by skipping tests a and b, and
>> going directly to c, I get a different behavior: the metadata volume
>> is (rightfully) completely filled, and the thin pool goes into read-only
>> mode. Again, is that the expected behavior?
>>
>> Regards.
>
> Hi all,
> doing more tests I noticed that when "catastrophic" (non recoverable) metadata
> loss happens, dmesg logs the following lines:
>
> device-mapper: block manager: validator mismatch (old=sm_bitmap vs new=btree_node) for block 429
> device-mapper: space map common: unable to decrement a reference count below 0
> device-mapper: thin: 253:4: metadata operation 'dm_thin_insert_block' failed: error = -22
>
> During "normal" metadata exhaustion (when the pool can recover), the first two
> lines are not logged at all. Moreover, the third line reports error = -28,
> rather than error = -22 as above.
>
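The two error values quoted above are negated errno codes, so they can be decoded with the Python standard library. This is only an illustrative sketch; `describe` is a hypothetical helper name, and the message text returned by os.strerror depends on the platform.

```python
import errno
import os


def describe(kernel_error):
    """Map a negative kernel return value to its errno name and message."""
    code = abs(kernel_error)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


for value in (-22, -28):
    name, text = describe(value)
    print("error = %d -> %s (%s)" % (value, name, text))
```

On Linux, -28 decodes to ENOSPC, which fits a plain out-of-space condition on the metadata device, while -22 decodes to EINVAL, which together with the validator mismatch line suggests the metadata itself was found to be inconsistent.
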
> I also tested the latest RHEL 7.2 and I cannot reproduce the error above:
> metadata exhaustion always seems to be managed in a graceful (i.e. recoverable)
> manner.
>
> Am I missing something?


I assume you are missing a newer kernel.
This bug was originally present in older kernels.

Regards

Zdenek
