From: Zdenek Kabelac <zkabelac@redhat.com>
To: LVM general discussion and development <linux-lvm@redhat.com>,
Gionatan Danti <g.danti@assyoma.it>
Subject: Re: [linux-lvm] Possible bug in thin metadata size with Linux MDRAID
Date: Mon, 20 Mar 2017 10:51:55 +0100 [thread overview]
Message-ID: <7a5221fc-6c6c-8af3-91ec-66fe68b1cdde@redhat.com> (raw)
In-Reply-To: <0a3aa51f-ef59-9330-2b74-0891ebf00c51@assyoma.it>
Dne 20.3.2017 v 10:47 Gionatan Danti napsal(a):
> Hi all,
> any comments on the report below?
>
> Thanks.
Please check the upstream behavior (git HEAD).
It will still take a while before the final release, so do not use it
for regular work yet (a few things may still change).
I am not sure what other comment you are looking for.
Zdenek
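
For anyone wanting to try the suggested git HEAD without installing it
system-wide, a minimal sketch follows; the sourceware clone URL and the
in-tree ./tools/lvm path are assumptions about the usual lvm2 build, not
something stated in this thread:

  # Clone and build lvm2 from git HEAD for testing only (URL assumed);
  # if the configure script is missing, regenerate it with autoconf.
  git clone git://sourceware.org/git/lvm2.git
  cd lvm2
  ./configure        # add options to match your distribution's build
  make
  # run the freshly built binary straight from the build tree
  ./tools/lvm version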
>
> On 09/03/2017 16:33, Gionatan Danti wrote:
>> On 09/03/2017 12:53, Zdenek Kabelac wrote:
>>>
>>> Hmm - it would be interesting to see your 'metadata' - 128M of
>>> metadata should still be quite a good fit for 512G when you are not
>>> using snapshots.
>>>
>>> What's been your actual test scenario ?? (Lots of LVs??)
>>>
>>
>> Nothing unusual - I had a single thinvol with an XFS filesystem used to
>> store an HDD image gathered using ddrescue.
>>
>> Anyway, are you sure that a 128 MB metadata volume is "quite good" for a
>> 512GB volume with 128 KB chunks? My testing suggests something
>> different. For example, take a look at this empty thinpool/thinvol:
>>
>> [root@gdanti-laptop test]# lvs -a -o +chunk_size
>>   LV               VG        Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert Chunk
>>   [lvol0_pmspare]  vg_kvm    ewi------- 128.00m                                                              0
>>   thinpool         vg_kvm    twi-aotz-- 500.00g                   0.00   0.81                          128.00k
>>   [thinpool_tdata] vg_kvm    Twi-ao---- 500.00g                                                              0
>>   [thinpool_tmeta] vg_kvm    ewi-ao---- 128.00m                                                              0
>>   thinvol          vg_kvm    Vwi-a-tz-- 500.00g thinpool          0.00                                       0
>>   root             vg_system -wi-ao----  50.00g                                                              0
>>   swap             vg_system -wi-ao----   3.75g                                                              0
>>
>> As you can see, since it is an empty volume, metadata is at only 0.81%.
>> Let's write 5 GB (1% of the thin data volume):
>>
>> [root@gdanti-laptop test]# lvs -a -o +chunk_size
>>   LV               VG        Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert Chunk
>>   [lvol0_pmspare]  vg_kvm    ewi------- 128.00m                                                              0
>>   thinpool         vg_kvm    twi-aotz-- 500.00g                   1.00   1.80                          128.00k
>>   [thinpool_tdata] vg_kvm    Twi-ao---- 500.00g                                                              0
>>   [thinpool_tmeta] vg_kvm    ewi-ao---- 128.00m                                                              0
>>   thinvol          vg_kvm    Vwi-a-tz-- 500.00g thinpool          1.00                                       0
>>   root             vg_system -wi-ao----  50.00g                                                              0
>>   swap             vg_system -wi-ao----   3.75g                                                              0
>>
>> Metadata grew by the same 1%. Accounting for the initial 0.81%
>> utilization, this means that a near-full data volume (with *no*
>> overprovisioning and no snapshots) will exhaust its metadata *before*
>> really becoming 100% full.
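
A convenient way to watch this ratio directly is to report only the data
and metadata fill of the pool; the VG and pool names below are taken from
the lvs output above:

  # Show just data vs. metadata utilization for the pool in question
  lvs -o lv_name,data_percent,metadata_percent vg_kvm/thinpool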
>>
>> While I can absolutely understand that this is expected behavior when
>> using snapshots and/or overprovisioning, in this extremely simple case
>> metadata should not be exhausted before data. In other words, the
>> initial metadata creation process should *at least* consider that a
>> plain volume can become 100% full, and allocate accordingly.
>>
>> The interesting part is that when not using MD, everything works
>> properly: metadata is about 2x its minimal value (as reported by
>> thin_metadata_size), and this provides an ample buffer for
>> snapshotting/overprovisioning. When using MD, the bad interaction between
>> RAID chunks and thin metadata chunks ends with a too-small metadata
>> volume.
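
Since thin_metadata_size is mentioned above, here is a sketch of how the
minimal metadata size can be estimated for the two chunk sizes discussed
in this thread (check thin_metadata_size --help on your version for the
exact option spellings; no output numbers are reproduced here):

  # Estimate minimum metadata for a 500 GiB pool holding one thin volume
  thin_metadata_size --block-size 128k --pool-size 500g --max-thins 1 --unit m
  thin_metadata_size --block-size 64k  --pool-size 500g --max-thins 1 --unit m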
>>
>> This can become very bad. Take a look at what happens when creating a
>> thin pool on an MD RAID whose chunks are 64 KB:
>>
>> [root@gdanti-laptop test]# lvs -a -o +chunk_size
>>   LV               VG        Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert Chunk
>>   [lvol0_pmspare]  vg_kvm    ewi------- 128.00m                                                              0
>>   thinpool         vg_kvm    twi-a-tz-- 500.00g                   0.00   1.58                           64.00k
>>   [thinpool_tdata] vg_kvm    Twi-ao---- 500.00g                                                              0
>>   [thinpool_tmeta] vg_kvm    ewi-ao---- 128.00m                                                              0
>>   root             vg_system -wi-ao----  50.00g                                                              0
>>   swap             vg_system -wi-ao----   3.75g                                                              0
>>
>> The thin pool chunk size is now 64 KB - with the *same* 128 MB metadata
>> volume size. Now the metadata can only address ~50% of the thin volume's
>> space.
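
The ~50% figure follows from simple chunk counting; a quick sketch of the
arithmetic (only relative chunk counts are computed here, not the exact
per-mapping metadata cost):

  # 500 GiB of data split into pool chunks
  echo $((500 * 1024 * 1024 / 128))   # 128 KiB chunks -> 4096000 mappings
  echo $((500 * 1024 * 1024 / 64))    #  64 KiB chunks -> 8192000 mappings
  # Twice as many mappings must fit into the same fixed 128 MiB tmeta LV,
  # so only about half of the data volume can be mapped before the
  # metadata space runs out.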
>>
>>> But as said - there is no guarantee that the size fits every possible
>>> use case - the user is supposed to understand what kind of technology
>>> he is using, and when he opts out of automatic resize, he needs to
>>> deploy his own monitoring.
>>
>> True, but this trivial case should really work without
>> tuning/monitoring. In short, let it fail gracefully in a simple case...
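
For reference, the 'automatic resize' opt-out mentioned above refers to
the dmeventd-driven thin pool autoextension settings in lvm.conf; a
minimal sketch, with example values rather than recommendations:

  # /etc/lvm/lvm.conf, activation section: autoextend the thin pool when
  # it reaches 70% full, growing it by 20% each time (threshold = 100
  # disables autoextension).
  activation {
      thin_pool_autoextend_threshold = 70
      thin_pool_autoextend_percent = 20
  }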
>>>
>>> Otherwise you would simply have to always create a 16G metadata LV if
>>> you do not want to run out of metadata space.
>>>
>>>
>>
>> Absolutely true. I've written this email to report a bug, indeed ;)
>> Thank you all for this outstanding work.
>>
>
Thread overview: 12+ messages
2017-03-08 16:14 [linux-lvm] Possible bug in thin metadata size with Linux MDRAID Gionatan Danti
2017-03-08 18:55 ` Zdenek Kabelac
2017-03-09 11:24 ` Gionatan Danti
2017-03-09 11:53 ` Zdenek Kabelac
2017-03-09 15:33 ` Gionatan Danti
2017-03-20 9:47 ` Gionatan Danti
2017-03-20 9:51 ` Zdenek Kabelac [this message]
2017-03-20 10:45 ` Gionatan Danti
2017-03-20 11:01 ` Zdenek Kabelac
2017-03-20 11:52 ` Gionatan Danti
2017-03-20 13:57 ` Zdenek Kabelac
2017-03-20 14:25 ` Gionatan Danti