linux-lvm.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Zdenek Kabelac <zkabelac@redhat.com>
To: LVM general discussion and development <linux-lvm@redhat.com>,
	Gionatan Danti <g.danti@assyoma.it>
Cc: Xen <list@xenhideout.nl>
Subject: Re: [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM
Date: Wed, 28 Feb 2018 22:43:26 +0100	[thread overview]
Message-ID: <eea26091-40e7-8f31-e4ba-dfab059f3d70@redhat.com> (raw)
In-Reply-To: <9142007eeb745a0f4774710b7c007375@assyoma.it>

Dne 28.2.2018 v 20:07 Gionatan Danti napsal(a):
> Hi all,
> 
> Il 28-02-2018 10:26 Zdenek Kabelac ha scritto:
>> Overprovisioning on DEVICE level simply IS NOT equivalent to full
>> filesystem like you would like to see all the time here and you've
>> been already many times explained that filesystems are simply not
>> there ready - fixes are on going but it will take its time and it's
>> really pointless to exercise this on 2-3 year old kernels...
> 
> this was really beaten to death in the past months/threads. I generally agree 
> with Zedenk.
> 
> To recap (Zdeneck, correct me if I am wrong): the main problem is that, on a 
> full pool, async writes will more-or-less silenty fail (with errors shown on 
> dmesg, but nothing more). Another possible cause of problem is that, even on a 
> full pool, *some* writes will complete correctly (the one on already allocated 
> chunks).

On default - full pool starts to 'error' all 'writes' in 60 seconds.

> 
> In the past was argued that putting the entire pool in read-only mode (where 
> *all* writes fail, but read are permitted to complete) would be a better 
> fail-safe mechanism; however, it was stated that no current dmtarget permit that.

Yep - I'd probably like to see slightly different mechanism - that all
on going writes would be failing  - so far - some 'writes' will pass
(those to already provisioned areas) - some will fail (those to unprovisioned).

The main problem is - after reboot - this 'missing/unprovisioned' space may 
provide some old data...

> 
> Two (good) solution where given, both relying on scripting (see "thin_command" 
> option on lvm.conf):
> - fsfreeze on a nearly full pool (ie: >=98%);
> - replace the dmthinp target with the error target (using dmsetup).

Yep - this all can happen via 'monitoring.
The key is to do it early before disaster happens.

> I really think that with the good scripting infrastructure currently built in 
> lvm this is a more-or-less solved problem.

It still depends - there is always some sort of 'race' - unless you are 
willing to 'give-up' too early to be always sure, considering there are 
technologies that may write many GB/s...

>> Do NOT take thin snapshot of your root filesystem so you will avoid
>> thin-pool overprovisioning problem.
> 
> But is someone *really* pushing thinp for root filesystem? I always used it 

You can use rootfs with thinp - it's very fast for testing i.e. upgrades
and quickly revert back - just there should be enough free space.

> In stress testing, I never saw a system crash on a full thin pool, but I was 
> not using it on root filesystem. There are any ill effect on system stability 
> which I need to know?

Depends on version of kernel and filesystem in use.

Note RHEL/Centos kernel has lots of backport even when it's look quite old.


> The solution is to use scripting/thin_command with lvm tags. For example:
> - tag all snapshot with a "snap" tag;
> - when usage is dangerously high, drop all volumes with "snap" tag.

Yep - every user has different plans in his mind - scripting gives user 
freedom to adapt this logic to local needs...

>>> However, I don't have the space for a full copy of every filesystem, so if 
>>> I snapshot, I will automatically overprovision.

As long as admin responsible controls space in thin-pool and takes action
long time before thin-pool runs out-of-space all is fine.

If admin hopes in some kind of magic to happen - we have a problem....


>>
>> Back to rule #1 - thin-p is about 'delaying' deliverance of real space.
>> If you already have plan to never deliver promised space - you need to
>> live with consequences....
> 
> I am not sure to 100% agree on that. Thinp is not only about "delaying" space 
> provisioning; it clearly is also (mostly?) about fast, modern, usable 
> snapshots. Docker, snapper, stratis, etc. all use thinp mainly for its fast, 
> efficent snapshot capability. Denying that is not so useful and led to 
> "overwarning" (ie: when snapshotting a volume on a virtually-fillable thin pool).

Snapshot are using space - with hope that if you will 'really' need that space
you either add this space to you system - or you drop snapshots.

Still the same logic applied....

>> !SNAPSHOTS ARE NOT BACKUPS!
>>
>> This is the key problem with your thinking here (unfortunately you are
>> not 'alone' with this thinking)
> 
> Snapshot are not backups, as they do not protect from hardware problems (and 
> denying that would be lame); however, they are an invaluable *part* of a 
> successfull backup strategy. Having multiple rollaback target, even on the 
> same machine, is a very usefull tool.

Backups primarily sits on completely different storage.

If you keep backup of data in same pool:

1.)
error on this in single chunk shared by all your backup + origin - means it's 
total data loss - especially in case where filesystem are using 'BTrees' and 
some 'root node' is lost - can easily render you origin + all backups 
completely useless.

2.)
problems in thin-pool metadata can make all your origin+backups just an 
unordered mess of chunks.


> Again, I don't understand by we are speaking about system crashes. On root 
> *not* using thinp, I never saw a system crash due to full data pool. >
> Oh, and I use thinp on RHEL/CentOS only (Debian/Ubuntu backports are way too 
> limited).

Yep - this case is known to be pretty stable.

But as said - with today 'rush' of development and load of updates - user do 
want to try 'new disto upgrade' - if it works - all is fine - if it doesn't 
let's have a quick road back -  so using thin volume for rootfs is pretty 
wanted case.

Trouble is there is quite a lot of issues non-trivial to solve.

There are also some on going ideas/projects - one of them was to have thinLVs 
with priority to be always fully provisioned - so such thinLV could never be 
the one to have unprovisioned chunks....
Other was a better integration of filesystem with 'provisioned' volumes.


Zdenek

  reply	other threads:[~2018-02-28 21:43 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-06 14:31 [linux-lvm] Snapshot behavior on classic LVM vs ThinLVM Gionatan Danti
2017-04-07  8:19 ` Mark Mielke
2017-04-07  9:12   ` Gionatan Danti
2017-04-07 13:50     ` L A Walsh
2017-04-07 16:33       ` Gionatan Danti
2017-04-13 12:59         ` Stuart Gathman
2017-04-13 13:52           ` Xen
2017-04-13 14:33             ` Zdenek Kabelac
2017-04-13 14:47               ` Xen
2017-04-13 15:29               ` Stuart Gathman
2017-04-13 15:43                 ` Xen
2017-04-13 17:26                   ` Stuart D. Gathman
2017-04-13 17:32                   ` Stuart D. Gathman
2017-04-14 15:17                     ` Xen
2017-04-14  7:27               ` Gionatan Danti
2017-04-14  7:23           ` Gionatan Danti
2017-04-14 15:23             ` Xen
2017-04-14 15:53               ` Gionatan Danti
2017-04-14 16:08                 ` Stuart Gathman
2017-04-14 17:36                 ` Xen
2017-04-14 18:59                   ` Gionatan Danti
2017-04-14 19:20                     ` Xen
2017-04-15  8:27                       ` Xen
2017-04-15 23:35                         ` Xen
2017-04-17 12:33                         ` Xen
2017-04-15 21:22                     ` Xen
2017-04-15 21:49                       ` Xen
2017-04-15 21:48                     ` Xen
2017-04-18 10:17                       ` Zdenek Kabelac
2017-04-18 13:23                         ` Gionatan Danti
2017-04-18 14:32                           ` Stuart D. Gathman
2017-04-19  7:22                         ` Xen
2017-04-07 22:24     ` Mark Mielke
2017-04-08 11:56       ` Gionatan Danti
2017-04-07 18:21 ` Tomas Dalebjörk
2017-04-13 10:20 ` Gionatan Danti
2017-04-13 12:41   ` Xen
2017-04-14  7:20     ` Gionatan Danti
2017-04-14  8:24       ` Zdenek Kabelac
2017-04-14  9:07         ` Gionatan Danti
2017-04-14  9:37           ` Zdenek Kabelac
2017-04-14  9:55             ` Gionatan Danti
2017-04-22  7:14         ` Gionatan Danti
2017-04-22 16:32           ` Xen
2017-04-22 20:58             ` Gionatan Danti
2017-04-22 21:17             ` Zdenek Kabelac
2017-04-23  5:29               ` Xen
2017-04-23  9:26                 ` Zdenek Kabelac
2017-04-24 21:02                   ` Xen
2017-04-24 21:59                     ` Zdenek Kabelac
2017-04-26  7:26                       ` Gionatan Danti
2017-04-26  7:42                         ` Zdenek Kabelac
2017-04-26  8:10                           ` Gionatan Danti
2017-04-26 11:23                             ` Zdenek Kabelac
2017-04-26 13:37                               ` Gionatan Danti
2017-04-26 14:33                                 ` Zdenek Kabelac
2017-04-26 16:37                                   ` Gionatan Danti
2017-04-26 18:32                                     ` Stuart Gathman
2017-04-26 19:24                                     ` Stuart Gathman
2017-05-02 11:00                                     ` Gionatan Danti
2017-05-12 13:02                                       ` Gionatan Danti
2017-05-12 13:42                                         ` Joe Thornber
2017-05-14 20:39                                           ` Gionatan Danti
2017-05-15 12:50                                             ` Zdenek Kabelac
2017-05-15 14:48                                               ` Gionatan Danti
2017-05-15 15:33                                                 ` Zdenek Kabelac
2017-05-16  7:53                                                   ` Gionatan Danti
2017-05-16 10:54                                                     ` Zdenek Kabelac
2017-05-16 13:38                                                       ` Gionatan Danti
2018-02-27 18:39                       ` Xen
2018-02-28  9:26                         ` Zdenek Kabelac
2018-02-28 19:07                           ` Gionatan Danti
2018-02-28 21:43                             ` Zdenek Kabelac [this message]
2018-03-01  7:14                               ` Gionatan Danti
2018-03-01  8:31                                 ` Zdenek Kabelac
2018-03-01  9:43                                   ` Gianluca Cecchi
2018-03-01 11:10                                     ` Zdenek Kabelac
2018-03-01  9:52                                   ` Gionatan Danti
2018-03-01 11:23                                     ` Zdenek Kabelac
2018-03-01 12:48                                       ` Gionatan Danti
2018-03-01 16:00                                         ` Zdenek Kabelac
2018-03-01 16:26                                           ` Gionatan Danti
2018-03-03 18:32                               ` Xen
2018-03-04 20:34                                 ` Zdenek Kabelac
2018-03-03 18:17                             ` Xen
2018-03-04 20:53                               ` Zdenek Kabelac
2018-03-05  9:42                                 ` Gionatan Danti
2018-03-05 10:18                                   ` Zdenek Kabelac
2018-03-05 14:27                                     ` Gionatan Danti
2018-03-03 17:52                           ` Xen
2018-03-04 23:27                             ` Zdenek Kabelac
2017-04-22 21:22           ` Zdenek Kabelac
2017-04-24 13:49             ` Gionatan Danti
2017-04-24 14:48               ` Zdenek Kabelac

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=eea26091-40e7-8f31-e4ba-dfab059f3d70@redhat.com \
    --to=zkabelac@redhat.com \
    --cc=g.danti@assyoma.it \
    --cc=linux-lvm@redhat.com \
    --cc=list@xenhideout.nl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).