Re: [linux-lvm] Unexptected filesytem unmount with thin provision and autoextend disabled - lvmetad crashed?


From: matthew patton <pattonme@yahoo.com>
To: LVM general discussion and development <linux-lvm@redhat.com>
Subject: Re: [linux-lvm] Unexptected filesytem unmount with thin provision and autoextend disabled - lvmetad crashed?
Date: Wed, 18 May 2016 04:57:23 +0000 (UTC)
Message-ID: <1872684910.4114972.1463547443287.JavaMail.yahoo@mail.yahoo.com>
In-Reply-To: 1872684910.4114972.1463547443287.JavaMail.yahoo.ref@mail.yahoo.com

Xen wrote:

<quote> So there are two different cases as mentioned: existing block writes, 
 and new block writes. What I was gabbing about earlier would be forcing 
 a filesystem to also be able to distuinguish between them. You would 
 have a filesystem-level "no extend" mode or "no allocate" mode that gets 
 triggered. Initially my thought was to have this get triggered trough 
 the FS-LVM interface. But, it could also be made operational not through 
 any membrane but simply by having a kernel (module) that gets passed 
 this information. In both cases the idea is to say: the filesystem can 
 do what it wants with existing blocks, but it cannot get new ones.
</quote>

You still have no earthly clue how the various layers work, apparently. For the FS to "know" which of its blocks can be scribbled on and which can't, it has to constantly poll the block layer (the next layer down may NOT necessarily be LVM) on every write. Goodbye performance.

<quote>
 However, it does mean the filesystem must know the 'hidden geometry' 
 beneath its own blocks, so that it can know about stuff that won't work 
 anymore.
</quote>

I'm pretty sure this was explained to you a couple of weeks ago: it's called "integration". For 50 years filesystems were DELIBERATELY written to be agnostic, if not outright ignorant, of the underlying block device's peculiarities. That's how modular software is written. Sure, some optimizations have been made by peeking into attributes exposed by the block layer, but those attributes don't change over time. They are probed at newfs() time and never consulted again.
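For illustration, on Linux the static attributes in question are the kind the block layer exports under /sys/block/<dev>/queue/. The following is a minimal sketch of reading them once, the way a mkfs-style tool might. The helper name `read_queue_hints` and its `sysfs_root` parameter are assumptions made for this example, not part of any real tool.

```python
import os

# Attribute files exported by the Linux block layer under /sys/block/<dev>/queue/.
QUEUE_ATTRS = ("logical_block_size", "physical_block_size",
               "minimum_io_size", "optimal_io_size")

def read_queue_hints(dev, sysfs_root="/sys/block"):
    """Return {attribute: int} for the attributes that exist (hypothetical helper)."""
    hints = {}
    for attr in QUEUE_ATTRS:
        path = os.path.join(sysfs_root, dev, "queue", attr)
        try:
            with open(path) as fh:
                hints[attr] = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable: skip it
    return hints
```

These values describe geometry only. Nothing in them says whether the layer below can still allocate space, which is the information being asked for.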

Chafing at the inherent tradeoffs caused by "lack of knowledge" was why BTRFS and ZFS were written. It is ignorant to keep pounding the "but I want XFS/EXT+LVM to be at feature parity with BTRFS" drum. It's not supposed to be, it was never intended to be, and it will never happen. So go use the tool as it's designed or go use something else that tickles your fancy.

<quote>
 Will mention that I still haven't tested --errorwhenfull yet.
</quote>

But you conveniently overlook the fact that the FS is NOT remotely full according to any of the standard tools. All of a sudden the FS got signaled that the block layer was denying write BIO calls. Maybe there's a helpful kern.err in syslog that you wrote support for?

<quote>
 In principle if you had the means to acquire such a  flag/state/condition, and the
 filesystem would be able to block new  allocation wherever whenever, you would already
 have a working system.  So what is then non-trivial?
...
 It seems completely obvious that to me at this point, if anything from 
 LVM (or e.g. dmeventd) could signal every filesystem on every affected
 thin volume, to enter a do-not-allocate state, and filesystems would be 
 able to fail writes based on that, you would already have a solution
</quote>

And so, in order to acquire this "signal", every write has to be done in synchronous fashion, making sure strict data integrity is maintained vis-a-vis filesystem data and metadata. The kernel's dirty block size and flush intervals are knobs that can be turned to "signal" user-land sooner that write errors are happening. There's no such thing as "immediate" unless you use synchronous function calls from userland.
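As a rough illustration of the knobs meant here, the Linux writeback tunables live under /proc/sys/vm (also reachable as sysctl vm.*). The sketch below only reads them. The helper name `read_dirty_knobs` and its `vm_dir` parameter are assumptions for this example.

```python
import os

# Writeback tunables exposed by the Linux kernel under /proc/sys/vm.
DIRTY_KNOBS = ("dirty_background_ratio", "dirty_ratio",
               "dirty_background_bytes", "dirty_bytes",
               "dirty_expire_centisecs", "dirty_writeback_centisecs")

def read_dirty_knobs(vm_dir="/proc/sys/vm"):
    """Return {knob: int} for every knob that can be read (hypothetical helper)."""
    values = {}
    for knob in DIRTY_KNOBS:
        try:
            with open(os.path.join(vm_dir, knob)) as fh:
                values[knob] = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # knob absent on this kernel or not readable
    return values
```

Lowering the dirty thresholds or the expire and writeback intervals shortens the window between a buffered write and the attempt to push it to the block layer. It does not make a buffered write() fail on the spot; the error still reaches the application only at a later synchronization point such as fsync().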

If you want to write your application to handle "mis-behaved" block layers, then use O_DIRECT + O_SYNC.
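A minimal sketch of what that looks like from userland, assuming Linux and a 4096-byte alignment (real code should query the device's logical block size instead). The function name `sync_write` and the zero-padding policy are choices made for this example only.

```python
import errno
import mmap
import os

BLOCK = 4096  # assumed alignment for O_DIRECT; not queried from the device here

def sync_write(path, payload, direct=True):
    """Write payload synchronously; return None on success, else an errno value."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    if direct:
        flags |= getattr(os, "O_DIRECT", 0)  # O_DIRECT is Linux-specific
    # O_DIRECT wants an aligned buffer and length: an anonymous mmap is
    # page-aligned, and the payload is zero-padded up to a multiple of BLOCK.
    size = max(BLOCK, -(-len(payload) // BLOCK) * BLOCK)
    buf = mmap.mmap(-1, size)
    try:
        buf[:len(payload)] = payload
        fd = os.open(path, flags, 0o600)
        try:
            written = os.write(fd, buf)
        finally:
            os.close(fd)
        # A short write is treated as a failure in this sketch.
        return None if written == size else errno.EIO
    except OSError as exc:
        # With O_SYNC the failure is reported by write() itself,
        # not at some later flush the application never sees.
        return exc.errno
    finally:
        buf.close()
```

With these flags a refused write comes back as an error from the write call itself, so the application can react right there. The price is that every write waits for the storage stack, which is exactly the performance tradeoff described above. Some filesystems (tmpfs on older kernels, for example) reject O_DIRECT at open time, in which case this sketch returns that errno.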

