linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Daniil Lunev <dlunev@google.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Sarthak Kukreti <sarthakkukreti@chromium.org>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	dm-devel@redhat.com, linux-block@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Jens Axboe <axboe@kernel.dk>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Alasdair Kergon <agk@redhat.com>,
	Mike Snitzer <snitzer@kernel.org>, Theodore Ts'o <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Bart Van Assche <bvanassche@google.com>,
	Evan Green <evgreen@google.com>,
	Gwendal Grignou <gwendal@google.com>
Subject: Re: [PATCH RFC 0/8] Introduce provisioning primitives for thinly provisioned storage
Date: Wed, 21 Sep 2022 11:08:43 -0400	[thread overview]
Message-ID: <Yyso+9ChDJQUf9B1@redhat.com> (raw)
In-Reply-To: <CAAKderNcHpbBqWqqd5-WuKLRCQQUt7a_4D4ti4gy15+fKGK0vQ@mail.gmail.com>

On Tue, Sep 20 2022 at  5:48P -0400,
Daniil Lunev <dlunev@google.com> wrote:

> > There is no such thing as WRITE UNAVAILABLE in NVMe.
> Apologize, that is WRITE UNCORRECTABLE. Chapter 3.2.7 of
> NVM Express NVM Command Set Specification 1.0b
> 
> > That being siad you still haven't actually explained what problem
> > you're even trying to solve.
> 
> The specific problem is the following:
> * There is an thinpool over a physical device
> * There are multiple logical volumes over the thin pool
> * Each logical volume has an independent file system and an
>   independent application running over it
> * Each application is potentially allowed to consume the entirety
>   of the disk space - there is no strict size limit for application
> * Applications need to pre-allocate space sometime, for which
>   they use fallocate. Once the operation succeeded, the application
>   assumed the space is guaranteed to be there for it.
> * Since filesystems on the volumes are independent, filesystem
>   level enforcement of size constraints is impossible and the only
>   common level is the thin pool, thus, each fallocate has to find its
>   representation in thin pool one way or another - otherwise you
>   may end up in the situation, where FS thinks it has allocated space
>   but when it tries to actually write it, the thin pool is already
>   exhausted.
> * Hole-Punching fallocate will not reach the thin pool, so the only
>   solution presently is zero-writing pre-allocate.
> * Not all storage devices support zero-writing efficiently - apart
>   from NVMe being or not being capable of doing efficient write
>   zero - changing which is easier said than done, and would take
>   years - there are also other types of storage devices that do not
>   have WRITE ZERO capability in the first place or have it in a
>   peculiar way. And adding custom WRITE ZERO to LVM would be
>   arguably a much bigger hack.
> * Thus, a provisioning block operation allows an interface specific
>   operation that guarantees the presence of the block in the
>   mapped space. LVM Thin-pool itself is the primary target for our
>   use case but the argument is that this operation maps well to
>   other interfaces which allow thinly provisioned units.

Thanks for this overview. Should help level-set others.

Adding fallocate support has been a long-standing dm-thin TODO item
for me. I just never got around to it. So thanks to Sarthak, you and
anyone else who had a hand in developing this.

I had a look at the DM thin implementation and it looks pretty simple
(doesn't require a thin-metadata change, etc).  I'll look closer at
the broader implementation (block, etc) but I'm encouraged by what I'm
seeing.

Mike


  parent reply	other threads:[~2022-09-21 15:08 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-15 16:48 [PATCH RFC 0/8] Introduce provisioning primitives for thinly provisioned storage Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 1/8] block: Introduce provisioning primitives Sarthak Kukreti
2022-09-23 15:15   ` Mike Snitzer
2022-12-29  8:17     ` Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 2/8] dm: Add support for block provisioning Sarthak Kukreti
2022-09-23 14:23   ` Mike Snitzer
2022-12-29  8:22     ` Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 3/8] virtio_blk: Add support for provision requests Sarthak Kukreti
2022-09-16  5:48   ` Stefan Hajnoczi
2022-09-20  2:33     ` Sarthak Kukreti
2022-09-27 21:37   ` Michael S. Tsirkin
2022-09-15 16:48 ` [PATCH RFC 4/8] fs: Introduce FALLOC_FL_PROVISION Sarthak Kukreti
2022-09-16 11:56   ` Brian Foster
2022-09-16 21:02     ` Sarthak Kukreti
2022-09-21 15:39       ` Brian Foster
2022-09-22  8:04         ` Sarthak Kukreti
2022-09-22 18:29           ` Brian Foster
2022-12-29  8:13             ` Sarthak Kukreti
2022-09-20  7:49   ` Christoph Hellwig
2022-09-21  5:54     ` Sarthak Kukreti
2022-09-21 15:21       ` Mike Snitzer
2022-09-22  8:08         ` Sarthak Kukreti
2022-09-23  8:45       ` Christoph Hellwig
2022-12-29  8:14         ` Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 5/8] loop: Add support for provision requests Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 6/8] ext4: Add support for FALLOC_FL_PROVISION Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 7/8] ext4: Add mount option for provisioning blocks during allocations Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 8/8] ext4: Add a per-file provision override xattr Sarthak Kukreti
2022-09-16  6:09 ` [PATCH RFC 0/8] Introduce provisioning primitives for thinly provisioned storage Stefan Hajnoczi
2022-09-16 18:48   ` Sarthak Kukreti
2022-09-16 20:01     ` Bart Van Assche
2022-09-16 21:59       ` Sarthak Kukreti
2022-09-20  7:46     ` Christoph Hellwig
     [not found]       ` <CAAKderPF5Z5QLxyEb80Y+90+eR0sfRmL-WfgXLp=eL=HxWSZ9g@mail.gmail.com>
2022-09-20 11:30         ` Christoph Hellwig
     [not found]           ` <CAAKderNcHpbBqWqqd5-WuKLRCQQUt7a_4D4ti4gy15+fKGK0vQ@mail.gmail.com>
2022-09-21 15:08             ` Mike Snitzer [this message]
2022-09-23  8:51             ` Christoph Hellwig
2022-09-23 14:08               ` Mike Snitzer
2022-12-29  8:17                 ` Sarthak Kukreti
2022-09-17  3:03 ` [dm-devel] " Darrick J. Wong
2022-09-17 19:46   ` Sarthak Kukreti
2022-09-19 16:36     ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yyso+9ChDJQUf9B1@redhat.com \
    --to=snitzer@redhat.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=agk@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@google.com \
    --cc=dlunev@google.com \
    --cc=dm-devel@redhat.com \
    --cc=evgreen@google.com \
    --cc=gwendal@google.com \
    --cc=hch@infradead.org \
    --cc=jasowang@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=sarthakkukreti@chromium.org \
    --cc=snitzer@kernel.org \
    --cc=stefanha@redhat.com \
    --cc=tytso@mit.edu \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).