From: Stefan Hajnoczi <stefanha@redhat.com>
To: Sarthak Kukreti <sarthakkukreti@chromium.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	dm-devel@redhat.com, linux-block@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Jens Axboe <axboe@kernel.dk>,
	Gwendal Grignou <gwendal@google.com>,
	Theodore Ts'o <tytso@mit.edu>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Bart Van Assche <bvanassche@google.com>,
	Mike Snitzer <snitzer@kernel.org>,
	Evan Green <evgreen@google.com>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Daniil Lunev <dlunev@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Alasdair Kergon <agk@redhat.com>
Subject: Re: [dm-devel] [PATCH RFC 0/8] Introduce provisioning primitives for thinly provisioned storage
Date: Mon, 19 Sep 2022 12:36:48 -0400	[thread overview]
Message-ID: <YyiaoHcueK9g5KVy@fedora> (raw)
In-Reply-To: <CAG9=OMNPnsjaUw2EUG0XFjV94-V1eD63V+1anoGM=EWKyzXEfg@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4366 bytes --]
On Sat, Sep 17, 2022 at 12:46:33PM -0700, Sarthak Kukreti wrote:
> On Fri, Sep 16, 2022 at 8:03 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Thu, Sep 15, 2022 at 09:48:18AM -0700, Sarthak Kukreti wrote:
> > > From: Sarthak Kukreti <sarthakkukreti@chromium.org>
> > >
> > > Hi,
> > >
> > > This patch series is an RFC of a mechanism to pass through provision
> > > requests on stacked thinly provisioned storage devices/filesystems.
> >
> > [Reflowed text]
> >
> > > The linux kernel provides several mechanisms to set up thinly
> > > provisioned block storage abstractions (eg. dm-thin, loop devices over
> > > sparse files), either directly as block devices or backing storage for
> > > filesystems. Currently, short of writing data to either the device or
> > > filesystem, there is no way for users to pre-allocate space for use in
> > > such storage setups. Consider the following use-cases:
> > >
> > > 1) Suspend-to-disk and resume from a dm-thin device: In order to
> > > ensure that the underlying thinpool metadata is not modified during
> > > the suspend mechanism, the dm-thin device needs to be fully
> > > provisioned.
> > > 2) If a filesystem uses a loop device over a sparse file, fallocate()
> > > on the filesystem will allocate blocks for files but the underlying
> > > sparse file will remain intact.
> > > 3) Another example is virtual machine using a sparse file/dm-thin as a
> > > storage device; by default, allocations within the VM boundaries will
> > > not affect the host.
> > > 4) Several storage standards support mechanisms for thin provisioning
> > > on real hardware devices. For example:
> > >   a. The NVMe spec 1.0b section 2.1.1 loosely talks about thin
> > >   provisioning: "When the THINP bit in the NSFEAT field of the
> > >   Identify Namespace data structure is set to ‘1’, the controller ...
> > >   shall track the number of allocated blocks in the Namespace
> > >   Utilization field"
> > >   b. The SCSi Block Commands reference - 4 section references "Thin
> > >   provisioned logical units",
> > >   c. UFS 3.0 spec section 13.3.3 references "Thin provisioning".
> > >
> > > In all of the above situations, currently the only way for
> > > pre-allocating space is to issue writes (or use
> > > WRITE_ZEROES/WRITE_SAME). However, that does not scale well with
> > > larger pre-allocation sizes.
> > >
> > > This patchset introduces primitives to support block-level
> > > provisioning (note: the term 'provisioning' is used to prevent
> > > overloading the term 'allocations/pre-allocations') requests across
> > > filesystems and block devices. This allows fallocate() and file
> > > creation requests to reserve space across stacked layers of block
> > > devices and filesystems. Currently, the patchset covers a prototype on
> > > the device-mapper targets, loop device and ext4, but the same
> > > mechanism can be extended to other filesystems/block devices as well
> > > as extended for use with devices in 4 a-c.
> >
> > If you call REQ_OP_PROVISION on an unmapped LBA range of a block device
> > and then try to read the provisioned blocks, what do you get?  Zeroes?
> > Random stale disk contents?
> >
> > I think I saw elsewhere in the thread that any mapped LBAs within the
> > provisioning range are left alone (i.e. not zeroed) so I'll proceed on
> > that basis.
> >
> For block devices, I'd say it's definitely possible to get stale data, depending
> on the implementation of the allocation layer; for example, with dm-thinpool,
> the default setting via using LVM2 tools is to zero out blocks on allocation.
> But that's configurable and can be turned off to improve performance.
> 
> Similarly, for actual devices that end up supporting thin provisioning, unless
> the specification absolutely mandates that an LBA contains zeroes post
> allocation, some implementations will definitely miss out on that (probably
> similar to the semantics of discard_zeroes_data today). I'm operating under
> the assumption that it's possible to get stale data from LBAs allocated using
> provision requests at the block layer and trying to see if we can create a
> safe default operating model from that.
Please explain the semantics of REQ_OP_PROVISION in the
code/documentation in the next revision.
Thanks,
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
     prev parent reply	other threads:[~2022-09-19 16:37 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-15 16:48 [PATCH RFC 0/8] Introduce provisioning primitives for thinly provisioned storage Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 1/8] block: Introduce provisioning primitives Sarthak Kukreti
2022-09-23 15:15   ` Mike Snitzer
2022-12-29  8:17     ` Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 2/8] dm: Add support for block provisioning Sarthak Kukreti
2022-09-23 14:23   ` Mike Snitzer
2022-12-29  8:22     ` Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 3/8] virtio_blk: Add support for provision requests Sarthak Kukreti
2022-09-16  5:48   ` Stefan Hajnoczi
2022-09-20  2:33     ` Sarthak Kukreti
2022-09-27 21:37   ` Michael S. Tsirkin
2022-09-15 16:48 ` [PATCH RFC 4/8] fs: Introduce FALLOC_FL_PROVISION Sarthak Kukreti
2022-09-16 11:56   ` Brian Foster
2022-09-16 21:02     ` Sarthak Kukreti
2022-09-21 15:39       ` Brian Foster
2022-09-22  8:04         ` Sarthak Kukreti
2022-09-22 18:29           ` Brian Foster
2022-12-29  8:13             ` Sarthak Kukreti
2022-09-20  7:49   ` Christoph Hellwig
2022-09-21  5:54     ` Sarthak Kukreti
2022-09-21 15:21       ` Mike Snitzer
2022-09-22  8:08         ` Sarthak Kukreti
2022-09-23  8:45       ` Christoph Hellwig
2022-12-29  8:14         ` Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 5/8] loop: Add support for provision requests Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 6/8] ext4: Add support for FALLOC_FL_PROVISION Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 7/8] ext4: Add mount option for provisioning blocks during allocations Sarthak Kukreti
2022-09-15 16:48 ` [PATCH RFC 8/8] ext4: Add a per-file provision override xattr Sarthak Kukreti
2022-09-16  6:09 ` [PATCH RFC 0/8] Introduce provisioning primitives for thinly provisioned storage Stefan Hajnoczi
2022-09-16 18:48   ` Sarthak Kukreti
2022-09-16 20:01     ` Bart Van Assche
2022-09-16 21:59       ` Sarthak Kukreti
2022-09-20  7:46     ` Christoph Hellwig
     [not found]       ` <CAAKderPF5Z5QLxyEb80Y+90+eR0sfRmL-WfgXLp=eL=HxWSZ9g@mail.gmail.com>
2022-09-20 11:30         ` Christoph Hellwig
     [not found]           ` <CAAKderNcHpbBqWqqd5-WuKLRCQQUt7a_4D4ti4gy15+fKGK0vQ@mail.gmail.com>
2022-09-21 15:08             ` Mike Snitzer
2022-09-23  8:51             ` Christoph Hellwig
2022-09-23 14:08               ` Mike Snitzer
2022-12-29  8:17                 ` Sarthak Kukreti
2022-09-17  3:03 ` [dm-devel] " Darrick J. Wong
2022-09-17 19:46   ` Sarthak Kukreti
2022-09-19 16:36     ` Stefan Hajnoczi [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to=YyiaoHcueK9g5KVy@fedora \
    --to=stefanha@redhat.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=agk@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@google.com \
    --cc=djwong@kernel.org \
    --cc=dlunev@google.com \
    --cc=dm-devel@redhat.com \
    --cc=evgreen@google.com \
    --cc=gwendal@google.com \
    --cc=jasowang@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=sarthakkukreti@chromium.org \
    --cc=snitzer@kernel.org \
    --cc=tytso@mit.edu \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank line
  before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).