All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Richard Laager <rlaager@wiktel.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC PATCH 14/17] block: support FALLOC_FL_PUNCH_HOLE trimming
Date: Mon, 12 Mar 2012 10:34:01 +0100	[thread overview]
Message-ID: <4F5DC309.7050709@redhat.com> (raw)
In-Reply-To: <1331325410.3715.77.camel@watermelon.coderich.net>

Il 09/03/2012 21:36, Richard Laager ha scritto:
> fallocate() only supports files. In my patch, I was fstat()ing the fd
> just after opening it and caching an is_device boolean. Then, when doing
> discards, if !is_device, call fallocate(), else (i.e. for devices) do
> the following (note: untested code):
> 
>         __uint64_t range[2];
> 
>         range[0] = sector_num << 9;
>         range[1] = (__uint64_t)nb_sectors << 9;
> 
>         if (ioctl(s->fd, BLKDISCARD, &range) && errno != EOPNOTSUPP) {
>              return -errno;
>         }
>         return 0;

You can instead put this code in a separate function hdev_discard.  hdev
device is used instead of file when the backend is a special file (block
or character).

> -------- sync requirements --------
> 
> I'm not sure if fallocate() and/or BLKDISCARD always guarantee that the
> discard has made it to stable storage. If they don't, does O_DIRECT or
> O_DSYNC on open() cause them to make any such guarantee?

O_DIRECT shouldn't have any effect; for O_DSYNC I don't think so.

> -------- Thin vs. Thick Provisioning --------
> 
> On Fri, 2012-03-09 at 00:35 -0800, Chris Wedgwood wrote:
>> Simplest still compare the blocks allocated by the file
>> to it's length (ie. stat.st_blocks != stat.st_size>>9).
> 
> I thought of this as well. It covers "99%" of the cases, but there's one
> case where it breaks down. Imagine I have a sparse file backing my
> virtual disk. In the guest, I fill the virtual disk 100%. Then, I
> restart QEMU. Now it thinks that sparse file is non-sparse and stops
> issuing hole punches.

I would not really have any problem with that, it seems like a
relatively minor case (also because newer guests will allocate the first
partition at 1 MB, so you might still have a small hole at the beginning).

> To be completely correct, I suggest the following behavior:
>      1. Add a discard boolean option to the disk layer.
>      2. If discard is not specified:
>               * For files, detect a true/false value by comparing
>                 stat.st_blocks != stat.st_size>>9.
>               * For devices, assume a fixed value (true?).
>      3. If discard is true, issue discards.
>      4. If discard is false, do not issue discards.

The problem is, who will use this interface?

Paolo

  reply	other threads:[~2012-03-12  9:34 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-08 17:15 [Qemu-devel] [RFC PATCH 00/17] Improvements around discard and write zeroes Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 01/17] qemu-iotests: add a simple test for write_zeroes Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 02/17] qed: make write-zeroes bounce buffer smaller than a single cluster Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 03/17] block: add discard properties to BlockDriverInfo Paolo Bonzini
2012-03-09 16:47   ` Kevin Wolf
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 04/17] qed: implement bdrv_aio_discard Paolo Bonzini
2012-03-09 16:31   ` Kevin Wolf
2012-03-09 17:53     ` Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 05/17] block: pass around qiov for write_zeroes operation Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 06/17] block: use bdrv_{co, aio}_discard for write_zeroes operations Paolo Bonzini
2012-03-09 16:37   ` Kevin Wolf
2012-03-09 18:06     ` Paolo Bonzini
2012-03-10 18:02       ` Richard Laager
2012-03-12 12:27         ` Paolo Bonzini
2012-03-12 13:04           ` Kevin Wolf
2012-03-13 19:13           ` Richard Laager
2012-03-14  7:41             ` Paolo Bonzini
2012-03-14 12:01               ` Kevin Wolf
2012-03-14 12:14                 ` Paolo Bonzini
2012-03-14 12:37                   ` Kevin Wolf
2012-03-14 12:49                     ` Paolo Bonzini
2012-03-14 13:04                       ` Kevin Wolf
2012-03-24 15:33                       ` Christoph Hellwig
2012-03-24 15:30                   ` Christoph Hellwig
2012-03-26 19:40                     ` Richard Laager
2012-03-27 10:20                       ` Kevin Wolf
2012-03-24 15:29                 ` Christoph Hellwig
2012-03-26  9:44                   ` Daniel P. Berrange
2012-03-26  9:56                     ` Christoph Hellwig
2012-03-15  0:42               ` Richard Laager
2012-03-15  9:36                 ` Paolo Bonzini
2012-03-16  0:47                   ` Richard Laager
2012-03-16  9:34                     ` Paolo Bonzini
2012-03-24 15:27         ` Christoph Hellwig
2012-03-26 19:40           ` Richard Laager
2012-03-27  9:08             ` Christoph Hellwig
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 07/17] block: make high level discard operation always zero Paolo Bonzini
2012-03-08 17:55   ` Avi Kivity
2012-03-09 16:42     ` Kevin Wolf
2012-03-12 10:42       ` Avi Kivity
2012-03-12 11:04         ` Kevin Wolf
2012-03-12 12:03           ` Avi Kivity
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 08/17] block: kill the write zeroes operation Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 09/17] ide: issue discard asynchronously but serialize the pieces Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 10/17] ide/scsi: add discard_zeroes_data property Paolo Bonzini
2012-03-08 18:13   ` Avi Kivity
2012-03-08 18:14     ` Avi Kivity
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 11/17] ide/scsi: prepare for flipping the discard defaults Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 12/17] ide/scsi: turn on discard Paolo Bonzini
2012-03-08 18:17   ` Avi Kivity
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 13/17] block: fallback from discard to writes Paolo Bonzini
2012-03-24 15:35   ` Christoph Hellwig
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 14/17] block: support FALLOC_FL_PUNCH_HOLE trimming Paolo Bonzini
2012-03-09  8:20   ` Chris Wedgwood
2012-03-09  8:31     ` Paolo Bonzini
2012-03-09  8:35       ` Chris Wedgwood
2012-03-09  8:40         ` Paolo Bonzini
2012-03-09 10:31   ` Stefan Hajnoczi
2012-03-09 10:43     ` Paolo Bonzini
2012-03-09 10:53       ` Stefan Hajnoczi
2012-03-09 10:57         ` Paolo Bonzini
2012-03-09 20:36   ` Richard Laager
2012-03-12  9:34     ` Paolo Bonzini [this message]
2012-03-24 15:40     ` Christoph Hellwig
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 15/17] raw: add get_info Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 16/17] qemu-io: fix the alloc command Paolo Bonzini
2012-03-08 17:15 ` [Qemu-devel] [RFC PATCH 17/17] raw: implement is_allocated Paolo Bonzini
2012-03-24 15:42   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F5DC309.7050709@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rlaager@wiktel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.