linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Timofey Titovets <nefelim4ag@gmail.com>,
	Adam Borowski <kilobyte@angband.pl>
Cc: Marat Khalili <mkh@rqc.ru>, Duncan <1i5t5.duncan@cox.net>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: qemu-kvm VM died during partial raid1 problems of btrfs
Date: Wed, 13 Sep 2017 08:55:24 -0400
Message-ID: <b925e250-0507-55b8-c6da-47d8a5266c9f@gmail.com>
In-Reply-To: <CAGqmi74UxBRj6pPJ2aEon-raAq46R=29sef_f_wWWGQY1WpwDA@mail.gmail.com>

On 2017-09-12 20:52, Timofey Titovets wrote:
> No, no, no, no...
> No new ioctl, no change in fallocate.
> First: a VM can punch holes; if you use qemu, qemu knows how to do it.
> Windows guests also know how to do it.
> 
> Different hypervisor? -> google -> file an issue asking for support; all of
> Linux/Windows/Mac OS support holes in files.
Not everybody who uses sparse files is using virtual machines.
> 
> No new code, no new strange stuff to fix not broken things.
Um, the fallocate PUNCH_HOLE mode _is_ broken.  There's a race condition 
that can trivially cause data loss.
> 
> You want to replace zeroes? EXTENT_SAME can do that.
But only on a small number of filesystems, and it requires extra work 
that shouldn't be necessary.
> 
> truncate -s 4M test_hole
> dd if=/dev/zero of=./test_zero bs=4M
> 
> duperemove -vhrd ./test_hole ./test_zero
And performance for this approach is absolute shit compared to fallocate -d.

Actual numbers, using a 4G test file (which is still small for what 
you're talking about) and a 4M hole file:
fallocate -d:		0.19 user, 0.85 system, 1.26 real
duperemove -vhrd:	0.75 user, 137.70 system, 144.80 real
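
For anyone who wants to check the comparison, here is a minimal sketch in Python that divides the two wall-clock ("real") times quoted above. The function and constant names are made up for illustration; only the two timings come from the measurements.

```python
def slowdown(baseline_real, other_real):
    """How many times longer other_real is than baseline_real."""
    return other_real / baseline_real

# Wall-clock ("real") seconds from the two runs quoted above.
FALLOCATE_REAL = 1.26     # fallocate -d
DUPEREMOVE_REAL = 144.80  # duperemove -vhrd

ratio = slowdown(FALLOCATE_REAL, DUPEREMOVE_REAL)
print(f"duperemove took {ratio:.2f}x as long as fallocate -d")
```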

So, for a 4G file, it took duperemove (and the EXTENT_SAME ioctl) 114.92 
times as long to achieve the same net effect.  From a practical 
perspective, this isn't viable for regular usage just because of how 
long it takes.  Most of that overhead comes from the EXTENT_SAME ioctl 
doing a byte-by-byte comparison of the ranges to make sure they match, 
but that isn't strictly necessary to avoid this race condition.  All 
that's actually needed is to determine whether there is outstanding I/O 
on that region and, if so, to do some special handling before freezing 
the region.
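
To illustrate why digging holes needs no comparison against a second file, here is a rough sketch in Python of the detection half of the job: one pass over the file that records runs of all-zero blocks. This is not the util-linux implementation, and it does nothing about the race condition discussed above. The names zero_runs and make_demo_file, and the 4096-byte block size, are assumptions made for the example. A real tool would go on to deallocate each reported range (for instance with fallocate(2) in punch-hole mode) rather than just list it.

```python
import os
import tempfile

def zero_runs(path, block_size=4096):
    """Return (offset, length) ranges of consecutive all-zero blocks."""
    runs = []
    start = None  # offset where the current zero run began, if any
    offset = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            if block.count(0) == len(block):
                # Block is entirely zero bytes: open a run if none is open.
                if start is None:
                    start = offset
            elif start is not None:
                # A data block ends the current zero run.
                runs.append((start, offset - start))
                start = None
            offset += len(block)
    if start is not None:
        # The file ended while a zero run was still open.
        runs.append((start, offset - start))
    return runs

def make_demo_file():
    """Write a small file laid out as data, zeroes, data, zeroes."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"A" * 4096)
        f.write(b"\0" * 8192)
        f.write(b"B" * 4096)
        f.write(b"\0" * 4096)
    return path

if __name__ == "__main__":
    demo = make_demo_file()
    try:
        for off, length in zero_runs(demo):
            print(f"zero run at offset {off}, length {length}")
    finally:
        os.remove(demo)
```

Each block is read once and checked only against zero, so the cost scales with the file size alone, whereas the dedupe approach has to read and compare two ranges.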

