From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, mreitz@redhat.com
Subject: Re: [PATCH for-5.1] file-posix: Mitigate file fragmentation with extent size hints
Date: Tue, 7 Jul 2020 18:17:41 +0200 [thread overview]
Message-ID: <20200707161741.GG7002@linux.fritz.box> (raw)
In-Reply-To: <20200707142329.48303-1-kwolf@redhat.com>
Am 07.07.2020 um 16:23 hat Kevin Wolf geschrieben:
> Espeically when O_DIRECT is used with image files so that the page cache
> indirection can't cause a merge of allocating requests, the file will
> fragment on the file system layer, with a potentially very small
> fragment size (this depends on the requests the guest sent).
>
> On Linux, fragmentation can be reduced by setting an extent size hint
> when creating the file (at least on XFS, it can't be set any more after
> the first extent has been allocated), basically giving raw files a
> "cluster size" for allocation.
>
> This adds an create option to set the extent size hint, and changes the
> default from not setting a hint to setting it to 1 MB. The main reason
> why qcow2 defaults to smaller cluster sizes is that COW becomes more
> expensive, which is not an issue with raw files, so we can choose a
> larger file. The tradeoff here is only potentially wasted disk space.
>
> For qcow2 (or other image formats) over file-posix, the advantage should
> even be greater because they grow sequentially without leaving holes, so
> there won't be wasted space. Setting even larger extent size hints for
> such images may make sense. This can be done with the new option, but
> let's keep the default conservative for now.
>
> The effect is very visible with a test that intentionally creates a
> badly fragmented file with qemu-img bench (the time difference while
> creating the file is already remarkable) and then looks at the number of
> extents and the take a simple "qemu-img map" takes.
>
> Without an extent size hint:
>
> $ ./qemu-img create -f raw -o extent_size_hint=0 ~/tmp/test.raw 10G
> Formatting '/home/kwolf/tmp/test.raw', fmt=raw size=10737418240 extent_size_hint=0
> $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 0
> Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 0, step size 8192)
> Run completed in 25.848 seconds.
> $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 4096
> Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 4096, step size 8192)
> Run completed in 19.616 seconds.
> $ filefrag ~/tmp/test.raw
> /home/kwolf/tmp/test.raw: 2000000 extents found
> $ time ./qemu-img map ~/tmp/test.raw
> Offset Length Mapped to File
> 0 0x1e8480000 0 /home/kwolf/tmp/test.raw
>
> real 0m1,279s
> user 0m0,043s
> sys 0m1,226s
>
> With the new default extent size hint of 1 MB:
>
> $ ./qemu-img create -f raw -o extent_size_hint=1M ~/tmp/test.raw 10G
> Formatting '/home/kwolf/tmp/test.raw', fmt=raw size=10737418240 extent_size_hint=1048576
> $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 0
> Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 0, step size 8192)
> Run completed in 11.833 seconds.
> $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 4096
> Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 4096, step size 8192)
> Run completed in 10.155 seconds.
> $ filefrag ~/tmp/test.raw
> /home/kwolf/tmp/test.raw: 178 extents found
> $ time ./qemu-img map ~/tmp/test.raw
> Offset Length Mapped to File
> 0 0x1e8480000 0 /home/kwolf/tmp/test.raw
>
> real 0m0,061s
> user 0m0,040s
> sys 0m0,014s
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I also need to squash in a few trivial qemu-iotests updates, for which I
won't send a v2:
diff --git a/tests/qemu-iotests/082.out b/tests/qemu-iotests/082.out
index 1b0a75c8f9..0d7c5e8342 100644
--- a/tests/qemu-iotests/082.out
+++ b/tests/qemu-iotests/082.out
@@ -62,6 +62,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -86,6 +87,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -110,6 +112,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -134,6 +137,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -158,6 +162,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -182,6 +187,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -206,6 +212,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -230,6 +237,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -353,6 +361,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -377,6 +386,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -401,6 +411,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -425,6 +436,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -449,6 +461,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -473,6 +486,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -497,6 +511,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -521,6 +536,7 @@ Supported options:
encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+ extent_size_hint=<size> - Extent size hint for the image file, 0 to disable
lazy_refcounts=<bool (on/off)> - Postpone refcount updates
nocow=<bool (on/off)> - Turn off copy-on-write (valid only on btrfs)
preallocation=<str> - Preallocation mode (allowed values: off, metadata, falloc, full)
diff --git a/tests/qemu-iotests/243 b/tests/qemu-iotests/243
index a61852f6d9..17388a4644 100755
--- a/tests/qemu-iotests/243
+++ b/tests/qemu-iotests/243
@@ -51,7 +51,7 @@ for mode in off metadata falloc full; do
echo "=== preallocation=$mode ==="
echo
- _make_test_img -o "preallocation=$mode" 64M
+ _make_test_img -o "preallocation=$mode,extent_size_hint=0" 64M
printf "File size: "
du -b $TEST_IMG | cut -f1
@@ -68,7 +68,8 @@ for mode in off metadata falloc full; do
echo "=== External data file: preallocation=$mode ==="
echo
- _make_test_img -o "data_file=$TEST_IMG.data,preallocation=$mode" 64M
+ _make_test_img \
+ -o "data_file=$TEST_IMG.data,preallocation=$mode,extent_size_hint=0" 64M
echo -n "qcow2 file size: "
du -b $TEST_IMG | cut -f1
@@ -79,7 +80,7 @@ for mode in off metadata falloc full; do
echo -n "qcow2 disk usage: "
[ $(du -B1 $TEST_IMG | cut -f1) -lt 1048576 ] && echo "low" || echo "high"
echo -n "data disk usage: "
- [ $(du -B1 $TEST_IMG.data | cut -f1) -lt 1048576 ] && echo "low" || echo "high"
+ [ $(du -B1 $TEST_IMG.data | cut -f1) -lt 2097152 ] && echo "low" || echo "high"
done
diff --git a/tests/qemu-iotests/243.out b/tests/qemu-iotests/243.out
index dcb33fac32..8bd3d79d66 100644
--- a/tests/qemu-iotests/243.out
+++ b/tests/qemu-iotests/243.out
@@ -2,31 +2,31 @@ QA output created by 243
=== preallocation=off ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 preallocation=off
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 preallocation=off extent_size_hint=0
File size: 196616
Disk usage: low
=== preallocation=metadata ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 preallocation=metadata
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 preallocation=metadata extent_size_hint=0
File size: 67436544
Disk usage: low
=== preallocation=falloc ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 preallocation=falloc
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 preallocation=falloc extent_size_hint=0
File size: 67436544
Disk usage: high
=== preallocation=full ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 preallocation=full
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 preallocation=full extent_size_hint=0
File size: 67436544
Disk usage: high
=== External data file: preallocation=off ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 data_file=TEST_DIR/t.IMGFMT.data preallocation=off
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 data_file=TEST_DIR/t.IMGFMT.data preallocation=off extent_size_hint=0
qcow2 file size: 196616
data file size: 67108864
qcow2 disk usage: low
@@ -34,7 +34,7 @@ data disk usage: low
=== External data file: preallocation=metadata ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 data_file=TEST_DIR/t.IMGFMT.data preallocation=metadata
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 data_file=TEST_DIR/t.IMGFMT.data preallocation=metadata extent_size_hint=0
qcow2 file size: 327680
data file size: 67108864
qcow2 disk usage: low
@@ -42,7 +42,7 @@ data disk usage: low
=== External data file: preallocation=falloc ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 data_file=TEST_DIR/t.IMGFMT.data preallocation=falloc
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 data_file=TEST_DIR/t.IMGFMT.data preallocation=falloc extent_size_hint=0
qcow2 file size: 327680
data file size: 67108864
qcow2 disk usage: low
@@ -50,7 +50,7 @@ data disk usage: high
=== External data file: preallocation=full ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 data_file=TEST_DIR/t.IMGFMT.data preallocation=full
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 data_file=TEST_DIR/t.IMGFMT.data preallocation=full extent_size_hint=0
qcow2 file size: 327680
data file size: 67108864
qcow2 disk usage: low
next prev parent reply other threads:[~2020-07-07 16:38 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-07 14:23 [PATCH for-5.1] file-posix: Mitigate file fragmentation with extent size hints Kevin Wolf
2020-07-07 14:47 ` Eric Blake
2020-07-07 16:17 ` Kevin Wolf [this message]
2020-07-10 16:12 ` Max Reitz
2020-07-13 9:08 ` Max Reitz
2020-07-13 13:12 ` Kevin Wolf
2020-07-13 13:45 ` Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200707161741.GG7002@linux.fritz.box \
--to=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.