* Block alignment of qcow2 compress driver @ 2022-01-28 11:07 Richard W.M. Jones 2022-01-28 11:39 ` Hanna Reitz 2022-02-01 14:13 ` Vladimir Sementsov-Ogievskiy 0 siblings, 2 replies; 14+ messages in thread From: Richard W.M. Jones @ 2022-01-28 11:07 UTC (permalink / raw) To: qemu-block, qemu-devel; +Cc: kwolf, andrey.shinkevich, hreitz, eblake The commands below set up a sparse RAM disk, with an allocated block at offset 32K and another one at offset 1M-32K. Then it tries to copy this to a compressed qcow2 file using qemu-nbd + the qemu compress filter: $ qemu-img create -f qcow2 output.qcow2 1M $ qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 & sleep 1 $ nbdkit -U - \ data '@32768 1*32768 @1015808 1*32768' \ --run 'nbdcopy $uri nbd://localhost -p' The nbdcopy command fails when zeroing the first 32K with: nbd://localhost: nbd_aio_zero: request is unaligned: Invalid argument This is a bug in nbdcopy because it ignores the minimum block size being correctly declared by the compress filter: $ nbdinfo nbd://localhost protocol: newstyle-fixed without TLS export="": export-size: 1048576 (1M) uri: nbd://localhost:10809/ contexts: ... block_size_minimum: 65536 <---- block_size_preferred: 65536 block_size_maximum: 33554432 The compress filter sets the minimum block size to the the same as the qcow2 cluster size here: https://gitlab.com/qemu-project/qemu/-/blob/cfe63e46be0a1f8a7fd2fd5547222f8344a43279/block/filter-compress.c#L117 I patched qemu to force this to 4K: - bs->bl.request_alignment = bdi.cluster_size; + //bs->bl.request_alignment = bdi.cluster_size; + bs->bl.request_alignment = 4096; and the copy above works, and the output file is compressed! So my question is, does the compress filter in qemu really need to declare the large minimum block size? I'm not especially concerned about efficiency, I'd prefer it just worked, and changing nbdcopy to understand block sizes is painful. Is it already adjustable at run time? (I tried using --image-opts like compress.request_alignment=4096 but it seems like the filter doesn't support anything I could think of, and I don't know how to list the supported options.) Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 11:07 Block alignment of qcow2 compress driver Richard W.M. Jones @ 2022-01-28 11:39 ` Hanna Reitz 2022-01-28 11:48 ` Richard W.M. Jones 2022-01-28 11:56 ` Richard W.M. Jones 2022-02-01 14:13 ` Vladimir Sementsov-Ogievskiy 1 sibling, 2 replies; 14+ messages in thread From: Hanna Reitz @ 2022-01-28 11:39 UTC (permalink / raw) To: Richard W.M. Jones, qemu-block, qemu-devel Cc: kwolf, andrey.shinkevich, eblake On 28.01.22 12:07, Richard W.M. Jones wrote: > The commands below set up a sparse RAM disk, with an allocated block > at offset 32K and another one at offset 1M-32K. Then it tries to copy > this to a compressed qcow2 file using qemu-nbd + the qemu compress > filter: > > $ qemu-img create -f qcow2 output.qcow2 1M > $ qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 & sleep 1 > $ nbdkit -U - \ > data '@32768 1*32768 @1015808 1*32768' \ > --run 'nbdcopy $uri nbd://localhost -p' > > The nbdcopy command fails when zeroing the first 32K with: > > nbd://localhost: nbd_aio_zero: request is unaligned: Invalid argument > > This is a bug in nbdcopy because it ignores the minimum block size > being correctly declared by the compress filter: > > $ nbdinfo nbd://localhost > protocol: newstyle-fixed without TLS > export="": > export-size: 1048576 (1M) > uri: nbd://localhost:10809/ > contexts: > ... > block_size_minimum: 65536 <---- > block_size_preferred: 65536 > block_size_maximum: 33554432 > > The compress filter sets the minimum block size to the the same as the > qcow2 cluster size here: > > https://gitlab.com/qemu-project/qemu/-/blob/cfe63e46be0a1f8a7fd2fd5547222f8344a43279/block/filter-compress.c#L117 > > I patched qemu to force this to 4K: > > - bs->bl.request_alignment = bdi.cluster_size; > + //bs->bl.request_alignment = bdi.cluster_size; > + bs->bl.request_alignment = 4096; > > and the copy above works, and the output file is compressed! > > So my question is, does the compress filter in qemu really need to > declare the large minimum block size? I'm not especially concerned > about efficiency, I'd prefer it just worked, and changing nbdcopy to > understand block sizes is painful. > > Is it already adjustable at run time? (I tried using --image-opts > like compress.request_alignment=4096 but it seems like the filter > doesn't support anything I could think of, and I don't know how to > list the supported options.) I’m kind of amazed this works because the qcow2 driver rejects unaligned compressed writes[1]. And if I apply your diff, I can see that, too: $ ./qemu-img create -f qcow2 test.qcow2 64M Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=67108864 lazy_refcounts=off refcount_bits=16 $ ./qemu-io -c 'write 0 32k' --image-opts \ driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=test.qcow2 write failed: Invalid argument So I actually don’t know why it works for you. OTOH, I don’t understand why the block size affects you over NBD, because I would have expected qemu to internally auto-align requests when they are not aligned (in bdrv_co_pwritev_part()). Like, when I set the NBD block driver’s alignment to 512[2], the following still succeeds: $ ./qemu-img create -f qcow2 test.qcow2 64M Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=67108864 lazy_refcounts=off refcount_bits=16 $ ./qemu-nbd --fork --image-opts \ driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=test.qcow2 $ ./qemu-io -c 'write 0 32k' -f raw nbd://localhost wrote 32768/32768 bytes at offset 0 32 KiB, 1 ops; 00.00 sec (9.034 MiB/sec and 289.0960 ops/sec) So I wonder why your diff works on your end, and I also wonder whether we couldn’t just have the NBD server expose some custom alignment (because qemu should auto-align all requests the NBD server generates anyway). Hanna [1] https://gitlab.com/qemu-project/qemu/-/blob/master/block/qcow2.c#L4662 [2] https://gitlab.com/qemu-project/qemu/-/blob/master/block/nbd.c#L1918 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 11:39 ` Hanna Reitz @ 2022-01-28 11:48 ` Richard W.M. Jones 2022-01-28 11:57 ` Hanna Reitz 2022-01-28 11:56 ` Richard W.M. Jones 1 sibling, 1 reply; 14+ messages in thread From: Richard W.M. Jones @ 2022-01-28 11:48 UTC (permalink / raw) To: Hanna Reitz; +Cc: kwolf, andrey.shinkevich, eblake, qemu-devel, qemu-block On Fri, Jan 28, 2022 at 12:39:11PM +0100, Hanna Reitz wrote: > So I actually don’t know why it works for you. OTOH, I don’t > understand why the block size affects you over NBD, because I would > have expected qemu to internally auto-align requests when they are > not aligned (in bdrv_co_pwritev_part()). I checked it again and my hack definitely fixes nbdcopy. But maybe that's expected if qemu-nbd is auto-aligning requests? (I'm only accessing the block layer through qemu-nbd, not with qemu-io) > Like, when I set the NBD > block driver’s alignment to 512[2], the following still succeeds: Did you just patch that line in the code or is there a qemu-nbd option/image-opts to do this? Rich. > [1] https://gitlab.com/qemu-project/qemu/-/blob/master/block/qcow2.c#L4662 > [2] https://gitlab.com/qemu-project/qemu/-/blob/master/block/nbd.c#L1918 -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 11:48 ` Richard W.M. Jones @ 2022-01-28 11:57 ` Hanna Reitz 2022-01-28 12:18 ` Richard W.M. Jones 0 siblings, 1 reply; 14+ messages in thread From: Hanna Reitz @ 2022-01-28 11:57 UTC (permalink / raw) To: Richard W.M. Jones Cc: kwolf, andrey.shinkevich, eblake, qemu-devel, qemu-block On 28.01.22 12:48, Richard W.M. Jones wrote: > On Fri, Jan 28, 2022 at 12:39:11PM +0100, Hanna Reitz wrote: >> So I actually don’t know why it works for you. OTOH, I don’t >> understand why the block size affects you over NBD, because I would >> have expected qemu to internally auto-align requests when they are >> not aligned (in bdrv_co_pwritev_part()). > I checked it again and my hack definitely fixes nbdcopy. But maybe > that's expected if qemu-nbd is auto-aligning requests? (I'm only > accessing the block layer through qemu-nbd, not with qemu-io) It’s not just qemu-io, with your diff[3] I get the same EINVAL over NBD, too: $ ./qemu-img create -f qcow2 test.qcow2 64M Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=67108864 lazy_refcounts=off refcount_bits=16 $ ./qemu-nbd --fork --image-opts \ driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=test.qcow2 $ ./qemu-io -c 'write 0 32k' -f raw nbd://localhost write failed: Invalid argument >> Like, when I set the NBD >> block driver’s alignment to 512[2], the following still succeeds: > Did you just patch that line in the code or is there a qemu-nbd > option/image-opts to do this? I just changed that line of code [2], as shown in [4]. I suppose the better thing to do would be to have an option for the NBD server to force-change the announced request alignment, because it can expect the qemu block layer code to auto-align requests through RMW. Doing it in the client is wrong, because the NBD server might want to detect that the client sends unaligned requests and reject them (though ours doesn’t, it just traces such events[5] – note that it’s explicitly noted there that qemu will auto-align requests). Hanna > Rich. > >> [1] https://gitlab.com/qemu-project/qemu/-/blob/master/block/qcow2.c#L4662 >> [2] https://gitlab.com/qemu-project/qemu/-/blob/master/block/nbd.c#L1918 [3] diff --git a/block/filter-compress.c b/block/filter-compress.c index d5be538619..5a11d77231 100644 --- a/block/filter-compress.c +++ b/block/filter-compress.c @@ -114,7 +114,7 @@ static void compress_refresh_limits(BlockDriverState *bs, Error **errp) return; } - bs->bl.request_alignment = bdi.cluster_size; + bs->bl.request_alignment = 4096; } [4] diff --git a/block/nbd.c b/block/nbd.c index 63dbfa807d..8608055800 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -1915,7 +1915,7 @@ static void nbd_refresh_limits(BlockDriverState *bs, Error **errp) s->info.base_allocation) ? 1 : BDRV_SECTOR_SIZE; } - bs->bl.request_alignment = min; + bs->bl.request_alignment = 512; bs->bl.max_pdiscard = QEMU_ALIGN_DOWN(INT_MAX, min); bs->bl.max_pwrite_zeroes = max; bs->bl.max_transfer = max; [5] https://gitlab.com/qemu-project/qemu/-/blob/master/nbd/server.c#L2355 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 11:57 ` Hanna Reitz @ 2022-01-28 12:18 ` Richard W.M. Jones 2022-01-28 12:30 ` Hanna Reitz 0 siblings, 1 reply; 14+ messages in thread From: Richard W.M. Jones @ 2022-01-28 12:18 UTC (permalink / raw) To: Hanna Reitz; +Cc: kwolf, andrey.shinkevich, eblake, qemu-devel, qemu-block On Fri, Jan 28, 2022 at 12:57:47PM +0100, Hanna Reitz wrote: > On 28.01.22 12:48, Richard W.M. Jones wrote: > >On Fri, Jan 28, 2022 at 12:39:11PM +0100, Hanna Reitz wrote: > >>So I actually don’t know why it works for you. OTOH, I don’t > >>understand why the block size affects you over NBD, because I would > >>have expected qemu to internally auto-align requests when they are > >>not aligned (in bdrv_co_pwritev_part()). > >I checked it again and my hack definitely fixes nbdcopy. But maybe > >that's expected if qemu-nbd is auto-aligning requests? (I'm only > >accessing the block layer through qemu-nbd, not with qemu-io) > > It’s not just qemu-io, with your diff[3] I get the same EINVAL over > NBD, too: > > $ ./qemu-img create -f qcow2 test.qcow2 64M > Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 > extended_l2=off compression_type=zlib size=67108864 > lazy_refcounts=off refcount_bits=16 > > $ ./qemu-nbd --fork --image-opts \ > driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=test.qcow2 > > $ ./qemu-io -c 'write 0 32k' -f raw nbd://localhost > write failed: Invalid argument Strange - is that error being generated by qemu's nbd client code? Here's my test not involving qemu's client code: $ qemu-nbd --version qemu-nbd 6.2.0 (qemu-6.2.0-2.fc36) $ qemu-img create -f qcow2 output.qcow2 1M Formatting 'output.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 $ qemu-nbd --fork --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 $ nbdsh -u nbd://localhost nbd> h.get_strict_mode() 31 nbd> h.set_strict_mode(31 & ~nbd.STRICT_ALIGN) nbd> h.get_strict_mode() 15 nbd> h.pwrite(b'1'*1024, 0) nbd> exit So an unaligned 1K write works (after disabling libnbd's client-side alignment checks). > I just changed that line of code [2], as shown in [4]. I suppose > the better thing to do would be to have an option for the NBD server > to force-change the announced request alignment, because it can > expect the qemu block layer code to auto-align requests through > RMW. Doing it in the client is wrong, because the NBD server might > want to detect that the client sends unaligned requests and reject > them (though ours doesn’t, it just traces such events[5] – note that > it’s explicitly noted there that qemu will auto-align requests). I know I said I didn't care about performance (in this case), but is there in fact a penalty to sending unaligned requests to the qcow2 layer? Or perhaps it cannot compress them? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 12:18 ` Richard W.M. Jones @ 2022-01-28 12:30 ` Hanna Reitz 2022-01-28 13:19 ` Kevin Wolf 2022-01-28 13:30 ` Richard W.M. Jones 0 siblings, 2 replies; 14+ messages in thread From: Hanna Reitz @ 2022-01-28 12:30 UTC (permalink / raw) To: Richard W.M. Jones Cc: kwolf, andrey.shinkevich, eblake, qemu-devel, qemu-block On 28.01.22 13:18, Richard W.M. Jones wrote: > On Fri, Jan 28, 2022 at 12:57:47PM +0100, Hanna Reitz wrote: >> On 28.01.22 12:48, Richard W.M. Jones wrote: >>> On Fri, Jan 28, 2022 at 12:39:11PM +0100, Hanna Reitz wrote: >>>> So I actually don’t know why it works for you. OTOH, I don’t >>>> understand why the block size affects you over NBD, because I would >>>> have expected qemu to internally auto-align requests when they are >>>> not aligned (in bdrv_co_pwritev_part()). >>> I checked it again and my hack definitely fixes nbdcopy. But maybe >>> that's expected if qemu-nbd is auto-aligning requests? (I'm only >>> accessing the block layer through qemu-nbd, not with qemu-io) >> It’s not just qemu-io, with your diff[3] I get the same EINVAL over >> NBD, too: >> >> $ ./qemu-img create -f qcow2 test.qcow2 64M >> Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 >> extended_l2=off compression_type=zlib size=67108864 >> lazy_refcounts=off refcount_bits=16 >> >> $ ./qemu-nbd --fork --image-opts \ >> driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=test.qcow2 >> >> $ ./qemu-io -c 'write 0 32k' -f raw nbd://localhost >> write failed: Invalid argument > Strange - is that error being generated by qemu's nbd client code? It’s generated by qcow2, namely the exact place I pointed out (as [1]). I can see that when I put an fprintf there. > Here's my test not involving qemu's client code: > > $ qemu-nbd --version > qemu-nbd 6.2.0 (qemu-6.2.0-2.fc36) > > $ qemu-img create -f qcow2 output.qcow2 1M > Formatting 'output.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 > > $ qemu-nbd --fork --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 > > $ nbdsh -u nbd://localhost > nbd> h.get_strict_mode() > 31 > nbd> h.set_strict_mode(31 & ~nbd.STRICT_ALIGN) > nbd> h.get_strict_mode() > 15 > nbd> h.pwrite(b'1'*1024, 0) > nbd> exit > > So an unaligned 1K write works (after disabling libnbd's client-side > alignment checks). It does work when having the NBD client side simply ignore the request alignment (my diff [4]), but it doesn’t work (for me) when only reducing the compress filter’s request alignment (my diff [3]). >> I just changed that line of code [2], as shown in [4]. I suppose >> the better thing to do would be to have an option for the NBD server >> to force-change the announced request alignment, because it can >> expect the qemu block layer code to auto-align requests through >> RMW. Doing it in the client is wrong, because the NBD server might >> want to detect that the client sends unaligned requests and reject >> them (though ours doesn’t, it just traces such events[5] – note that >> it’s explicitly noted there that qemu will auto-align requests). > I know I said I didn't care about performance (in this case), but is > there in fact a penalty to sending unaligned requests to the qcow2 > layer? Or perhaps it cannot compress them? In qcow2, only the whole cluster can be compressed, so writing compressed data means having to write the whole cluster. qcow2 could implement the padding by itself, but we decided to just leave the burden of only writing full clusters (with the COMPRESSED write flag) on the callers. Things like qemu-img convert and blockdev-backup just adhere to that by design; and the compress driver makes sure to set its request alignment accordingly so that requests to it will always be aligned to the cluster size (either by its user, or by the qemu block layer which performs the padding automatically). Hanna ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 12:30 ` Hanna Reitz @ 2022-01-28 13:19 ` Kevin Wolf 2022-01-28 13:36 ` Richard W.M. Jones 2022-01-28 13:30 ` Richard W.M. Jones 1 sibling, 1 reply; 14+ messages in thread From: Kevin Wolf @ 2022-01-28 13:19 UTC (permalink / raw) To: Hanna Reitz Cc: andrey.shinkevich, eblake, Richard W.M. Jones, qemu-block, qemu-devel Am 28.01.2022 um 13:30 hat Hanna Reitz geschrieben: > > > I just changed that line of code [2], as shown in [4]. I suppose > > > the better thing to do would be to have an option for the NBD server > > > to force-change the announced request alignment, because it can > > > expect the qemu block layer code to auto-align requests through > > > RMW. Doing it in the client is wrong, because the NBD server might > > > want to detect that the client sends unaligned requests and reject > > > them (though ours doesn’t, it just traces such events[5] – note that > > > it’s explicitly noted there that qemu will auto-align requests). > > I know I said I didn't care about performance (in this case), but is > > there in fact a penalty to sending unaligned requests to the qcow2 > > layer? Or perhaps it cannot compress them? > > In qcow2, only the whole cluster can be compressed, so writing compressed > data means having to write the whole cluster. qcow2 could implement the > padding by itself, but we decided to just leave the burden of only writing > full clusters (with the COMPRESSED write flag) on the callers. > > Things like qemu-img convert and blockdev-backup just adhere to that by > design; and the compress driver makes sure to set its request alignment > accordingly so that requests to it will always be aligned to the cluster > size (either by its user, or by the qemu block layer which performs the > padding automatically). I thought the more limiting factor would be that after auto-aligning the first request by padding with zeros, the second request to the same cluster would fail because compression doesn't allow using an already allocated cluster: /* Compression can't overwrite anything. Fail if the cluster was already * allocated. */ cluster_offset = get_l2_entry(s, l2_slice, l2_index); if (cluster_offset & L2E_OFFSET_MASK) { qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice); return -EIO; } Did you always just test a single request or why don't you run into this? I guess checking L2E_OFFSET_MASK is strictly speaking wrong because it's invalid for compressed clusters (qcow2_get_cluster_type() feels more appropriate), but in practice, you will always have non-zero data there, so it should error out here. Kevin ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 13:19 ` Kevin Wolf @ 2022-01-28 13:36 ` Richard W.M. Jones 0 siblings, 0 replies; 14+ messages in thread From: Richard W.M. Jones @ 2022-01-28 13:36 UTC (permalink / raw) To: Kevin Wolf; +Cc: andrey.shinkevich, Hanna Reitz, eblake, qemu-devel, qemu-block On Fri, Jan 28, 2022 at 02:19:44PM +0100, Kevin Wolf wrote: > Am 28.01.2022 um 13:30 hat Hanna Reitz geschrieben: > > > > I just changed that line of code [2], as shown in [4]. I suppose > > > > the better thing to do would be to have an option for the NBD server > > > > to force-change the announced request alignment, because it can > > > > expect the qemu block layer code to auto-align requests through > > > > RMW. Doing it in the client is wrong, because the NBD server might > > > > want to detect that the client sends unaligned requests and reject > > > > them (though ours doesn’t, it just traces such events[5] – note that > > > > it’s explicitly noted there that qemu will auto-align requests). > > > I know I said I didn't care about performance (in this case), but is > > > there in fact a penalty to sending unaligned requests to the qcow2 > > > layer? Or perhaps it cannot compress them? > > > > In qcow2, only the whole cluster can be compressed, so writing compressed > > data means having to write the whole cluster. qcow2 could implement the > > padding by itself, but we decided to just leave the burden of only writing > > full clusters (with the COMPRESSED write flag) on the callers. > > > > Things like qemu-img convert and blockdev-backup just adhere to that by > > design; and the compress driver makes sure to set its request alignment > > accordingly so that requests to it will always be aligned to the cluster > > size (either by its user, or by the qemu block layer which performs the > > padding automatically). > > I thought the more limiting factor would be that after auto-aligning the > first request by padding with zeros, the second request to the same > cluster would fail because compression doesn't allow using an already > allocated cluster: > > /* Compression can't overwrite anything. Fail if the cluster was already > * allocated. */ > cluster_offset = get_l2_entry(s, l2_slice, l2_index); > if (cluster_offset & L2E_OFFSET_MASK) { > qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice); > return -EIO; > } > > Did you always just test a single request or why don't you run into > this? I didn't test that one specifically and yes it does fail: $ qemu-img create -f qcow2 output.qcow2 1M Formatting 'output.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 $ qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 & [1] 2069037 $ nbdsh -u nbd://localhost nbd> h.set_strict_mode(h.get_strict_mode() & ~nbd.STRICT_ALIGN) nbd> buf = b'1' * 1024 nbd> h.pwrite(buf, 0) nbd> h.pwrite(buf, 1024) Traceback (most recent call last): File "/usr/lib64/python3.10/code.py", line 90, in runcode exec(code, self.locals) File "<console>", line 1, in <module> File "/usr/lib64/python3.10/site-packages/nbd.py", line 1631, in pwrite return libnbdmod.pwrite(self._o, buf, offset, flags) nbd.Error: nbd_pwrite: write: command failed: Input/output error (EIO) So what I said in the previous email about about minimum vs preferred is wrong :-( What's more interesting is that nbdcopy still appeared to work. Simulating what that was doing would be something like which also fails when I do it directly: nbd> h.pwrite(buf, 0) nbd> h.zero(1024, 1024) Traceback (most recent call last): File "/usr/lib64/python3.10/code.py", line 90, in runcode exec(code, self.locals) File "<console>", line 1, in <module> File "/usr/lib64/python3.10/site-packages/nbd.py", line 1782, in zero return libnbdmod.zero(self._o, count, offset, flags) nbd.Error: nbd_zero: write-zeroes: command failed: Input/output error (EIO) Anyway back to poking at nbdcopy to make it support block sizes ... > I guess checking L2E_OFFSET_MASK is strictly speaking wrong because it's > invalid for compressed clusters (qcow2_get_cluster_type() feels more > appropriate), but in practice, you will always have non-zero data there, > so it should error out here. > > Kevin Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 12:30 ` Hanna Reitz 2022-01-28 13:19 ` Kevin Wolf @ 2022-01-28 13:30 ` Richard W.M. Jones 2022-01-28 13:37 ` Richard W.M. Jones 2022-01-28 21:22 ` Eric Blake 1 sibling, 2 replies; 14+ messages in thread From: Richard W.M. Jones @ 2022-01-28 13:30 UTC (permalink / raw) To: Hanna Reitz; +Cc: kwolf, andrey.shinkevich, eblake, qemu-devel, qemu-block On Fri, Jan 28, 2022 at 01:30:43PM +0100, Hanna Reitz wrote: > On 28.01.22 13:18, Richard W.M. Jones wrote: > >On Fri, Jan 28, 2022 at 12:57:47PM +0100, Hanna Reitz wrote: > >>On 28.01.22 12:48, Richard W.M. Jones wrote: > >>>On Fri, Jan 28, 2022 at 12:39:11PM +0100, Hanna Reitz wrote: > >>>>So I actually don’t know why it works for you. OTOH, I don’t > >>>>understand why the block size affects you over NBD, because I would > >>>>have expected qemu to internally auto-align requests when they are > >>>>not aligned (in bdrv_co_pwritev_part()). > >>>I checked it again and my hack definitely fixes nbdcopy. But maybe > >>>that's expected if qemu-nbd is auto-aligning requests? (I'm only > >>>accessing the block layer through qemu-nbd, not with qemu-io) > >>It’s not just qemu-io, with your diff[3] I get the same EINVAL over > >>NBD, too: > >> > >>$ ./qemu-img create -f qcow2 test.qcow2 64M > >>Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 > >>extended_l2=off compression_type=zlib size=67108864 > >>lazy_refcounts=off refcount_bits=16 > >> > >>$ ./qemu-nbd --fork --image-opts \ > >>driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=test.qcow2 > >> > >>$ ./qemu-io -c 'write 0 32k' -f raw nbd://localhost > >>write failed: Invalid argument > >Strange - is that error being generated by qemu's nbd client code? > > It’s generated by qcow2, namely the exact place I pointed out (as > [1]). I can see that when I put an fprintf there. I can't reproduce this behaviour (with qemu @ cfe63e46be0a, the head of git at time of writing). I wonder if I'm doing something wrong? ++ /home/rjones/d/qemu/build/qemu-img create -f qcow2 output.qcow2 64k Formatting 'output.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=65536 lazy_refcounts=off refcount_bits=16 ++ sleep 1 ++ /home/rjones/d/qemu/build/qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 ++ /home/rjones/d/qemu/build/qemu-io -c 'write 0 32k' -f raw nbd://localhost wrote 32768/32768 bytes at offset 0 32 KiB, 1 ops; 00.02 sec (1.547 MiB/sec and 49.5067 ops/sec) > >I know I said I didn't care about performance (in this case), but is > >there in fact a penalty to sending unaligned requests to the qcow2 > >layer? Or perhaps it cannot compress them? > > In qcow2, only the whole cluster can be compressed, so writing > compressed data means having to write the whole cluster. qcow2 > could implement the padding by itself, but we decided to just leave > the burden of only writing full clusters (with the COMPRESSED write > flag) on the callers. I feel like this may be a bug in what qemu-nbd advertises. Currently it is: $ qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 & [2] 2068900 $ nbdinfo nbd://localhost protocol: newstyle-fixed without TLS export="": export-size: 65536 (64K) uri: nbd://localhost:10809/ contexts: base:allocation is_rotational: false is_read_only: false can_cache: true can_df: true can_fast_zero: true can_flush: true can_fua: true can_multi_conn: false can_trim: true can_zero: true block_size_minimum: 65536 <--- block_size_preferred: 65536 block_size_maximum: 33554432 block_size_preferred is (rightly) set to 64K, as that's what the compress + qcow2 combination prefers. But block_size_minimum sounds as if it should be 512 or 1, if qemu-nbd is able to reassemble smaller than preferred requests, even if they are suboptimal. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 13:30 ` Richard W.M. Jones @ 2022-01-28 13:37 ` Richard W.M. Jones 2022-01-28 21:22 ` Eric Blake 1 sibling, 0 replies; 14+ messages in thread From: Richard W.M. Jones @ 2022-01-28 13:37 UTC (permalink / raw) To: Hanna Reitz; +Cc: kwolf, andrey.shinkevich, eblake, qemu-devel, qemu-block On Fri, Jan 28, 2022 at 01:30:53PM +0000, Richard W.M. Jones wrote: > I feel like this may be a bug in what qemu-nbd advertises. Currently > it is: Ignore this email, see other reply. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 13:30 ` Richard W.M. Jones 2022-01-28 13:37 ` Richard W.M. Jones @ 2022-01-28 21:22 ` Eric Blake 1 sibling, 0 replies; 14+ messages in thread From: Eric Blake @ 2022-01-28 21:22 UTC (permalink / raw) To: Richard W.M. Jones Cc: kwolf, andrey.shinkevich, Hanna Reitz, qemu-devel, qemu-block On Fri, Jan 28, 2022 at 01:30:53PM +0000, Richard W.M. Jones wrote: > > > > In qcow2, only the whole cluster can be compressed, so writing > > compressed data means having to write the whole cluster. qcow2 > > could implement the padding by itself, but we decided to just leave > > the burden of only writing full clusters (with the COMPRESSED write > > flag) on the callers. > > I feel like this may be a bug in what qemu-nbd advertises. Currently > it is: > > $ qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 & > [2] 2068900 > $ nbdinfo nbd://localhost > block_size_minimum: 65536 <--- > block_size_preferred: 65536 > block_size_maximum: 33554432 > > block_size_preferred is (rightly) set to 64K, as that's what the > compress + qcow2 combination prefers. > > But block_size_minimum sounds as if it should be 512 or 1, if qemu-nbd > is able to reassemble smaller than preferred requests, even if they > are suboptimal. When compression is involved, 64k is the minimum block size at the qcow2 layer, but the qemu NBD layer is relying on the generic block core code to do RMW on anything smaller than that. If the RMW doesn't work, we may have a bug in the block layer. Even if it does appear to work, I'm not sure whether the block layer is able to recompress a cluster - it may be that the act of RMW on a partially-written initially-compressed cluster causes that cluster to no longer be compressed, at which point, while your write succeeded, you are no longer getting any compression. So, while it is a nice QoI feature of qemu-nbd that we can rely on the block layer RMW to accept client requests that were smaller than the advertised minimum block size, I still think the advertised size is correct, and that the client is in violation of the spec if it is requesting but then not honoring the advertised size. And yes, while it is a pain to hack nbdcopy to pay more attention to block sizing, I think in the long run it will be worth it. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 11:39 ` Hanna Reitz 2022-01-28 11:48 ` Richard W.M. Jones @ 2022-01-28 11:56 ` Richard W.M. Jones 2022-01-28 21:40 ` Eric Blake 1 sibling, 1 reply; 14+ messages in thread From: Richard W.M. Jones @ 2022-01-28 11:56 UTC (permalink / raw) To: Hanna Reitz; +Cc: kwolf, andrey.shinkevich, eblake, qemu-devel, qemu-block I hacked nbdcopy to ignore block alignment (the error actually comes from libnbd refusing to send the unaligned request, not from qemu-nbd), and indeed qemu-nbd accepts the unaligned request without complaint. Eric - maybe having some flag for nbdcopy to ignore unaligned requests when we know the server doesn't care (qemu-nbd) would work? Rich. --- a/copy/nbd-ops.c +++ b/copy/nbd-ops.c @@ -59,6 +59,10 @@ open_one_nbd_handle (struct rw_nbd *rwn) exit (EXIT_FAILURE); } + uint32_t sm = nbd_get_strict_mode (nbd); + sm &= ~LIBNBD_STRICT_ALIGN; + nbd_set_strict_mode (nbd, sm); + nbd_set_debug (nbd, verbose); if (extents && rwn->d == READING && -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 11:56 ` Richard W.M. Jones @ 2022-01-28 21:40 ` Eric Blake 0 siblings, 0 replies; 14+ messages in thread From: Eric Blake @ 2022-01-28 21:40 UTC (permalink / raw) To: Richard W.M. Jones Cc: kwolf, qemu-block, qemu-devel, Hanna Reitz, andrey.shinkevich, libguestfs Adding libnbd list (libguestfs) in cc On Fri, Jan 28, 2022 at 11:56:19AM +0000, Richard W.M. Jones wrote: > > I hacked nbdcopy to ignore block alignment (the error actually comes > from libnbd refusing to send the unaligned request, not from > qemu-nbd), and indeed qemu-nbd accepts the unaligned request without > complaint. And only after I already replied to your other email, did I then see your followup recommending to read this one instead ;) The NBD spec says the client is non-complying when sending under-sized requests. If the server accepts it anyway (presumably with RMW performance pessimizations, as qemu-nbd does), that's a QoI bonus. > > Eric - maybe having some flag for nbdcopy to ignore unaligned requests > when we know the server doesn't care (qemu-nbd) would work? Yeah, that might make sense - a command-line option for stating "I know the server has a nice QoI feature, and I don't mind the performance pessimization". Another thing to consider: the way the NBD spec is written, the rules about a client sending unaligned requests being non-compliant only applies to a client that requested block size information in the first place. If the client did not request block alignment information, the server should honor anything at alignment of 512 or above (even if it would prefer a larger minimum); performance may suffer, but this is needed to cater to older clients that don't know how to request alignments - and what's more, qemu-nbd specifically has code that changes what it advertises if the client did not query (that is, an advertisement of 64k is ONLY possible if the client requested alignment details). So maybe the question becomes whether libnbd needs a knob on whether to request alignment information. Libnbd commit 9e9c74755 (libnbd v1.3.10) is where I added the code to unconditionally query for alignment info. Given that we now know of a case where NOT querying causes qemu-nbd to behave differently in what it advertises, adding such a knob to the API makes total sense, at which point, 'nbdcopy --ignore-align' becomes a way to request that the client not request alignment in the first place, rather than as a way to call nbd_set_strict_mode() to turn off alignment checking. Looks like I have some API work do propose in libnbd... > > Rich. > > --- a/copy/nbd-ops.c > +++ b/copy/nbd-ops.c > @@ -59,6 +59,10 @@ open_one_nbd_handle (struct rw_nbd *rwn) > exit (EXIT_FAILURE); > } > > + uint32_t sm = nbd_get_strict_mode (nbd); > + sm &= ~LIBNBD_STRICT_ALIGN; > + nbd_set_strict_mode (nbd, sm); > + > nbd_set_debug (nbd, verbose); > > if (extents && rwn->d == READING && > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Block alignment of qcow2 compress driver 2022-01-28 11:07 Block alignment of qcow2 compress driver Richard W.M. Jones 2022-01-28 11:39 ` Hanna Reitz @ 2022-02-01 14:13 ` Vladimir Sementsov-Ogievskiy 1 sibling, 0 replies; 14+ messages in thread From: Vladimir Sementsov-Ogievskiy @ 2022-02-01 14:13 UTC (permalink / raw) To: Richard W.M. Jones, qemu-block, qemu-devel Cc: kwolf, andrey.shinkevich, hreitz, eblake 28.01.2022 14:07, Richard W.M. Jones wrote: > The commands below set up a sparse RAM disk, with an allocated block > at offset 32K and another one at offset 1M-32K. Then it tries to copy > this to a compressed qcow2 file using qemu-nbd + the qemu compress > filter: > > $ qemu-img create -f qcow2 output.qcow2 1M > $ qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 & sleep 1 > $ nbdkit -U - \ > data '@32768 1*32768 @1015808 1*32768' \ > --run 'nbdcopy $uri nbd://localhost -p' > > The nbdcopy command fails when zeroing the first 32K with: > > nbd://localhost: nbd_aio_zero: request is unaligned: Invalid argument > > This is a bug in nbdcopy because it ignores the minimum block size > being correctly declared by the compress filter: > > $ nbdinfo nbd://localhost > protocol: newstyle-fixed without TLS > export="": > export-size: 1048576 (1M) > uri: nbd://localhost:10809/ > contexts: > ... > block_size_minimum: 65536 <---- > block_size_preferred: 65536 > block_size_maximum: 33554432 > > The compress filter sets the minimum block size to the the same as the > qcow2 cluster size here: > > https://gitlab.com/qemu-project/qemu/-/blob/cfe63e46be0a1f8a7fd2fd5547222f8344a43279/block/filter-compress.c#L117 > > I patched qemu to force this to 4K: > > - bs->bl.request_alignment = bdi.cluster_size; > + //bs->bl.request_alignment = bdi.cluster_size; > + bs->bl.request_alignment = 4096; > > and the copy above works, and the output file is compressed! > > So my question is, does the compress filter in qemu really need to > declare the large minimum block size? I'm not especially concerned > about efficiency, I'd prefer it just worked, and changing nbdcopy to > understand block sizes is painful. > > Is it already adjustable at run time? (I tried using --image-opts > like compress.request_alignment=4096 but it seems like the filter > doesn't support anything I could think of, and I don't know how to > list the supported options.) > Hi! I didn't read the whole thread, so in case it was not mentioned: There is a limitation about compressed writes in qcow2 driver in Qemu: you can't do compressed write to the same cluster twice, the second write will fail. So, when we do some copying (or backup) and write cluster by cluster (or at least cluster-aligend) everything works. If you write partial cluster (with compress filter) it may work due to automatic RMW, but when you than try to write second part of same cluster it will fail. -- Best regards, Vladimir ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2022-02-01 16:09 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-01-28 11:07 Block alignment of qcow2 compress driver Richard W.M. Jones 2022-01-28 11:39 ` Hanna Reitz 2022-01-28 11:48 ` Richard W.M. Jones 2022-01-28 11:57 ` Hanna Reitz 2022-01-28 12:18 ` Richard W.M. Jones 2022-01-28 12:30 ` Hanna Reitz 2022-01-28 13:19 ` Kevin Wolf 2022-01-28 13:36 ` Richard W.M. Jones 2022-01-28 13:30 ` Richard W.M. Jones 2022-01-28 13:37 ` Richard W.M. Jones 2022-01-28 21:22 ` Eric Blake 2022-01-28 11:56 ` Richard W.M. Jones 2022-01-28 21:40 ` Eric Blake 2022-02-01 14:13 ` Vladimir Sementsov-Ogievskiy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).