* [RFC v2 0/6] block/io: avoid failure caused by misaligned BLKZEROOUT ioctl
@ 2026-01-09 12:08 Fiona Ebner
From: Fiona Ebner @ 2026-01-09 12:08 UTC
  To: qemu-devel; +Cc: qemu-block, hreitz, kwolf, fam, stefanha

Previous discussion here:
https://lore.kernel.org/qemu-devel/20260105143416.737482-1-f.ebner@proxmox.com/

Commit 5634622bcb ("file-posix: allow BLKZEROOUT with -t writeback")
enables the BLKZEROOUT ioctl when using 'writeback' cache, regressing
certain 'qemu-img convert' invocations, because of a pre-existing
issue. Namely, the BLKZEROOUT ioctl might fail with errno EINVAL when
the request is shorter than the block size of the block device.
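
As an illustration of the failure condition, here is a minimal sketch
of an alignment check that a caller could run before issuing the ioctl.
The helper name zeroout_range_is_aligned() is hypothetical and not part
of QEMU, and the sketch assumes the kernel rejects any range whose
offset or length is not a multiple of the device's block size:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not QEMU code: returns true if the byte range
 * [offset, offset + bytes) is a non-empty multiple of block_size and
 * starts on a block boundary. Assumption: a range failing this check
 * is the kind that BLKZEROOUT may reject with EINVAL.
 */
static bool zeroout_range_is_aligned(uint64_t offset, uint64_t bytes,
                                     uint32_t block_size)
{
    if (block_size == 0 || bytes == 0) {
        return false;
    }
    return offset % block_size == 0 && bytes % block_size == 0;
}
```

With a 4096-byte block size, a 512-byte request at offset 0 fails the
check, which corresponds to the "shorter than the block size" case.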

Stefan suggested prioritizing bl.pwrite_zeroes_alignment in
bdrv_co_do_zero_pwritev(). This RFC explores that approach and the
issues with qcow2 I encountered, where
bl.pwrite_zeroes_alignment = s->subcluster_size;
I would be happy to discuss potential solutions and whether we should
use this approach after all.

For example, in iotests 154 and 271, there are assertion failures,
because the padded request extends beyond the end of the image:
Assertion `offset + bytes <= bs->total_sectors * BDRV_SECTOR_SIZE ||
child->perm & BLK_PERM_RESIZE' failed.
The total image length is not necessarily aligned to the cluster size.
This could be solved by shortening the relevant requests in
bdrv_co_do_zero_pwritev() and submitting them without the
BDRV_REQ_ZERO_WRITE flag and with bl.request_alignment as the
alignment, see patch 5/6.
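
To picture the splitting involved, the following is a rough sketch, not
the code from the series. It divides a zero write into an unaligned
head, an aligned body, and an unaligned tail relative to a given
alignment. The type ZeroSplit and the function split_zero_write() are
hypothetical names. Only the body would be a candidate for the larger
write zeroes alignment; head and tail would be submitted with the
smaller request alignment, so nothing is padded past the image end:

```c
#include <stdint.h>

/* Hypothetical result type: lengths in bytes of the three parts. */
typedef struct {
    uint64_t head; /* unaligned part before the first boundary */
    uint64_t body; /* part that starts and ends on a boundary */
    uint64_t tail; /* unaligned part after the last boundary */
} ZeroSplit;

/*
 * Hypothetical helper, not QEMU code. 'align' must be non-zero.
 * If the request contains no complete aligned block, everything is
 * reported as 'head' and would be handled as an unaligned write.
 */
static ZeroSplit split_zero_write(uint64_t offset, uint64_t bytes,
                                  uint64_t align)
{
    ZeroSplit s = { 0, 0, 0 };
    uint64_t end = offset + bytes;
    uint64_t body_start = (offset + align - 1) / align * align;
    uint64_t body_end = end / align * align;

    if (body_start >= body_end) {
        s.head = bytes;
        return s;
    }
    s.head = body_start - offset;
    s.body = body_end - body_start;
    s.tail = end - body_end;
    return s;
}
```

For a 64KiB alignment, a request at offset 1000 with 200000 bytes
splits into a 64536-byte head, a 131072-byte body and a 4392-byte tail.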

For iotest 179, I would need to avoid clearing BDRV_REQ_ZERO_WRITE for
the head and tail parts as long as the buffer is fully zero.
Otherwise, we end up with more 'data' sectors in the target map. See
patch 6/6. With or without that, iotests 154 and 271 produce
different output (I think it might be expected, but haven't checked in
detail yet).

Another issue is exposed by iotest 177, where the (sub-)cluster size
is 1MiB, but max-transfer is only 64KiB, leading to assertion failures,
because max_transfer =
QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX), align);
evaluates to 0 (because align > bs->bl.max_transfer). This could be
fixed by doing the QEMU_ALIGN_DOWN only if the value is bigger than
align, see patch 4/6.
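
The arithmetic can be reproduced in isolation. The sketch below uses
local stand-ins (ALIGN_DOWN and min_non_zero()) that are assumed to
behave like QEMU_ALIGN_DOWN and MIN_NON_ZERO. The two function names
are hypothetical, and the guarded variant only illustrates the idea
described above; it is not the code from patch 4/6:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in assumed to match QEMU_ALIGN_DOWN: round n down to m. */
#define ALIGN_DOWN(n, m) ((n) / (m) * (m))

/* Stand-in assumed to match MIN_NON_ZERO: smaller non-zero value. */
static uint64_t min_non_zero(uint64_t a, uint64_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/* Unguarded calculation: yields 0 when align exceeds the limit. */
static uint64_t max_transfer_plain(uint64_t bl_max_transfer,
                                   uint64_t align)
{
    assert(align > 0);
    return ALIGN_DOWN(min_non_zero(bl_max_transfer, INT_MAX), align);
}

/* Guarded variant: only align down if the value is bigger than align. */
static uint64_t max_transfer_guarded(uint64_t bl_max_transfer,
                                     uint64_t align)
{
    uint64_t max_transfer;

    assert(align > 0);
    max_transfer = min_non_zero(bl_max_transfer, INT_MAX);
    if (max_transfer > align) {
        max_transfer = ALIGN_DOWN(max_transfer, align);
    }
    return max_transfer;
}
```

With a 64KiB limit and a 1MiB alignment, the plain variant returns 0,
while the guarded variant keeps the 64KiB limit.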

I'm also not sure what to do about iotests 204 and 177, which use
'opt-write-zero=15M' for the blkdebug driver (which assigns that value
to pwrite_zeroes_alignment), making an is_power_of_2(align) assertion
fail.
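
For reference, 15M cannot pass a power-of-two check. The sketch below
uses a local stand-in named is_pow2(), a hypothetical name assumed to
behave like the is_power_of_2() used in the assertion:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-in for a power-of-two test: a non-zero value is a power of
 * two exactly when it has a single bit set, i.e. clearing its lowest
 * set bit leaves zero.
 */
static bool is_pow2(uint64_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}
```

15MiB is 15728640 bytes, which has four bits set, so the check fails,
whereas 16MiB would pass.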

Yet another issue is the 'detect_zeroes' option. If the option is set,
bdrv_aligned_pwritev() might set the BDRV_REQ_ZERO_WRITE flag even if
the request is not aligned to pwrite_zeroes_alignment and the original
bug could resurface.

Best Regards,
Fiona


Fiona Ebner (6):
  block/io: pass alignment to bdrv_init_padding()
  block/io: add 'bytes' parameter to bdrv_padding_rmw_read()
  block/io: honor pwrite_zeroes_alignment in bdrv_co_do_zero_pwritev()
  block/io: safeguard max transfer calculation in bdrv_aligned_pwritev()
  block/io: handle image length not aligned to write zeroes alignment in
    bdrv_co_do_zero_pwritev()
  block/io: keep zero flag for head/tail parts of misaligned zero write
    when possible

 block/io.c | 78 ++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 55 insertions(+), 23 deletions(-)

-- 
2.47.3





Thread overview: 18+ messages
2026-01-09 12:08 [RFC v2 0/6] block/io: avoid failure caused by misaligned BLKZEROOUT ioctl Fiona Ebner
2026-01-09 12:08 ` [PATCH 1/6] block/io: pass alignment to bdrv_init_padding() Fiona Ebner
2026-01-09 12:08 ` [PATCH 2/6] block/io: add 'bytes' parameter to bdrv_padding_rmw_read() Fiona Ebner
2026-01-09 12:08 ` [PATCH 3/6] block/io: honor pwrite_zeroes_alignment in bdrv_co_do_zero_pwritev() Fiona Ebner
2026-01-09 12:08 ` [PATCH 4/6] block/io: safeguard max transfer calculation in bdrv_aligned_pwritev() Fiona Ebner
2026-01-19 19:34   ` Stefan Hajnoczi
2026-02-05 15:57     ` Kevin Wolf
2026-01-09 12:08 ` [PATCH 5/6] block/io: handle image length not aligned to write zeroes alignment in bdrv_co_do_zero_pwritev() Fiona Ebner
2026-01-09 12:08 ` [PATCH 6/6] block/io: keep zero flag for head/tail parts of misaligned zero write when possible Fiona Ebner
2026-02-02 22:10   ` Stefan Hajnoczi
2026-01-19 19:38 ` [RFC v2 0/6] block/io: avoid failure caused by misaligned BLKZEROOUT ioctl Stefan Hajnoczi
2026-02-02 22:16 ` Stefan Hajnoczi
2026-02-05 12:13   ` Fiona Ebner
2026-02-05 15:26     ` Stefan Hajnoczi
2026-02-05 16:02     ` Kevin Wolf
2026-05-27 21:06       ` Stefan Hajnoczi
2026-05-28  8:32         ` Fiona Ebner
2026-05-28 13:26           ` Stefan Hajnoczi
