* [PATCH v2 0/3] block: use pwrite_zeroes_alignment when writing first sector
@ 2025-10-07 14:16 Stefan Hajnoczi
2025-10-07 14:16 ` [PATCH v2 1/3] file-posix: populate pwrite_zeroes_alignment Stefan Hajnoczi
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 14:16 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Kevin Wolf, Jean-Louis Dupond,
Vladimir Sementsov-Ogievskiy, Hanna Reitz, Stefan Hajnoczi
v2:
- Simplify condition to if (!s->needs_alignment) in patch 1 [Vladimir]
This series fixes a bug I introduced in commit 5634622bcb33 ("file-posix: allow
BLKZEROOUT with -t writeback"). The Linux fallocate(2) and ioctl(BLKZEROOUT)
syscalls require logical block size alignment of the offset and length, even
when the file is opened in buffered I/O mode where read/write operations do not
require alignment.
The fix is to populate the pwrite_zeroes_alignment block limits field and to
use that limit in create_file_fallback_zero_first_sector().
One issue I want to raise is that pwrite_zeroes_alignment is an "optimal
alignment" hint. Hence create_file_fallback_zero_first_sector() had to be
modified to honor the limit explicitly. The block layer doesn't automatically
apply padding in order to align requests. This is different from how QEMU's
block layer pwrite/pread works, where it does automatically apply padding and
read/modify/write as necessary. If you want consistency, please let me know.
Stefan Hajnoczi (3):
file-posix: populate pwrite_zeroes_alignment
block: use pwrite_zeroes_alignment when writing first sector
iotests: add Linux loop device image creation test
include/system/block-backend-io.h | 1 +
block.c | 3 +-
block/block-backend.c | 11 ++++
block/file-posix.c | 16 +++++
tests/qemu-iotests/tests/loop-create-file | 59 +++++++++++++++++++
tests/qemu-iotests/tests/loop-create-file.out | 8 +++
6 files changed, 97 insertions(+), 1 deletion(-)
create mode 100755 tests/qemu-iotests/tests/loop-create-file
create mode 100644 tests/qemu-iotests/tests/loop-create-file.out
--
2.51.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 1/3] file-posix: populate pwrite_zeroes_alignment
2025-10-07 14:16 [PATCH v2 0/3] block: use pwrite_zeroes_alignment when writing first sector Stefan Hajnoczi
@ 2025-10-07 14:16 ` Stefan Hajnoczi
2025-10-07 15:13 ` Vladimir Sementsov-Ogievskiy
2025-10-07 14:16 ` [PATCH v2 2/3] block: use pwrite_zeroes_alignment when writing first sector Stefan Hajnoczi
2025-10-07 14:17 ` [PATCH v2 3/3] iotests: add Linux loop device image creation test Stefan Hajnoczi
2 siblings, 1 reply; 6+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 14:16 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Kevin Wolf, Jean-Louis Dupond,
Vladimir Sementsov-Ogievskiy, Hanna Reitz, Stefan Hajnoczi
Linux block devices require write zeroes alignment whereas files do not.
It may come as a surprise that block devices opened in buffered I/O mode
require the alignment for write zeroes requests although normal
read/write requests do not.
Therefore it is necessary to populate the pwrite_zeroes_alignment field.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/file-posix.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/block/file-posix.c b/block/file-posix.c
index 8c738674ce..827ffa77a5 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1602,6 +1602,22 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
bs->bl.pdiscard_alignment = dalign;
}
+
+#ifdef __linux__
+ /*
+ * Linux requires logical block size alignment for write zeroes even
+ * when normal reads/writes do not require alignment.
+ */
+ if (!s->needs_alignment) {
+ ret = probe_logical_blocksize(s->fd,
+ &bs->bl.pwrite_zeroes_alignment);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret,
+ "Failed to probe logical block size");
+ return;
+ }
+ }
+#endif /* __linux__ */
}
raw_refresh_zoned_limits(bs, &st, errp);
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/3] block: use pwrite_zeroes_alignment when writing first sector
2025-10-07 14:16 [PATCH v2 0/3] block: use pwrite_zeroes_alignment when writing first sector Stefan Hajnoczi
2025-10-07 14:16 ` [PATCH v2 1/3] file-posix: populate pwrite_zeroes_alignment Stefan Hajnoczi
@ 2025-10-07 14:16 ` Stefan Hajnoczi
2025-10-07 14:17 ` [PATCH v2 3/3] iotests: add Linux loop device image creation test Stefan Hajnoczi
2 siblings, 0 replies; 6+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 14:16 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Kevin Wolf, Jean-Louis Dupond,
Vladimir Sementsov-Ogievskiy, Hanna Reitz, Stefan Hajnoczi
Since commit 5634622bcb33 ("file-posix: allow BLKZEROOUT with -t
writeback"), qemu-img create errors out on a Linux loop block device
with a 4 KB sector size:
# dd if=/dev/zero of=blockfile bs=1M count=1024
# losetup --sector-size 4096 /dev/loop0 blockfile
# qemu-img create -f raw /dev/loop0 1G
Formatting '/dev/loop0', fmt=raw size=1073741824
qemu-img: /dev/loop0: Failed to clear the new image's first sector: Invalid argument
Use the pwrite_zeroes_alignment block limit to avoid misaligned
fallocate(2) or ioctl(BLKZEROOUT) in the block/file-posix.c block
driver.
Fixes: 5634622bcb33 ("file-posix: allow BLKZEROOUT with -t writeback")
Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/3127
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
include/system/block-backend-io.h | 1 +
block.c | 3 ++-
block/block-backend.c | 11 +++++++++++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/include/system/block-backend-io.h b/include/system/block-backend-io.h
index ba8dfcc7d0..6d5ac476fc 100644
--- a/include/system/block-backend-io.h
+++ b/include/system/block-backend-io.h
@@ -116,6 +116,7 @@ BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
void *opaque, int ret);
uint32_t blk_get_request_alignment(BlockBackend *blk);
+uint32_t blk_get_pwrite_zeroes_alignment(BlockBackend *blk);
uint32_t blk_get_max_transfer(BlockBackend *blk);
uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
diff --git a/block.c b/block.c
index 8848e9a7ed..be77e03904 100644
--- a/block.c
+++ b/block.c
@@ -606,12 +606,13 @@ create_file_fallback_zero_first_sector(BlockBackend *blk,
int64_t current_size,
Error **errp)
{
+ uint32_t alignment = blk_get_pwrite_zeroes_alignment(blk);
int64_t bytes_to_clear;
int ret;
GLOBAL_STATE_CODE();
- bytes_to_clear = MIN(current_size, BDRV_SECTOR_SIZE);
+ bytes_to_clear = MIN(current_size, MAX(BDRV_SECTOR_SIZE, alignment));
if (bytes_to_clear) {
ret = blk_co_pwrite_zeroes(blk, 0, bytes_to_clear, BDRV_REQ_MAY_UNMAP);
if (ret < 0) {
diff --git a/block/block-backend.c b/block/block-backend.c
index f8d6ba65c1..239d6eca37 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2305,6 +2305,17 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
}
+/* Returns the optimal write zeroes alignment, in bytes; guaranteed nonzero */
+uint32_t blk_get_pwrite_zeroes_alignment(BlockBackend *blk)
+{
+ BlockDriverState *bs = blk_bs(blk);
+ IO_CODE();
+ if (!bs) {
+ return BDRV_SECTOR_SIZE;
+ }
+ return bs->bl.pwrite_zeroes_alignment ?: bs->bl.request_alignment;
+}
+
/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
{
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 3/3] iotests: add Linux loop device image creation test
2025-10-07 14:16 [PATCH v2 0/3] block: use pwrite_zeroes_alignment when writing first sector Stefan Hajnoczi
2025-10-07 14:16 ` [PATCH v2 1/3] file-posix: populate pwrite_zeroes_alignment Stefan Hajnoczi
2025-10-07 14:16 ` [PATCH v2 2/3] block: use pwrite_zeroes_alignment when writing first sector Stefan Hajnoczi
@ 2025-10-07 14:17 ` Stefan Hajnoczi
2025-10-07 15:48 ` Vladimir Sementsov-Ogievskiy
2 siblings, 1 reply; 6+ messages in thread
From: Stefan Hajnoczi @ 2025-10-07 14:17 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Kevin Wolf, Jean-Louis Dupond,
Vladimir Sementsov-Ogievskiy, Hanna Reitz, Stefan Hajnoczi
This qemu-iotests test case is based on the reproducer that Jean-Louis
Dupond <jean-louis@dupond.be> shared in
https://gitlab.com/qemu-project/qemu/-/issues/3127.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
tests/qemu-iotests/tests/loop-create-file | 59 +++++++++++++++++++
tests/qemu-iotests/tests/loop-create-file.out | 8 +++
2 files changed, 67 insertions(+)
create mode 100755 tests/qemu-iotests/tests/loop-create-file
create mode 100644 tests/qemu-iotests/tests/loop-create-file.out
diff --git a/tests/qemu-iotests/tests/loop-create-file b/tests/qemu-iotests/tests/loop-create-file
new file mode 100755
index 0000000000..5ec75b046b
--- /dev/null
+++ b/tests/qemu-iotests/tests/loop-create-file
@@ -0,0 +1,59 @@
+#!/usr/bin/env bash
+# group: quick
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Copyright Red Hat, Inc.
+#
+# Test Linux loop device image creation
+#
+# This test verifies #3127 "qemu-img create fails on loop device with sector size 4096"
+# https://gitlab.com/qemu-project/qemu/-/issues/3127
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup() {
+ if [ -n "$loopdev" ]; then
+ sudo losetup --detach "$loopdev"
+ fi
+
+ _cleanup_test_img
+}
+
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+cd ..
+. ./common.rc
+. ./common.filter
+
+_supported_fmt raw
+_supported_proto file
+_supported_os Linux
+
+if ! sudo -n losetup &>/dev/null; then
+ _notrun "sudo losetup not available"
+fi
+
+echo
+echo "=== Create image on a 4 KB sector size loop device ==="
+echo
+
+_make_test_img -f $IMGFMT 1M
+
+loopdev=$(sudo losetup --sector-size 4096 --find --show "$TEST_IMG")
+if [ -z "$loopdev" ]; then
+ _fail
+fi
+
+sudo $QEMU_IMG_PROG create -f raw "$loopdev" 1M | \
+ sed -e "s#/dev/loop[0-9]\\+#LOOPDEV#g"
+
+# success, all done
+echo
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/tests/loop-create-file.out b/tests/qemu-iotests/tests/loop-create-file.out
new file mode 100644
index 0000000000..32d4155695
--- /dev/null
+++ b/tests/qemu-iotests/tests/loop-create-file.out
@@ -0,0 +1,8 @@
+QA output created by loop-create-file
+
+=== Create image on a 4 KB sector size loop device ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+Formatting 'LOOPDEV', fmt=raw size=1048576
+
+*** done
--
2.51.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/3] file-posix: populate pwrite_zeroes_alignment
2025-10-07 14:16 ` [PATCH v2 1/3] file-posix: populate pwrite_zeroes_alignment Stefan Hajnoczi
@ 2025-10-07 15:13 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-07 15:13 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: qemu-block, Kevin Wolf, Jean-Louis Dupond, Hanna Reitz
On 07.10.25 17:16, Stefan Hajnoczi wrote:
> Linux block devices require write zeroes alignment whereas files do not.
>
> It may come as a surprise that block devices opened in buffered I/O mode
> require the alignment for write zeroes requests although normal
> read/write requests do not.
>
> Therefore it is necessary to populate the pwrite_zeroes_alignment field.
>
> Signed-off-by: Stefan Hajnoczi<stefanha@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 3/3] iotests: add Linux loop device image creation test
2025-10-07 14:17 ` [PATCH v2 3/3] iotests: add Linux loop device image creation test Stefan Hajnoczi
@ 2025-10-07 15:48 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 6+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2025-10-07 15:48 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: qemu-block, Kevin Wolf, Jean-Louis Dupond, Hanna Reitz
On 07.10.25 17:17, Stefan Hajnoczi wrote:
> This qemu-iotests test case is based on the reproducer that Jean-Louis
> Dupond<jean-louis@dupond.be> shared in
> https://gitlab.com/qemu-project/qemu/-/issues/3127.
>
> Signed-off-by: Stefan Hajnoczi<stefanha@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-10-07 15:50 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-07 14:16 [PATCH v2 0/3] block: use pwrite_zeroes_alignment when writing first sector Stefan Hajnoczi
2025-10-07 14:16 ` [PATCH v2 1/3] file-posix: populate pwrite_zeroes_alignment Stefan Hajnoczi
2025-10-07 15:13 ` Vladimir Sementsov-Ogievskiy
2025-10-07 14:16 ` [PATCH v2 2/3] block: use pwrite_zeroes_alignment when writing first sector Stefan Hajnoczi
2025-10-07 14:17 ` [PATCH v2 3/3] iotests: add Linux loop device image creation test Stefan Hajnoczi
2025-10-07 15:48 ` Vladimir Sementsov-Ogievskiy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).