* [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites
@ 2018-06-19 13:43 Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 1/8] block: Move two block permission constants to the relevant enum Ari Sundholm
` (8 more replies)
0 siblings, 9 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel; +Cc: Ari Sundholm, Kevin Wolf, Max Reitz, qemu-block
This patch series adds a new block driver, blklogwrites, to QEMU. The
driver is given two block devices: a raw device backed by an image or a
host block device, and a log device, typically backed by a file, on
which writes to the raw device are logged.
The logging format used is the same as in the dm-log-writes target of
the Linux kernel device mapper. The log reflects the writes that have
been performed on the guest block device and flushed. To be strict, the
log may contain writes that have not been flushed yet, but they are
technically outside the bounds of the log until the next flush updates
the metadata in the log superblock. We believe these semantics to be
useful even though they may not be completely identical to those of
dm-log-writes.
This functionality can be used for crash consistency and fs consistency
testing in filesystem drivers, including on non-Linux guests and older
guests running Linux versions without support for dm-log-writes. This
is simple and useful. Admittedly this and more advanced things could
perhaps be done by extending the quorum driver, but this approach would
require re-engineering the functionality and involve a more complicated
setup, so we offer this simple solution which we have found useful
internally.
In the first patch of this series, two block permission constants are
moved from block.c to include/block/block.h to make them available
outside of block.c. The next patch uses these constants.
In patches 2-7, a mechanism to pass the block configuration of a block
backend to a block driver is introduced, and this mechanism is
integrated in various hw/block drivers.
In patch 8, the blklogwrites driver is introduced. The driver uses the
mechanism from the previous patches to align all requests to the
logical sector size of the underlying block backend of the guest block
device being logged and use that size as a basis for offsets and sizes
in the log entries. This means that the sector size in the log metadata
will be set to the logical sector size, and offsets and sizes of writes
will be expressed as a multiple of that.
v5:
- Reorder patches and squash blklogwrites driver patches together
- Rebase on current upstream master
- No functional changes
v4:
- Check return value of snprintf() (flagged by patchew)
- Don't use C99 for loop initial declaration (flagged by patchew, but
wasn't QEMU supposed to be C99? Oh well.)
- Add proper "Since" fields for blklogwrites in block-core.json
- Rebase on current upstream master
v3:
- Rebase on current upstream master
v2:
- Incorporate review feedback
- Add patches to apply block configurations in more block device
drivers
*** BLURB HERE ***
Aapo Vienamo (1):
block: Add blklogwrites
Ari Sundholm (7):
block: Move two block permission constants to the relevant enum
block: Add a mechanism for passing a block driver a block
configuration
hw/scsi/scsi-disk: Always apply block configuration to block driver
hw/ide/qdev: Always apply block configuration to block driver
hw/block/virtio-blk: Always apply block configuration to block driver
hw/block/nvme: Always apply block configuration to block driver
hw/block/fdc: Always apply block configuration to block driver
MAINTAINERS | 6 +
block.c | 6 -
block/Makefile.objs | 1 +
block/blklogwrites.c | 426 ++++++++++++++++++++++++++++++++++++++++++++++
block/io.c | 22 +++
hw/block/block.c | 12 +-
hw/block/fdc.c | 4 +
hw/block/nvme.c | 1 +
hw/block/virtio-blk.c | 1 +
hw/ide/qdev.c | 2 +
hw/scsi/scsi-disk.c | 1 +
include/block/block.h | 17 ++
include/block/block_int.h | 9 +
include/hw/block/block.h | 1 +
qapi/block-core.json | 30 +++-
15 files changed, 526 insertions(+), 13 deletions(-)
create mode 100644 block/blklogwrites.c
--
2.7.4
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v5 1/8] block: Move two block permission constants to the relevant enum
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
@ 2018-06-19 13:43 ` Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 2/8] block: Add a mechanism for passing a block driver a block configuration Ari Sundholm
` (7 subsequent siblings)
8 siblings, 0 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel
Cc: Ari Sundholm, Kevin Wolf, Max Reitz, open list:Block layer core
This allows using the two constants outside of block.c, which will
happen in a subsequent patch.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
---
block.c | 6 ------
include/block/block.h | 7 +++++++
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/block.c b/block.c
index afe30ca..20d54fe 100644
--- a/block.c
+++ b/block.c
@@ -1926,12 +1926,6 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
return 0;
}
-#define DEFAULT_PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
- | BLK_PERM_WRITE \
- | BLK_PERM_WRITE_UNCHANGED \
- | BLK_PERM_RESIZE)
-#define DEFAULT_PERM_UNCHANGED (BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH)
-
void bdrv_filter_default_perms(BlockDriverState *bs, BdrvChild *c,
const BdrvChildRole *role,
BlockReopenQueue *reopen_queue,
diff --git a/include/block/block.h b/include/block/block.h
index e677080..6466ce0 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -225,6 +225,13 @@ enum {
BLK_PERM_GRAPH_MOD = 0x10,
BLK_PERM_ALL = 0x1f,
+
+ DEFAULT_PERM_PASSTHROUGH = BLK_PERM_CONSISTENT_READ
+ | BLK_PERM_WRITE
+ | BLK_PERM_WRITE_UNCHANGED
+ | BLK_PERM_RESIZE,
+
+ DEFAULT_PERM_UNCHANGED = BLK_PERM_ALL & ~DEFAULT_PERM_PASSTHROUGH,
};
char *bdrv_perm_names(uint64_t perm);
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v5 2/8] block: Add a mechanism for passing a block driver a block configuration
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 1/8] block: Move two block permission constants to the relevant enum Ari Sundholm
@ 2018-06-19 13:43 ` Ari Sundholm
2018-06-29 10:07 ` Kevin Wolf
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 3/8] hw/scsi/scsi-disk: Always apply block configuration to block driver Ari Sundholm
` (6 subsequent siblings)
8 siblings, 1 reply; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel
Cc: Ari Sundholm, Stefan Hajnoczi, Fam Zheng, Kevin Wolf, Max Reitz,
John Snow, open list:Block I/O path
A block driver may need to know about the block configuration, most
critically the sector sizes, of a block backend for alignment purposes
or for some other reason. It doesn't seem like qemu has an existing
mechanism for a block backend to convey the required information to
the relevant block driver when it is being realized.
The need for this mechanism rises from the fact that a drive may not
have a backend at the time it is created, as devices are created after
drives during qemu startup. Therefore, a driver requiring information
about the block configuration can get this information when a backend
is created for it at the earliest. The most natural place for this to
take place seems to be in the realization functions of the various
block device drivers, such as scsi-hd. The interface proposed here
allows the block driver to receive information about the block
configuration and the associated backend through a new callback.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
---
block/io.c | 22 ++++++++++++++++++++++
hw/block/block.c | 12 +++++++++++-
include/block/block.h | 10 ++++++++++
include/block/block_int.h | 9 +++++++++
include/hw/block/block.h | 1 +
5 files changed, 53 insertions(+), 1 deletion(-)
diff --git a/block/io.c b/block/io.c
index b7beaee..3a06843 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2932,3 +2932,25 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, uint64_t src_offset,
bdrv_dec_in_flight(dst_bs);
return ret;
}
+
+void bdrv_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
+{
+ BlockDriver *drv = bs->drv;
+
+ if (!drv) {
+ return;
+ }
+
+ if (drv->bdrv_apply_blkconf) {
+ drv->bdrv_apply_blkconf(bs, conf);
+ return;
+ }
+
+ if (bs->file && bs->file->bs) {
+ bdrv_apply_blkconf(bs->file->bs, conf);
+ }
+
+ if (bs->drv->supports_backing && backing_bs(bs)) {
+ bdrv_apply_blkconf(backing_bs(bs), conf);
+ }
+}
diff --git a/hw/block/block.c b/hw/block/block.c
index cf0eb82..607b4fc 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -25,7 +25,7 @@ void blkconf_blocksizes(BlockConf *conf)
/* fill in detected values if they are not defined via qemu command line */
if (!conf->physical_block_size) {
if (!backend_ret) {
- conf->physical_block_size = blocksizes.phys;
+ conf->physical_block_size = blocksizes.phys;
} else {
conf->physical_block_size = BDRV_SECTOR_SIZE;
}
@@ -39,6 +39,16 @@ void blkconf_blocksizes(BlockConf *conf)
}
}
+void blkconf_apply_to_blkdrv(BlockConf *conf)
+{
+ BlockBackend *blk = conf->blk;
+ BlockDriverState *bs = blk_bs(blk);
+
+ if (bs) {
+ bdrv_apply_blkconf(bs, conf);
+ }
+}
+
bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
bool resizable, Error **errp)
{
diff --git a/include/block/block.h b/include/block/block.h
index 6466ce0..358127a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -10,6 +10,7 @@
#include "block/dirty-bitmap.h"
#include "block/blockjob.h"
#include "qemu/hbitmap.h"
+#include "hw/block/block.h"
/* block.c */
typedef struct BlockDriver BlockDriver;
@@ -652,4 +653,13 @@ void bdrv_unregister_buf(BlockDriverState *bs, void *host);
int coroutine_fn bdrv_co_copy_range(BdrvChild *src, uint64_t src_offset,
BdrvChild *dst, uint64_t dst_offset,
uint64_t bytes, BdrvRequestFlags flags);
+
+/*
+ * bdrv_apply_blkconf:
+ *
+ * Recursively finds the highest-level block drivers among the files and
+ * backing files that accept a block configuration and applies the given block
+ * configuration to them.
+ */
+void bdrv_apply_blkconf(BlockDriverState *bs, BlockConf *conf);
#endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 327e478..74c470e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -465,6 +465,15 @@ struct BlockDriver {
void (*bdrv_abort_perm_update)(BlockDriverState *bs);
/**
+ * Called to inform the driver of the block configuration of the virtual
+ * block device.
+ *
+ * This function is called by a block device realization function if the
+ * device wants to inform the block driver of its block configuration.
+ */
+ void (*bdrv_apply_blkconf)(BlockDriverState *bs, BlockConf *conf);
+
+ /**
* Returns in @nperm and @nshared the permissions that the driver for @bs
* needs on its child @c, based on the cumulative permissions requested by
* the parents in @parent_perm and @parent_shared.
diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index e9f9e22..de74d4f 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -76,6 +76,7 @@ bool blkconf_geometry(BlockConf *conf, int *trans,
unsigned cyls_max, unsigned heads_max, unsigned secs_max,
Error **errp);
void blkconf_blocksizes(BlockConf *conf);
+void blkconf_apply_to_blkdrv(BlockConf *conf);
bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
bool resizable, Error **errp);
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v5 3/8] hw/scsi/scsi-disk: Always apply block configuration to block driver
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 1/8] block: Move two block permission constants to the relevant enum Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 2/8] block: Add a mechanism for passing a block driver a block configuration Ari Sundholm
@ 2018-06-19 13:43 ` Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 4/8] hw/ide/qdev: " Ari Sundholm
` (5 subsequent siblings)
8 siblings, 0 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel; +Cc: Ari Sundholm, Paolo Bonzini, Fam Zheng
This allows the block driver to use the block configuration of the new
SCSI device. One use for this information is to set request limits
using this information.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
---
hw/scsi/scsi-disk.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index aeaf611..c6efc4e 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2412,6 +2412,7 @@ static void scsi_realize(SCSIDevice *dev, Error **errp)
blk_set_dev_ops(s->qdev.conf.blk, &scsi_disk_block_ops, s);
}
blk_set_guest_block_size(s->qdev.conf.blk, s->qdev.blocksize);
+ blkconf_apply_to_blkdrv(&s->qdev.conf);
blk_iostatus_enable(s->qdev.conf.blk);
}
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v5 4/8] hw/ide/qdev: Always apply block configuration to block driver
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
` (2 preceding siblings ...)
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 3/8] hw/scsi/scsi-disk: Always apply block configuration to block driver Ari Sundholm
@ 2018-06-19 13:43 ` Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 5/8] hw/block/virtio-blk: " Ari Sundholm
` (4 subsequent siblings)
8 siblings, 0 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel; +Cc: Ari Sundholm, John Snow, open list:IDE
This allows the block driver to use the block configuration of the new
IDE device. One use for this information is to set request limits
using this information.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
---
hw/ide/qdev.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 573b022..65d862a 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -213,6 +213,8 @@ static void ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind, Error **errp)
dev->serial = g_strdup(s->drive_serial_str);
}
+ blkconf_apply_to_blkdrv(&dev->conf);
+
add_boot_device_path(dev->conf.bootindex, &dev->qdev,
dev->unit ? "/disk@1" : "/disk@0");
}
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v5 5/8] hw/block/virtio-blk: Always apply block configuration to block driver
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
` (3 preceding siblings ...)
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 4/8] hw/ide/qdev: " Ari Sundholm
@ 2018-06-19 13:43 ` Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 6/8] hw/block/nvme: " Ari Sundholm
` (3 subsequent siblings)
8 siblings, 0 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel
Cc: Ari Sundholm, Stefan Hajnoczi, Michael S. Tsirkin, Kevin Wolf,
Max Reitz, open list:virtio-blk
This allows the block driver to use the block configuration of the new
VirtIO block device. One use for this information is to set request
limits using this information.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
---
hw/block/virtio-blk.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 225fe44..d5fb43b 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -974,6 +974,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
s->change = qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
blk_set_dev_ops(s->blk, &virtio_block_ops, s);
blk_set_guest_block_size(s->blk, s->conf.conf.logical_block_size);
+ blkconf_apply_to_blkdrv(&s->conf.conf);
blk_iostatus_enable(s->blk);
}
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v5 6/8] hw/block/nvme: Always apply block configuration to block driver
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
` (4 preceding siblings ...)
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 5/8] hw/block/virtio-blk: " Ari Sundholm
@ 2018-06-19 13:43 ` Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 7/8] hw/block/fdc: " Ari Sundholm
` (2 subsequent siblings)
8 siblings, 0 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel
Cc: Ari Sundholm, Keith Busch, Kevin Wolf, Max Reitz, open list:nvme
This allows the block driver to use the block configuration of the new
NVMe device. One use for this information is to set request limits
using this information.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
---
hw/block/nvme.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d5bf95b..21c3191 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1224,6 +1224,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
false, errp)) {
return;
}
+ blkconf_apply_to_blkdrv(&n->conf);
pci_conf = pci_dev->config;
pci_conf[PCI_INTERRUPT_PIN] = 1;
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v5 7/8] hw/block/fdc: Always apply block configuration to block driver
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
` (5 preceding siblings ...)
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 6/8] hw/block/nvme: " Ari Sundholm
@ 2018-06-19 13:43 ` Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 8/8] block: Add blklogwrites Ari Sundholm
2018-06-26 18:06 ` [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
8 siblings, 0 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel
Cc: Ari Sundholm, John Snow, Kevin Wolf, Max Reitz, open list:Floppy
This allows the block driver to use the block configuration of the new
floppy device. One use for this information is to set request limits
using this information.
Signed-off-by: Ari Sundholm <ari@tuxera.com>
---
hw/block/fdc.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index cd29e27..b11eeda 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -482,6 +482,8 @@ static void fd_change_cb(void *opaque, bool load, Error **errp)
errp)) {
return;
}
+
+ blkconf_apply_to_blkdrv(drive->conf);
}
drive->media_changed = 1;
@@ -594,6 +596,8 @@ static void floppy_drive_realize(DeviceState *qdev, Error **errp)
pick_drive_type(drive);
dev->type = drive->drive;
+ blkconf_apply_to_blkdrv(&dev->conf);
+
fd_revalidate(drive);
}
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [Qemu-devel] [PATCH v5 8/8] block: Add blklogwrites
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
` (6 preceding siblings ...)
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 7/8] hw/block/fdc: " Ari Sundholm
@ 2018-06-19 13:43 ` Ari Sundholm
2018-06-29 12:05 ` Kevin Wolf
2018-06-26 18:06 ` [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
8 siblings, 1 reply; 15+ messages in thread
From: Ari Sundholm @ 2018-06-19 13:43 UTC (permalink / raw)
To: qemu-devel
Cc: Aapo Vienamo, Ari Sundholm, Kevin Wolf, Max Reitz,
Markus Armbruster, Eric Blake, open list:Block layer core
From: Aapo Vienamo <aapo@tuxera.com>
Implements a block device write logging system, similar to Linux kernel
device mapper dm-log-writes. The write operations that are performed
on a block device are logged to a file or another block device. The
write log format is identical to the dm-log-writes format. Currently,
log markers are not supported.
This functionality can be used for crash consistency and fs consistency
testing. By implementing it in qemu, tests utilizing write logs can be
be used to test non-Linux drivers and older kernels.
The driver requires all requests to be aligned to the logical sector
size of the block backend of the guest block device being logged. This
is important because the dm-log-writes log format has a granularity of
one sector for both the offset and the size of each write. Also, using
the logical sector size as a basis for logging allows for faithful
logging of writes when testing drivers with block devices that have
unusual sector sizes.
The implementation is based on the blkverify and blkdebug block
drivers.
Signed-off-by: Aapo Vienamo <aapo@tuxera.com>
Signed-off-by: Ari Sundholm <ari@tuxera.com>
---
MAINTAINERS | 6 +
block/Makefile.objs | 1 +
block/blklogwrites.c | 426 +++++++++++++++++++++++++++++++++++++++++++++++++++
qapi/block-core.json | 30 +++-
4 files changed, 457 insertions(+), 6 deletions(-)
create mode 100644 block/blklogwrites.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 0fb5f38..f824d2a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2021,6 +2021,12 @@ S: Supported
F: block/quorum.c
L: qemu-block@nongnu.org
+blklogwrites
+M: Ari Sundholm <ari@tuxera.com>
+L: qemu-block@nongnu.org
+S: Supported
+F: block/blklogwrites.c
+
blkverify
M: Stefan Hajnoczi <stefanha@redhat.com>
L: qemu-block@nongnu.org
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 899bfb5..c8337bf 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -5,6 +5,7 @@ block-obj-y += qed-check.o
block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
block-obj-y += quorum.o
block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o
+block-obj-y += blklogwrites.o
block-obj-y += block-backend.o snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += file-posix.o
diff --git a/block/blklogwrites.c b/block/blklogwrites.c
new file mode 100644
index 0000000..decf5e5
--- /dev/null
+++ b/block/blklogwrites.c
@@ -0,0 +1,426 @@
+/*
+ * Write logging blk driver based on blkverify and blkdebug.
+ *
+ * Copyright (c) 2017 Tuomas Tynkkynen <tuomas@tuxera.com>
+ * Copyright (c) 2018 Aapo Vienamo <aapo@tuxera.com>
+ * Copyright (c) 2018 Ari Sundholm <ari@tuxera.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
+#include "block/block_int.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/cutils.h"
+#include "qemu/option.h"
+
+/* Disk format stuff - taken from Linux drivers/md/dm-log-writes.c */
+
+#define LOG_FLUSH_FLAG (1 << 0)
+#define LOG_FUA_FLAG (1 << 1)
+#define LOG_DISCARD_FLAG (1 << 2)
+#define LOG_MARK_FLAG (1 << 3)
+
+#define WRITE_LOG_VERSION 1ULL
+#define WRITE_LOG_MAGIC 0x6a736677736872ULL
+
+/* All fields are little-endian. */
+struct log_write_super {
+ uint64_t magic;
+ uint64_t version;
+ uint64_t nr_entries;
+ uint32_t sectorsize;
+};
+
+struct log_write_entry {
+ uint64_t sector;
+ uint64_t nr_sectors;
+ uint64_t flags;
+ uint64_t data_len;
+};
+
+/* End of disk format structures. */
+
+typedef struct {
+ BdrvChild *log_file;
+ uint32_t sectorsize;
+ uint32_t sectorbits;
+ uint64_t cur_log_sector;
+ uint64_t nr_entries;
+} BDRVBlkLogWritesState;
+
+static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags,
+ Error **errp)
+{
+ BDRVBlkLogWritesState *s = bs->opaque;
+ Error *local_err = NULL;
+ int ret;
+
+ /* Open the raw file */
+ bs->file = bdrv_open_child(NULL, options, "raw", bs, &child_file, false,
+ &local_err);
+ if (local_err) {
+ ret = -EINVAL;
+ error_propagate(errp, local_err);
+ goto fail;
+ }
+
+ s->sectorsize = BDRV_SECTOR_SIZE; /* May be updated later */
+ s->sectorbits = BDRV_SECTOR_BITS;
+ s->cur_log_sector = 1;
+ s->nr_entries = 0;
+
+ /* Open the log file */
+ s->log_file = bdrv_open_child(NULL, options, "log", bs, &child_file, false,
+ &local_err);
+ if (local_err) {
+ ret = -EINVAL;
+ error_propagate(errp, local_err);
+ goto fail;
+ }
+
+ ret = 0;
+fail:
+ if (ret < 0) {
+ bdrv_unref_child(bs, bs->file);
+ bs->file = NULL;
+ }
+ return ret;
+}
+
+static void blk_log_writes_close(BlockDriverState *bs)
+{
+ BDRVBlkLogWritesState *s = bs->opaque;
+
+ bdrv_unref_child(bs, s->log_file);
+ s->log_file = NULL;
+}
+
+static int64_t blk_log_writes_getlength(BlockDriverState *bs)
+{
+ return bdrv_getlength(bs->file->bs);
+}
+
+static void blk_log_writes_refresh_filename(BlockDriverState *bs,
+ QDict *options)
+{
+ BDRVBlkLogWritesState *s = bs->opaque;
+
+ /* bs->file->bs has already been refreshed */
+ bdrv_refresh_filename(s->log_file->bs);
+
+ if (bs->file->bs->full_open_options
+ && s->log_file->bs->full_open_options)
+ {
+ QDict *opts = qdict_new();
+ qdict_put_obj(opts, "driver",
+ QOBJECT(qstring_from_str("blklogwrites")));
+
+ qobject_ref(bs->file->bs->full_open_options);
+ qdict_put_obj(opts, "raw", QOBJECT(bs->file->bs->full_open_options));
+ qobject_ref(s->log_file->bs->full_open_options);
+ qdict_put_obj(opts, "log",
+ QOBJECT(s->log_file->bs->full_open_options));
+
+ bs->full_open_options = opts;
+ }
+
+ if (bs->file->bs->exact_filename[0]
+ && s->log_file->bs->exact_filename[0])
+ {
+ int ret = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
+ "blklogwrites:%s:%s",
+ bs->file->bs->exact_filename,
+ s->log_file->bs->exact_filename);
+
+ if (ret >= sizeof(bs->exact_filename)) {
+ /* An overflow makes the filename unusable, so do not report any */
+ bs->exact_filename[0] = '\0';
+ }
+ }
+}
+
+static void blk_log_writes_child_perm(BlockDriverState *bs, BdrvChild *c,
+ const BdrvChildRole *role,
+ BlockReopenQueue *ro_q,
+ uint64_t perm, uint64_t shrd,
+ uint64_t *nperm, uint64_t *nshrd)
+{
+ if (!c) {
+ *nperm = perm & DEFAULT_PERM_PASSTHROUGH;
+ *nshrd = (shrd & DEFAULT_PERM_PASSTHROUGH) | DEFAULT_PERM_UNCHANGED;
+ return;
+ }
+
+ if (!strcmp(c->name, "log")) {
+ bdrv_format_default_perms(bs, c, role, ro_q, perm, shrd, nperm, nshrd);
+ } else {
+ bdrv_filter_default_perms(bs, c, role, ro_q, perm, shrd, nperm, nshrd);
+ }
+}
+
+static void blk_log_writes_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+ if (bs->bl.request_alignment < BDRV_SECTOR_SIZE) {
+ bs->bl.request_alignment = BDRV_SECTOR_SIZE;
+
+ if (bs->bl.pdiscard_alignment &&
+ bs->bl.pdiscard_alignment < bs->bl.request_alignment)
+ bs->bl.pdiscard_alignment = bs->bl.request_alignment;
+ if (bs->bl.pwrite_zeroes_alignment &&
+ bs->bl.pwrite_zeroes_alignment < bs->bl.request_alignment)
+ bs->bl.pwrite_zeroes_alignment = bs->bl.request_alignment;
+ }
+}
+
+static inline uint32_t blk_log_writes_log2(uint32_t value)
+{
+ assert(value > 0);
+ return 31 - clz32(value);
+}
+
+static void blk_log_writes_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
+{
+ BDRVBlkLogWritesState *s = bs->opaque;
+ assert(bs && conf && conf->blk && s);
+
+ s->sectorsize = conf->logical_block_size;
+ s->sectorbits = blk_log_writes_log2(s->sectorsize);
+ bs->bl.request_alignment = s->sectorsize;
+ if (conf->discard_granularity != (uint32_t)-1) {
+ bs->bl.pdiscard_alignment = conf->discard_granularity;
+ }
+
+ if (bs->bl.pdiscard_alignment &&
+ bs->bl.pdiscard_alignment < bs->bl.request_alignment) {
+ bs->bl.pdiscard_alignment = bs->bl.request_alignment;
+ }
+ if (bs->bl.pwrite_zeroes_alignment &&
+ bs->bl.pwrite_zeroes_alignment < bs->bl.request_alignment) {
+ bs->bl.pwrite_zeroes_alignment = bs->bl.request_alignment;
+ }
+}
+
+static int coroutine_fn
+blk_log_writes_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags)
+{
+ return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
+}
+
+typedef struct BlkLogWritesFileReq {
+ BlockDriverState *bs;
+ uint64_t offset;
+ uint64_t bytes;
+ int file_flags;
+ QEMUIOVector *qiov;
+ int (*func)(struct BlkLogWritesFileReq *r);
+ int file_ret;
+} BlkLogWritesFileReq;
+
+typedef struct {
+ BlockDriverState *bs;
+ QEMUIOVector *qiov;
+ struct log_write_entry entry;
+ uint64_t zero_size;
+ int log_ret;
+} BlkLogWritesLogReq;
+
+static void coroutine_fn blk_log_writes_co_do_log(BlkLogWritesLogReq *lr)
+{
+ BDRVBlkLogWritesState *s = lr->bs->opaque;
+ uint64_t cur_log_offset = s->cur_log_sector << s->sectorbits;
+
+ s->nr_entries++;
+ s->cur_log_sector +=
+ ROUND_UP(lr->qiov->size, s->sectorsize) >> s->sectorbits;
+
+ lr->log_ret = bdrv_co_pwritev(s->log_file, cur_log_offset, lr->qiov->size,
+ lr->qiov, 0);
+
+ /* Logging for the "write zeroes" operation */
+ if (lr->log_ret == 0 && lr->zero_size) {
+ cur_log_offset = s->cur_log_sector << s->sectorbits;
+ s->cur_log_sector +=
+ ROUND_UP(lr->zero_size, s->sectorsize) >> s->sectorbits;
+
+ lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, cur_log_offset,
+ lr->zero_size, 0);
+ }
+
+ /* Update super block on flush */
+ if (lr->log_ret == 0 && lr->entry.flags & LOG_FLUSH_FLAG) {
+ struct log_write_super super = {
+ .magic = cpu_to_le64(WRITE_LOG_MAGIC),
+ .version = cpu_to_le64(WRITE_LOG_VERSION),
+ .nr_entries = cpu_to_le64(s->nr_entries),
+ .sectorsize = cpu_to_le32(s->sectorsize),
+ };
+ void *zeroes = g_malloc0(s->sectorsize - sizeof(super));
+ QEMUIOVector qiov;
+
+ qemu_iovec_init(&qiov, 2);
+ qemu_iovec_add(&qiov, &super, sizeof(super));
+ qemu_iovec_add(&qiov, zeroes, s->sectorsize - sizeof(super));
+
+ lr->log_ret =
+ bdrv_co_pwritev(s->log_file, 0, s->sectorsize, &qiov, 0);
+ if (lr->log_ret == 0) {
+ lr->log_ret = bdrv_co_flush(s->log_file->bs);
+ }
+ qemu_iovec_destroy(&qiov);
+ g_free(zeroes);
+ }
+}
+
+static void coroutine_fn blk_log_writes_co_do_file(BlkLogWritesFileReq *fr)
+{
+ fr->file_ret = fr->func(fr);
+}
+
+static int coroutine_fn
+blk_log_writes_co_log(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags,
+ int (*file_func)(BlkLogWritesFileReq *r),
+ uint64_t entry_flags, bool is_zero_write)
+{
+ QEMUIOVector log_qiov;
+ size_t niov = qiov ? qiov->niov : 0;
+ size_t i;
+ BDRVBlkLogWritesState *s = bs->opaque;
+ BlkLogWritesFileReq fr = {
+ .bs = bs,
+ .offset = offset,
+ .bytes = bytes,
+ .file_flags = flags,
+ .qiov = qiov,
+ .func = file_func,
+ };
+ BlkLogWritesLogReq lr = {
+ .bs = bs,
+ .qiov = &log_qiov,
+ .entry = {
+ .sector = cpu_to_le64(offset >> s->sectorbits),
+ .nr_sectors = cpu_to_le64(bytes >> s->sectorbits),
+ .flags = cpu_to_le64(entry_flags),
+ .data_len = 0,
+ },
+ .zero_size = is_zero_write ? bytes : 0,
+ };
+ void *zeroes = g_malloc0(s->sectorsize - sizeof(lr.entry));
+
+ assert((1 << s->sectorbits) == s->sectorsize);
+ assert(bs->bl.request_alignment == s->sectorsize);
+ assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
+ assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
+
+ qemu_iovec_init(&log_qiov, niov + 2);
+ qemu_iovec_add(&log_qiov, &lr.entry, sizeof(lr.entry));
+ qemu_iovec_add(&log_qiov, zeroes, s->sectorsize - sizeof(lr.entry));
+ for (i = 0; i < niov; ++i) {
+ qemu_iovec_add(&log_qiov, qiov->iov[i].iov_base, qiov->iov[i].iov_len);
+ }
+
+ blk_log_writes_co_do_file(&fr);
+ blk_log_writes_co_do_log(&lr);
+
+ qemu_iovec_destroy(&log_qiov);
+ g_free(zeroes);
+
+ if (lr.log_ret < 0) {
+ return lr.log_ret;
+ }
+
+ return fr.file_ret;
+}
+
+static int coroutine_fn
+blk_log_writes_co_do_file_pwritev(BlkLogWritesFileReq *fr)
+{
+ return bdrv_co_pwritev(fr->bs->file, fr->offset, fr->bytes,
+ fr->qiov, fr->file_flags);
+}
+
+static int coroutine_fn
+blk_log_writes_co_do_file_pwrite_zeroes(BlkLogWritesFileReq *fr)
+{
+ return bdrv_co_pwrite_zeroes(fr->bs->file, fr->offset, fr->bytes,
+ fr->file_flags);
+}
+
+static int coroutine_fn blk_log_writes_co_do_file_flush(BlkLogWritesFileReq *fr)
+{
+ return bdrv_co_flush(fr->bs->file->bs);
+}
+
+static int coroutine_fn
+blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr)
+{
+ return bdrv_co_pdiscard(fr->bs->file->bs, fr->offset, fr->bytes);
+}
+
+static int coroutine_fn
+blk_log_writes_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, int flags)
+{
+ return blk_log_writes_co_log(bs, offset, bytes, qiov, flags,
+ blk_log_writes_co_do_file_pwritev, 0, false);
+}
+
+static int coroutine_fn
+blk_log_writes_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
+ BdrvRequestFlags flags)
+{
+ return blk_log_writes_co_log(bs, offset, bytes, NULL, flags,
+ blk_log_writes_co_do_file_pwrite_zeroes, 0,
+ true);
+}
+
+static int coroutine_fn blk_log_writes_co_flush_to_disk(BlockDriverState *bs)
+{
+ return blk_log_writes_co_log(bs, 0, 0, NULL, 0,
+ blk_log_writes_co_do_file_flush,
+ LOG_FLUSH_FLAG, false);
+}
+
+static int coroutine_fn
+blk_log_writes_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
+{
+ return blk_log_writes_co_log(bs, offset, count, NULL, 0,
+ blk_log_writes_co_do_file_pdiscard,
+ LOG_DISCARD_FLAG, false);
+}
+
+static BlockDriver bdrv_blk_log_writes = {
+ .format_name = "blklogwrites",
+ .protocol_name = "blklogwrites",
+ .instance_size = sizeof(BDRVBlkLogWritesState),
+
+ .bdrv_file_open = blk_log_writes_open,
+ .bdrv_close = blk_log_writes_close,
+ .bdrv_getlength = blk_log_writes_getlength,
+ .bdrv_refresh_filename = blk_log_writes_refresh_filename,
+ .bdrv_child_perm = blk_log_writes_child_perm,
+ .bdrv_refresh_limits = blk_log_writes_refresh_limits,
+ .bdrv_apply_blkconf = blk_log_writes_apply_blkconf,
+
+ .bdrv_co_preadv = blk_log_writes_co_preadv,
+ .bdrv_co_pwritev = blk_log_writes_co_pwritev,
+ .bdrv_co_pwrite_zeroes = blk_log_writes_co_pwrite_zeroes,
+ .bdrv_co_flush_to_disk = blk_log_writes_co_flush_to_disk,
+ .bdrv_co_pdiscard = blk_log_writes_co_pdiscard,
+ .bdrv_co_block_status = bdrv_co_block_status_from_file,
+
+ .is_filter = true,
+};
+
+static void bdrv_blk_log_writes_init(void)
+{
+ bdrv_register(&bdrv_blk_log_writes);
+}
+
+block_init(bdrv_blk_log_writes_init);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ab629d1..b4255d2 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2509,16 +2509,17 @@
# @throttle: Since 2.11
# @nvme: Since 2.12
# @copy-on-read: Since 3.0
+# @blklogwrites: Since 3.0
#
# Since: 2.9
##
{ 'enum': 'BlockdevDriver',
- 'data': [ 'blkdebug', 'blkverify', 'bochs', 'cloop', 'copy-on-read',
- 'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
- 'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs',
- 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow', 'qcow2', 'qed',
- 'quorum', 'raw', 'rbd', 'replication', 'sheepdog', 'ssh',
- 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
+ 'data': [ 'blkdebug', 'blklogwrites', 'blkverify', 'bochs', 'cloop',
+ 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps', 'gluster',
+ 'host_cdrom', 'host_device', 'http', 'https', 'iscsi', 'luks',
+ 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow',
+ 'qcow2', 'qed', 'quorum', 'raw', 'rbd', 'replication', 'sheepdog',
+ 'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] }
##
# @BlockdevOptionsFile:
@@ -3033,6 +3034,21 @@
'*set-state': ['BlkdebugSetStateOptions'] } }
##
+# @BlockdevOptionsBlklogwrites:
+#
+# Driver specific block device options for blklogwrites.
+#
+# @raw: block device
+#
+# @log: block device used to log writes on @raw
+#
+# Since: 3.0
+##
+{ 'struct': 'BlockdevOptionsBlklogwrites',
+ 'data': { 'raw': 'BlockdevRef',
+ 'log': 'BlockdevRef' } }
+
+##
# @BlockdevOptionsBlkverify:
#
# Driver specific block device options for blkverify.
@@ -3546,6 +3562,7 @@
'discriminator': 'driver',
'data': {
'blkdebug': 'BlockdevOptionsBlkdebug',
+ 'blklogwrites': 'BlockdevOptionsBlklogwrites',
'blkverify': 'BlockdevOptionsBlkverify',
'bochs': 'BlockdevOptionsGenericFormat',
'cloop': 'BlockdevOptionsGenericFormat',
@@ -4074,6 +4091,7 @@
'discriminator': 'driver',
'data': {
'blkdebug': 'BlockdevCreateNotSupported',
+ 'blklogwrites': 'BlockdevCreateNotSupported',
'blkverify': 'BlockdevCreateNotSupported',
'bochs': 'BlockdevCreateNotSupported',
'cloop': 'BlockdevCreateNotSupported',
--
2.7.4
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
` (7 preceding siblings ...)
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 8/8] block: Add blklogwrites Ari Sundholm
@ 2018-06-26 18:06 ` Ari Sundholm
8 siblings, 0 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-26 18:06 UTC (permalink / raw)
To: qemu-devel; +Cc: Kevin Wolf, Max Reitz, qemu-block
Just a weekly ping. Any comments appreciated.
On 06/19/2018 04:43 PM, Ari Sundholm wrote:
> This patch series adds a new block driver, blklogwrites, to QEMU. The
> driver is given two block devices: a raw device backed by an image or a
> host block device, and a log device, typically backed by a file, on
> which writes to the raw device are logged.
>
> The logging format used is the same as in the dm-log-writes target of
> the Linux kernel device mapper. The log reflects the writes that have
> been performed on the guest block device and flushed. To be strict, the
> log may contain writes that have not been flushed yet, but they are
> technically outside the bounds of the log until the next flush updates
> the metadata in the log superblock. We believe these semantics to be
> useful even though they may not be completely identical to those of
> dm-log-writes.
>
> This functionality can be used for crash consistency and fs consistency
> testing in filesystem drivers, including on non-Linux guests and older
> guests running Linux versions without support for dm-log-writes. This
> is simple and useful. Admittedly this and more advanced things could
> perhaps be done by extending the quorum driver, but this approach would
> require re-engineering the functionality and involve a more complicated
> setup, so we offer this simple solution which we have found useful
> internally.
>
> In the first patch of this series, two block permission constants are
> moved from block.c to include/block/block.h to make them available
> outside of block.c. The next patch uses these constants.
>
> In patches 2-7, a mechanism to pass the block configuration of a block
> backend to a block driver is introduced, and this mechanism is
> integrated in various hw/block drivers.
>
> In patch 8, the blklogwrites driver is introduced. The driver uses the
> mechanism from the previous patches to align all requests to the
> logical sector size of the underlying block backend of the guest block
> device being logged and use that size as a basis for offsets and sizes
> in the log entries. This means that the sector size in the log metadata
> will be set to the logical sector size, and offsets and sizes of writes
> will be expressed as a multiple of that.
>
> v5:
> - Reorder patches and squash blklogwrites driver patches together
> - Rebase on current upstream master
> - No functional changes
>
> v4:
> - Check return value of snprintf() (flagged by patchew)
> - Don't use C99 for loop initial declaration (flagged by patchew, but
> wasn't QEMU supposed to be C99? Oh well.)
> - Add proper "Since" fields for blklogwrites in block-core.json
> - Rebase on current upstream master
>
> v3:
> - Rebase on current upstream master
>
> v2:
> - Incorporate review feedback
> - Add patches to apply block configurations in more block device
> drivers
>
>
> *** BLURB HERE ***
>
> Aapo Vienamo (1):
> block: Add blklogwrites
>
> Ari Sundholm (7):
> block: Move two block permission constants to the relevant enum
> block: Add a mechanism for passing a block driver a block
> configuration
> hw/scsi/scsi-disk: Always apply block configuration to block driver
> hw/ide/qdev: Always apply block configuration to block driver
> hw/block/virtio-blk: Always apply block configuration to block driver
> hw/block/nvme: Always apply block configuration to block driver
> hw/block/fdc: Always apply block configuration to block driver
>
> MAINTAINERS | 6 +
> block.c | 6 -
> block/Makefile.objs | 1 +
> block/blklogwrites.c | 426 ++++++++++++++++++++++++++++++++++++++++++++++
> block/io.c | 22 +++
> hw/block/block.c | 12 +-
> hw/block/fdc.c | 4 +
> hw/block/nvme.c | 1 +
> hw/block/virtio-blk.c | 1 +
> hw/ide/qdev.c | 2 +
> hw/scsi/scsi-disk.c | 1 +
> include/block/block.h | 17 ++
> include/block/block_int.h | 9 +
> include/hw/block/block.h | 1 +
> qapi/block-core.json | 30 +++-
> 15 files changed, 526 insertions(+), 13 deletions(-)
> create mode 100644 block/blklogwrites.c
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/8] block: Add a mechanism for passing a block driver a block configuration
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 2/8] block: Add a mechanism for passing a block driver a block configuration Ari Sundholm
@ 2018-06-29 10:07 ` Kevin Wolf
2018-06-29 15:36 ` Ari Sundholm
0 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2018-06-29 10:07 UTC (permalink / raw)
To: Ari Sundholm
Cc: qemu-devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, John Snow,
open list:Block I/O path
Am 19.06.2018 um 15:43 hat Ari Sundholm geschrieben:
> A block driver may need to know about the block configuration, most
> critically the sector sizes, of a block backend for alignment purposes
> or for some other reason.
It makes sense to me that you need to know alignment constraints of the
thing you are calling (i.e. your child nodes), but I don't really see
why you would need to know anything about the parent nodes. I'm doubtful
that this is a good thing to add.
I hope that the rest of the series will tell me why you are wanting this
information, but if we keep it, it needs a better explanation in the
commit message.
> It doesn't seem like qemu has an existing
> mechanism for a block backend to convey the required information to
> the relevant block driver when it is being realized.
>
> The need for this mechanism rises from the fact that a drive may not
> have a backend at the time it is created, as devices are created after
> drives during qemu startup. Therefore, a driver requiring information
> about the block configuration can get this information when a backend
> is created for it at the earliest. The most natural place for this to
> take place seems to be in the realization functions of the various
> block device drivers, such as scsi-hd. The interface proposed here
> allows the block driver to receive information about the block
> configuration and the associated backend through a new callback.
>
> Signed-off-by: Ari Sundholm <ari@tuxera.com>
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 327e478..74c470e 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -465,6 +465,15 @@ struct BlockDriver {
> void (*bdrv_abort_perm_update)(BlockDriverState *bs);
>
> /**
> + * Called to inform the driver of the block configuration of the virtual
> + * block device.
> + *
> + * This function is called by a block device realization function if the
> + * device wants to inform the block driver of its block configuration.
> + */
> + void (*bdrv_apply_blkconf)(BlockDriverState *bs, BlockConf *conf);
This interface can't really work. Block nodes (BlockDriverStates) can
have an arbitrary number of parents, which can be attached and detached
dynamically. This interface basically assumes that there is one device
attached to the node, it will always stay attached and its configuration
stays the same.
Is this function supposed to be called only once? (That assumption
wouldn't hold true.) If not, does a second call mean that a second
parent was attached, or does it update the information of the existing
parent?
How is this information useful when one parent calls the function and
provides BlockConf, but the other doesn't and the block driver doesn't
know about this?
> diff --git a/block/io.c b/block/io.c
> index b7beaee..3a06843 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2932,3 +2932,25 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, uint64_t src_offset,
> bdrv_dec_in_flight(dst_bs);
> return ret;
> }
> +
> +void bdrv_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
> +{
> + BlockDriver *drv = bs->drv;
> +
> + if (!drv) {
> + return;
> + }
> +
> + if (drv->bdrv_apply_blkconf) {
> + drv->bdrv_apply_blkconf(bs, conf);
> + return;
> + }
> +
> + if (bs->file && bs->file->bs) {
> + bdrv_apply_blkconf(bs->file->bs, conf);
> + }
> +
> + if (bs->drv->supports_backing && backing_bs(bs)) {
> + bdrv_apply_blkconf(backing_bs(bs), conf);
> + }
> +}
You probably want to propagate to all children instead of singling out
bs->file and bs->backing.
Kevin
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v5 8/8] block: Add blklogwrites
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 8/8] block: Add blklogwrites Ari Sundholm
@ 2018-06-29 12:05 ` Kevin Wolf
2018-06-29 16:02 ` Ari Sundholm
0 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2018-06-29 12:05 UTC (permalink / raw)
To: Ari Sundholm
Cc: qemu-devel, Aapo Vienamo, Max Reitz, Markus Armbruster,
Eric Blake, open list:Block layer core
Am 19.06.2018 um 15:43 hat Ari Sundholm geschrieben:
> From: Aapo Vienamo <aapo@tuxera.com>
>
> Implements a block device write logging system, similar to Linux kernel
> device mapper dm-log-writes. The write operations that are performed
> on a block device are logged to a file or another block device. The
> write log format is identical to the dm-log-writes format. Currently,
> log markers are not supported.
>
> This functionality can be used for crash consistency and fs consistency
> testing. By implementing it in qemu, tests utilizing write logs can be
> be used to test non-Linux drivers and older kernels.
>
> The driver requires all requests to be aligned to the logical sector
> size of the block backend of the guest block device being logged. This
> is important because the dm-log-writes log format has a granularity of
> one sector for both the offset and the size of each write. Also, using
> the logical sector size as a basis for logging allows for faithful
> logging of writes when testing drivers with block devices that have
> unusual sector sizes.
This is backwards, the format of the image file can't depend on how the
upper layer presents the image to the guest.
In fact, logging the writes of an image format driver is likely to
immediately corrupt the log when the guest device gets active because
the format driver may already have written data to the image (with the
default log sector size) before bdrv_apply_blkconf() was called, so you
mix data of two different sector sizes, but the super block only records
the latest one.
So I think you need to make the log sector size an option of the
blklogwrites driver and set that as bs->bl.request_alignment in
.bdrv_refresh_limits. 512 looks like a safe default which will even work
when the guest uses a larger sector size; if the user chooses a log
sector size that is larget than the guest device sector size, QEMU's
block layer may need to internally perform a read-modify-write cycle.
This is probably not very interesting for testing guest driver (where
you'll want log sector size <= guest device sector size), but it can be
useful to test QEMU itself.
> The implementation is based on the blkverify and blkdebug block
> drivers.
>
> Signed-off-by: Aapo Vienamo <aapo@tuxera.com>
> Signed-off-by: Ari Sundholm <ari@tuxera.com>
> --- /dev/null
> +++ b/block/blklogwrites.c
> @@ -0,0 +1,426 @@
> +/*
> + * Write logging blk driver based on blkverify and blkdebug.
> + *
> + * Copyright (c) 2017 Tuomas Tynkkynen <tuomas@tuxera.com>
> + * Copyright (c) 2018 Aapo Vienamo <aapo@tuxera.com>
> + * Copyright (c) 2018 Ari Sundholm <ari@tuxera.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
> +#include "block/block_int.h"
> +#include "qapi/qmp/qdict.h"
> +#include "qapi/qmp/qstring.h"
> +#include "qemu/cutils.h"
> +#include "qemu/option.h"
> +
> +/* Disk format stuff - taken from Linux drivers/md/dm-log-writes.c */
> +
> +#define LOG_FLUSH_FLAG (1 << 0)
> +#define LOG_FUA_FLAG (1 << 1)
> +#define LOG_DISCARD_FLAG (1 << 2)
> +#define LOG_MARK_FLAG (1 << 3)
I'd align the values on the same column.
> +#define WRITE_LOG_VERSION 1ULL
> +#define WRITE_LOG_MAGIC 0x6a736677736872ULL
> +
> +/* All fields are little-endian. */
> +struct log_write_super {
> + uint64_t magic;
> + uint64_t version;
> + uint64_t nr_entries;
> + uint32_t sectorsize;
> +};
> +
> +struct log_write_entry {
> + uint64_t sector;
> + uint64_t nr_sectors;
> + uint64_t flags;
> + uint64_t data_len;
> +};
It shouldn't make a difference semantically because the struct fields
are naturally aligned, but may it's worth adding QEMU_PACKED for clarity
to human readers?
> +/* End of disk format structures. */
> +
> +typedef struct {
> + BdrvChild *log_file;
> + uint32_t sectorsize;
> + uint32_t sectorbits;
> + uint64_t cur_log_sector;
> + uint64_t nr_entries;
> +} BDRVBlkLogWritesState;
> +
> +static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags,
> + Error **errp)
> +{
> + BDRVBlkLogWritesState *s = bs->opaque;
> + Error *local_err = NULL;
> + int ret;
> +
> + /* Open the raw file */
> + bs->file = bdrv_open_child(NULL, options, "raw", bs, &child_file, false,
> + &local_err);
"file" might be a better name than "raw" because it's consistent with
other drivers and we may in fact use a non-raw image format for this
child.
> + if (local_err) {
> + ret = -EINVAL;
> + error_propagate(errp, local_err);
> + goto fail;
> + }
> +
> + s->sectorsize = BDRV_SECTOR_SIZE; /* May be updated later */
> + s->sectorbits = BDRV_SECTOR_BITS;
As mentioned above, you need to have the final values before the very
first write. The safe way to achieve this is to take it from options and
do away with bdrv_apply_blkconf().
> + s->cur_log_sector = 1;
> + s->nr_entries = 0;
Would it be useful to implement a mode that appends to the log?
In that case, you'd obviously use the sector size from the existing
superblock instead of allowing the user to specify something else.
> + /* Open the log file */
> + s->log_file = bdrv_open_child(NULL, options, "log", bs, &child_file, false,
> + &local_err);
> + if (local_err) {
> + ret = -EINVAL;
> + error_propagate(errp, local_err);
> + goto fail;
> + }
> +
> + ret = 0;
> +fail:
> + if (ret < 0) {
> + bdrv_unref_child(bs, bs->file);
> + bs->file = NULL;
> + }
> + return ret;
> +}
> +
> +static void blk_log_writes_close(BlockDriverState *bs)
> +{
> + BDRVBlkLogWritesState *s = bs->opaque;
> +
> + bdrv_unref_child(bs, s->log_file);
> + s->log_file = NULL;
> +}
> +
> +static int64_t blk_log_writes_getlength(BlockDriverState *bs)
> +{
> + return bdrv_getlength(bs->file->bs);
> +}
> +
> +static void blk_log_writes_refresh_filename(BlockDriverState *bs,
> + QDict *options)
> +{
> + BDRVBlkLogWritesState *s = bs->opaque;
> +
> + /* bs->file->bs has already been refreshed */
> + bdrv_refresh_filename(s->log_file->bs);
> +
> + if (bs->file->bs->full_open_options
> + && s->log_file->bs->full_open_options)
> + {
> + QDict *opts = qdict_new();
> + qdict_put_obj(opts, "driver",
> + QOBJECT(qstring_from_str("blklogwrites")));
qdict_put_str() makes this a bit simpler.
> +
> + qobject_ref(bs->file->bs->full_open_options);
> + qdict_put_obj(opts, "raw", QOBJECT(bs->file->bs->full_open_options));
> + qobject_ref(s->log_file->bs->full_open_options);
> + qdict_put_obj(opts, "log",
> + QOBJECT(s->log_file->bs->full_open_options));
> +
> + bs->full_open_options = opts;
> + }
> +
> + if (bs->file->bs->exact_filename[0]
> + && s->log_file->bs->exact_filename[0])
> + {
> + int ret = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
> + "blklogwrites:%s:%s",
I don't think a blklogwrites:X:Y syntax is actually supported, so we
shouldn't generate it here.
> + bs->file->bs->exact_filename,
> + s->log_file->bs->exact_filename);
> +
> + if (ret >= sizeof(bs->exact_filename)) {
> + /* An overflow makes the filename unusable, so do not report any */
> + bs->exact_filename[0] = '\0';
> + }
> + }
> +}
> +
> +static void blk_log_writes_child_perm(BlockDriverState *bs, BdrvChild *c,
> + const BdrvChildRole *role,
> + BlockReopenQueue *ro_q,
> + uint64_t perm, uint64_t shrd,
> + uint64_t *nperm, uint64_t *nshrd)
> +{
> + if (!c) {
> + *nperm = perm & DEFAULT_PERM_PASSTHROUGH;
> + *nshrd = (shrd & DEFAULT_PERM_PASSTHROUGH) | DEFAULT_PERM_UNCHANGED;
> + return;
> + }
> +
> + if (!strcmp(c->name, "log")) {
> + bdrv_format_default_perms(bs, c, role, ro_q, perm, shrd, nperm, nshrd);
> + } else {
> + bdrv_filter_default_perms(bs, c, role, ro_q, perm, shrd, nperm, nshrd);
> + }
> +}
> +
> +static void blk_log_writes_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> + if (bs->bl.request_alignment < BDRV_SECTOR_SIZE) {
> + bs->bl.request_alignment = BDRV_SECTOR_SIZE;
> +
> + if (bs->bl.pdiscard_alignment &&
> + bs->bl.pdiscard_alignment < bs->bl.request_alignment)
> + bs->bl.pdiscard_alignment = bs->bl.request_alignment;
> + if (bs->bl.pwrite_zeroes_alignment &&
> + bs->bl.pwrite_zeroes_alignment < bs->bl.request_alignment)
> + bs->bl.pwrite_zeroes_alignment = bs->bl.request_alignment;
You need braces in these if statements in the QEMU coding style.
These adjustments are probably not even actually necessary;
request_alignment should already be enough to enforce the right
alignment for discard/write_zeroes as well.
> + }
> +}
> +
> +static inline uint32_t blk_log_writes_log2(uint32_t value)
> +{
> + assert(value > 0);
> + return 31 - clz32(value);
> +}
> +
> +static void blk_log_writes_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
> +{
> + BDRVBlkLogWritesState *s = bs->opaque;
> + assert(bs && conf && conf->blk && s);
> +
> + s->sectorsize = conf->logical_block_size;
> + s->sectorbits = blk_log_writes_log2(s->sectorsize);
> + bs->bl.request_alignment = s->sectorsize;
> + if (conf->discard_granularity != (uint32_t)-1) {
> + bs->bl.pdiscard_alignment = conf->discard_granularity;
> + }
> +
> + if (bs->bl.pdiscard_alignment &&
> + bs->bl.pdiscard_alignment < bs->bl.request_alignment) {
> + bs->bl.pdiscard_alignment = bs->bl.request_alignment;
> + }
> + if (bs->bl.pwrite_zeroes_alignment &&
> + bs->bl.pwrite_zeroes_alignment < bs->bl.request_alignment) {
> + bs->bl.pwrite_zeroes_alignment = bs->bl.request_alignment;
> + }
> +}
> +
> +static int coroutine_fn
> +blk_log_writes_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
> + QEMUIOVector *qiov, int flags)
> +{
> + return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
> +}
> +
> +typedef struct BlkLogWritesFileReq {
> + BlockDriverState *bs;
> + uint64_t offset;
> + uint64_t bytes;
> + int file_flags;
> + QEMUIOVector *qiov;
> + int (*func)(struct BlkLogWritesFileReq *r);
> + int file_ret;
> +} BlkLogWritesFileReq;
> +
> +typedef struct {
> + BlockDriverState *bs;
> + QEMUIOVector *qiov;
> + struct log_write_entry entry;
> + uint64_t zero_size;
> + int log_ret;
> +} BlkLogWritesLogReq;
> +
> +static void coroutine_fn blk_log_writes_co_do_log(BlkLogWritesLogReq *lr)
> +{
> + BDRVBlkLogWritesState *s = lr->bs->opaque;
> + uint64_t cur_log_offset = s->cur_log_sector << s->sectorbits;
> +
> + s->nr_entries++;
> + s->cur_log_sector +=
> + ROUND_UP(lr->qiov->size, s->sectorsize) >> s->sectorbits;
> +
> + lr->log_ret = bdrv_co_pwritev(s->log_file, cur_log_offset, lr->qiov->size,
> + lr->qiov, 0);
> +
> + /* Logging for the "write zeroes" operation */
> + if (lr->log_ret == 0 && lr->zero_size) {
> + cur_log_offset = s->cur_log_sector << s->sectorbits;
> + s->cur_log_sector +=
> + ROUND_UP(lr->zero_size, s->sectorsize) >> s->sectorbits;
> +
> + lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, cur_log_offset,
> + lr->zero_size, 0);
> + }
> +
> + /* Update super block on flush */
> + if (lr->log_ret == 0 && lr->entry.flags & LOG_FLUSH_FLAG) {
> + struct log_write_super super = {
> + .magic = cpu_to_le64(WRITE_LOG_MAGIC),
> + .version = cpu_to_le64(WRITE_LOG_VERSION),
> + .nr_entries = cpu_to_le64(s->nr_entries),
> + .sectorsize = cpu_to_le32(s->sectorsize),
> + };
> + void *zeroes = g_malloc0(s->sectorsize - sizeof(super));
> + QEMUIOVector qiov;
> +
> + qemu_iovec_init(&qiov, 2);
> + qemu_iovec_add(&qiov, &super, sizeof(super));
> + qemu_iovec_add(&qiov, zeroes, s->sectorsize - sizeof(super));
> +
> + lr->log_ret =
> + bdrv_co_pwritev(s->log_file, 0, s->sectorsize, &qiov, 0);
> + if (lr->log_ret == 0) {
> + lr->log_ret = bdrv_co_flush(s->log_file->bs);
> + }
> + qemu_iovec_destroy(&qiov);
> + g_free(zeroes);
> + }
> +}
> +
> +static void coroutine_fn blk_log_writes_co_do_file(BlkLogWritesFileReq *fr)
> +{
> + fr->file_ret = fr->func(fr);
> +}
> +
> +static int coroutine_fn
> +blk_log_writes_co_log(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
> + QEMUIOVector *qiov, int flags,
> + int (*file_func)(BlkLogWritesFileReq *r),
> + uint64_t entry_flags, bool is_zero_write)
> +{
> + QEMUIOVector log_qiov;
> + size_t niov = qiov ? qiov->niov : 0;
> + size_t i;
> + BDRVBlkLogWritesState *s = bs->opaque;
> + BlkLogWritesFileReq fr = {
> + .bs = bs,
> + .offset = offset,
> + .bytes = bytes,
> + .file_flags = flags,
> + .qiov = qiov,
> + .func = file_func,
> + };
If you align values to the same column (which I appreciate), you should
do it for all of them. :-)
> + BlkLogWritesLogReq lr = {
> + .bs = bs,
> + .qiov = &log_qiov,
> + .entry = {
> + .sector = cpu_to_le64(offset >> s->sectorbits),
> + .nr_sectors = cpu_to_le64(bytes >> s->sectorbits),
> + .flags = cpu_to_le64(entry_flags),
> + .data_len = 0,
> + },
> + .zero_size = is_zero_write ? bytes : 0,
> + };
> + void *zeroes = g_malloc0(s->sectorsize - sizeof(lr.entry));
> +
> + assert((1 << s->sectorbits) == s->sectorsize);
> + assert(bs->bl.request_alignment == s->sectorsize);
> + assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
> + assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
> +
> + qemu_iovec_init(&log_qiov, niov + 2);
> + qemu_iovec_add(&log_qiov, &lr.entry, sizeof(lr.entry));
> + qemu_iovec_add(&log_qiov, zeroes, s->sectorsize - sizeof(lr.entry));
> + for (i = 0; i < niov; ++i) {
> + qemu_iovec_add(&log_qiov, qiov->iov[i].iov_base, qiov->iov[i].iov_len);
> + }
Maybe qemu_iovec_concat() would be a bit simpler?
> +
> + blk_log_writes_co_do_file(&fr);
> + blk_log_writes_co_do_log(&lr);
> +
> + qemu_iovec_destroy(&log_qiov);
> + g_free(zeroes);
> +
> + if (lr.log_ret < 0) {
> + return lr.log_ret;
> + }
> +
> + return fr.file_ret;
> +}
> +
> +static int coroutine_fn
> +blk_log_writes_co_do_file_pwritev(BlkLogWritesFileReq *fr)
> +{
> + return bdrv_co_pwritev(fr->bs->file, fr->offset, fr->bytes,
> + fr->qiov, fr->file_flags);
> +}
> +
> +static int coroutine_fn
> +blk_log_writes_co_do_file_pwrite_zeroes(BlkLogWritesFileReq *fr)
> +{
> + return bdrv_co_pwrite_zeroes(fr->bs->file, fr->offset, fr->bytes,
> + fr->file_flags);
> +}
> +
> +static int coroutine_fn blk_log_writes_co_do_file_flush(BlkLogWritesFileReq *fr)
> +{
> + return bdrv_co_flush(fr->bs->file->bs);
> +}
> +
> +static int coroutine_fn
> +blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr)
> +{
> + return bdrv_co_pdiscard(fr->bs->file->bs, fr->offset, fr->bytes);
> +}
> +
> +static int coroutine_fn
> +blk_log_writes_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
> + QEMUIOVector *qiov, int flags)
> +{
> + return blk_log_writes_co_log(bs, offset, bytes, qiov, flags,
> + blk_log_writes_co_do_file_pwritev, 0, false);
> +}
> +
> +static int coroutine_fn
> +blk_log_writes_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
> + BdrvRequestFlags flags)
> +{
> + return blk_log_writes_co_log(bs, offset, bytes, NULL, flags,
> + blk_log_writes_co_do_file_pwrite_zeroes, 0,
> + true);
> +}
> +
> +static int coroutine_fn blk_log_writes_co_flush_to_disk(BlockDriverState *bs)
> +{
> + return blk_log_writes_co_log(bs, 0, 0, NULL, 0,
> + blk_log_writes_co_do_file_flush,
> + LOG_FLUSH_FLAG, false);
> +}
> +
> +static int coroutine_fn
> +blk_log_writes_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
> +{
> + return blk_log_writes_co_log(bs, offset, count, NULL, 0,
> + blk_log_writes_co_do_file_pdiscard,
> + LOG_DISCARD_FLAG, false);
> +}
> +
> +static BlockDriver bdrv_blk_log_writes = {
> + .format_name = "blklogwrites",
> + .protocol_name = "blklogwrites",
This is for the blklogwrites:X:Y syntax, which is not supported, so it
should be removed.
> + .instance_size = sizeof(BDRVBlkLogWritesState),
> +
> + .bdrv_file_open = blk_log_writes_open,
This should be .bdrv_open.
> + .bdrv_close = blk_log_writes_close,
> + .bdrv_getlength = blk_log_writes_getlength,
> + .bdrv_refresh_filename = blk_log_writes_refresh_filename,
> + .bdrv_child_perm = blk_log_writes_child_perm,
> + .bdrv_refresh_limits = blk_log_writes_refresh_limits,
> + .bdrv_apply_blkconf = blk_log_writes_apply_blkconf,
> +
> + .bdrv_co_preadv = blk_log_writes_co_preadv,
> + .bdrv_co_pwritev = blk_log_writes_co_pwritev,
> + .bdrv_co_pwrite_zeroes = blk_log_writes_co_pwrite_zeroes,
> + .bdrv_co_flush_to_disk = blk_log_writes_co_flush_to_disk,
> + .bdrv_co_pdiscard = blk_log_writes_co_pdiscard,
> + .bdrv_co_block_status = bdrv_co_block_status_from_file,
> +
> + .is_filter = true,
> +};
Kevin
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/8] block: Add a mechanism for passing a block driver a block configuration
2018-06-29 10:07 ` Kevin Wolf
@ 2018-06-29 15:36 ` Ari Sundholm
0 siblings, 0 replies; 15+ messages in thread
From: Ari Sundholm @ 2018-06-29 15:36 UTC (permalink / raw)
To: Kevin Wolf
Cc: qemu-devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, John Snow,
open list:Block I/O path
On 06/29/2018 01:07 PM, Kevin Wolf wrote:
> Am 19.06.2018 um 15:43 hat Ari Sundholm geschrieben:
>> A block driver may need to know about the block configuration, most
>> critically the sector sizes, of a block backend for alignment purposes
>> or for some other reason.
>
> It makes sense to me that you need to know alignment constraints of the
> thing you are calling (i.e. your child nodes), but I don't really see
> why you would need to know anything about the parent nodes. I'm doubtful
> that this is a good thing to add.
>
I guess I should explain the exact use case here to show where we are
coming from.
The use case is the new blklogwrites driver (introduced in this patch
set), which we use to test file system implementations. It needs to log
the writes to whatever guest block device is used (for instance, a
-device scsi-hd). These block devices may have different kinds of
logical sector sizes, as determined by their configuration (such as
physical_block_size=4096,logical_block_size=512). In order to properly
log the writes that hit such a block device, we need to do this at a
granularity of its logical sector size for the log to be most useful.
For this, we need to get this information from the direction of the parent.
This patch set proposes a mechanism for exposing that information to
blklogwrites (and all other block drivers that may use this
information), along with other information in the block configuration,
to allow it to automatically adjust to it.
> I hope that the rest of the series will tell me why you are wanting this
> information, but if we keep it, it needs a better explanation in the
> commit message.
>
>> It doesn't seem like qemu has an existing
>> mechanism for a block backend to convey the required information to
>> the relevant block driver when it is being realized.
>>
>> The need for this mechanism rises from the fact that a drive may not
>> have a backend at the time it is created, as devices are created after
>> drives during qemu startup. Therefore, a driver requiring information
>> about the block configuration can get this information when a backend
>> is created for it at the earliest. The most natural place for this to
>> take place seems to be in the realization functions of the various
>> block device drivers, such as scsi-hd. The interface proposed here
>> allows the block driver to receive information about the block
>> configuration and the associated backend through a new callback.
>>
>> Signed-off-by: Ari Sundholm <ari@tuxera.com>
>
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 327e478..74c470e 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -465,6 +465,15 @@ struct BlockDriver {
>> void (*bdrv_abort_perm_update)(BlockDriverState *bs);
>>
>> /**
>> + * Called to inform the driver of the block configuration of the virtual
>> + * block device.
>> + *
>> + * This function is called by a block device realization function if the
>> + * device wants to inform the block driver of its block configuration.
>> + */
>> + void (*bdrv_apply_blkconf)(BlockDriverState *bs, BlockConf *conf);
>
> This interface can't really work. Block nodes (BlockDriverStates) can
> have an arbitrary number of parents, which can be attached and detached
> dynamically. This interface basically assumes that there is one device
> attached to the node, it will always stay attached and its configuration
> stays the same.
>
> Is this function supposed to be called only once? (That assumption
> wouldn't hold true.) If not, does a second call mean that a second
> parent was attached, or does it update the information of the existing
> parent?
>
> How is this information useful when one parent calls the function and
> provides BlockConf, but the other doesn't and the block driver doesn't
> know about this?
>
Yes, the interface needs work when you consider the general case. It
should be considered a first approximation towards a proper one. As it
is, it doesn't handle the removal of a block backend properly. Also, it
can be cumbersome to manage the block configurations provided by the
various parents and use them as constraints for the operation of the
block driver. It is very general as it is, though, as block drivers can
freely choose what to do with the block configuration.
I see you proposed dropping this interface entirely in your review of
patch 8. I think having an interface like this can be valuable, but it
seems that it requires more work, so I think I will move the definition
of a proper interface to a separate patch set and opt for manual
configuration of the log settings in blklogwrites, as you suggested.
>> diff --git a/block/io.c b/block/io.c
>> index b7beaee..3a06843 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -2932,3 +2932,25 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, uint64_t src_offset,
>> bdrv_dec_in_flight(dst_bs);
>> return ret;
>> }
>> +
>> +void bdrv_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
>> +{
>> + BlockDriver *drv = bs->drv;
>> +
>> + if (!drv) {
>> + return;
>> + }
>> +
>> + if (drv->bdrv_apply_blkconf) {
>> + drv->bdrv_apply_blkconf(bs, conf);
>> + return;
>> + }
>> +
>> + if (bs->file && bs->file->bs) {
>> + bdrv_apply_blkconf(bs->file->bs, conf);
>> + }
>> +
>> + if (bs->drv->supports_backing && backing_bs(bs)) {
>> + bdrv_apply_blkconf(backing_bs(bs), conf);
>> + }
>> +}
>
> You probably want to propagate to all children instead of singling out
> bs->file and bs->backing.
>
Oops, this was on my TODO list, but I apparently never did that change.
Thank you,
Ari Sundholm
ari@tuxera.com
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v5 8/8] block: Add blklogwrites
2018-06-29 12:05 ` Kevin Wolf
@ 2018-06-29 16:02 ` Ari Sundholm
2018-06-29 16:31 ` Kevin Wolf
0 siblings, 1 reply; 15+ messages in thread
From: Ari Sundholm @ 2018-06-29 16:02 UTC (permalink / raw)
To: Kevin Wolf
Cc: qemu-devel, Aapo Vienamo, Max Reitz, Markus Armbruster,
Eric Blake, open list:Block layer core
On 06/29/2018 03:05 PM, Kevin Wolf wrote:
> Am 19.06.2018 um 15:43 hat Ari Sundholm geschrieben:
>> From: Aapo Vienamo <aapo@tuxera.com>
>>
>> Implements a block device write logging system, similar to Linux kernel
>> device mapper dm-log-writes. The write operations that are performed
>> on a block device are logged to a file or another block device. The
>> write log format is identical to the dm-log-writes format. Currently,
>> log markers are not supported.
>>
>> This functionality can be used for crash consistency and fs consistency
>> testing. By implementing it in qemu, tests utilizing write logs can be
>> be used to test non-Linux drivers and older kernels.
>>
>> The driver requires all requests to be aligned to the logical sector
>> size of the block backend of the guest block device being logged. This
>> is important because the dm-log-writes log format has a granularity of
>> one sector for both the offset and the size of each write. Also, using
>> the logical sector size as a basis for logging allows for faithful
>> logging of writes when testing drivers with block devices that have
>> unusual sector sizes.
>
> This is backwards, the format of the image file can't depend on how the
> upper layer presents the image to the guest.
>
> In fact, logging the writes of an image format driver is likely to
> immediately corrupt the log when the guest device gets active because
> the format driver may already have written data to the image (with the
> default log sector size) before bdrv_apply_blkconf() was called, so you
> mix data of two different sector sizes, but the super block only records
> the latest one.
>
I see. I take it you're referring to a setup something like this:
-blockdev node-name=img_dev,driver=file,filename=foo.img \
-blockdev node-name=log_dev,driver=file,filename=foo.log \
-blockdev \
node-name=logged_hd,driver=blklogwrites,raw=img_dev,log=log_dev \
-blockdev node_name=hd,driver=qcow2,file=logged_hd \
-device virtio-scsi-pci \
-device \
scsi-hd,id=scsihd1,drive=hd,physical_block_size=4096,logical_block_size=512
In that case, things wouldn't work all that well, true. Automatic
detection of the sector size would be very counterproductive here.
> So I think you need to make the log sector size an option of the
> blklogwrites driver and set that as bs->bl.request_alignment in
> .bdrv_refresh_limits. 512 looks like a safe default which will even work
> when the guest uses a larger sector size; if the user chooses a log
> sector size that is larget than the guest device sector size, QEMU's
> block layer may need to internally perform a read-modify-write cycle.
> This is probably not very interesting for testing guest driver (where
> you'll want log sector size <= guest device sector size), but it can be
> useful to test QEMU itself.
>
OK, I will just make this an option of the blklogwrites driver for now
and we can think about automatic detection of the sector size as a
separate issue in a subsequent patch set.
>> The implementation is based on the blkverify and blkdebug block
>> drivers.
>>
>> Signed-off-by: Aapo Vienamo <aapo@tuxera.com>
>> Signed-off-by: Ari Sundholm <ari@tuxera.com>
>
>> --- /dev/null
>> +++ b/block/blklogwrites.c
>> @@ -0,0 +1,426 @@
>> +/*
>> + * Write logging blk driver based on blkverify and blkdebug.
>> + *
>> + * Copyright (c) 2017 Tuomas Tynkkynen <tuomas@tuxera.com>
>> + * Copyright (c) 2018 Aapo Vienamo <aapo@tuxera.com>
>> + * Copyright (c) 2018 Ari Sundholm <ari@tuxera.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> +#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
>> +#include "block/block_int.h"
>> +#include "qapi/qmp/qdict.h"
>> +#include "qapi/qmp/qstring.h"
>> +#include "qemu/cutils.h"
>> +#include "qemu/option.h"
>> +
>> +/* Disk format stuff - taken from Linux drivers/md/dm-log-writes.c */
>> +
>> +#define LOG_FLUSH_FLAG (1 << 0)
>> +#define LOG_FUA_FLAG (1 << 1)
>> +#define LOG_DISCARD_FLAG (1 << 2)
>> +#define LOG_MARK_FLAG (1 << 3)
>
> I'd align the values on the same column.
>
Will do, thanks.
>> +#define WRITE_LOG_VERSION 1ULL
>> +#define WRITE_LOG_MAGIC 0x6a736677736872ULL
>> +
>> +/* All fields are little-endian. */
>> +struct log_write_super {
>> + uint64_t magic;
>> + uint64_t version;
>> + uint64_t nr_entries;
>> + uint32_t sectorsize;
>> +};
>> +
>> +struct log_write_entry {
>> + uint64_t sector;
>> + uint64_t nr_sectors;
>> + uint64_t flags;
>> + uint64_t data_len;
>> +};
>
> It shouldn't make a difference semantically because the struct fields
> are naturally aligned, but may it's worth adding QEMU_PACKED for clarity
> to human readers?
>
Will do, thanks.
>> +/* End of disk format structures. */
>> +
>> +typedef struct {
>> + BdrvChild *log_file;
>> + uint32_t sectorsize;
>> + uint32_t sectorbits;
>> + uint64_t cur_log_sector;
>> + uint64_t nr_entries;
>> +} BDRVBlkLogWritesState;
>> +
>> +static int blk_log_writes_open(BlockDriverState *bs, QDict *options, int flags,
>> + Error **errp)
>> +{
>> + BDRVBlkLogWritesState *s = bs->opaque;
>> + Error *local_err = NULL;
>> + int ret;
>> +
>> + /* Open the raw file */
>> + bs->file = bdrv_open_child(NULL, options, "raw", bs, &child_file, false,
>> + &local_err);
>
> "file" might be a better name than "raw" because it's consistent with
> other drivers and we may in fact use a non-raw image format for this
> child.
>
Will do, thanks. This requires me to change the qapi bits, too (just a
reminder to myself).
>> + if (local_err) {
>> + ret = -EINVAL;
>> + error_propagate(errp, local_err);
>> + goto fail;
>> + }
>> +
>> + s->sectorsize = BDRV_SECTOR_SIZE; /* May be updated later */
>> + s->sectorbits = BDRV_SECTOR_BITS;
>
> As mentioned above, you need to have the final values before the very
> first write. The safe way to achieve this is to take it from options and
> do away with bdrv_apply_blkconf().
>
Will do, thanks.
>> + s->cur_log_sector = 1;
>> + s->nr_entries = 0;
>
> Would it be useful to implement a mode that appends to the log?
>
> In that case, you'd obviously use the sector size from the existing
> superblock instead of allowing the user to specify something else.
>
Such a mode may indeed be useful. Thank you for the idea. Would it be OK
to introduce this feature as a separate patch a bit later?
>> + /* Open the log file */
>> + s->log_file = bdrv_open_child(NULL, options, "log", bs, &child_file, false,
>> + &local_err);
>> + if (local_err) {
>> + ret = -EINVAL;
>> + error_propagate(errp, local_err);
>> + goto fail;
>> + }
>> +
>> + ret = 0;
>> +fail:
>> + if (ret < 0) {
>> + bdrv_unref_child(bs, bs->file);
>> + bs->file = NULL;
>> + }
>> + return ret;
>> +}
>> +
>> +static void blk_log_writes_close(BlockDriverState *bs)
>> +{
>> + BDRVBlkLogWritesState *s = bs->opaque;
>> +
>> + bdrv_unref_child(bs, s->log_file);
>> + s->log_file = NULL;
>> +}
>> +
>> +static int64_t blk_log_writes_getlength(BlockDriverState *bs)
>> +{
>> + return bdrv_getlength(bs->file->bs);
>> +}
>> +
>> +static void blk_log_writes_refresh_filename(BlockDriverState *bs,
>> + QDict *options)
>> +{
>> + BDRVBlkLogWritesState *s = bs->opaque;
>> +
>> + /* bs->file->bs has already been refreshed */
>> + bdrv_refresh_filename(s->log_file->bs);
>> +
>> + if (bs->file->bs->full_open_options
>> + && s->log_file->bs->full_open_options)
>> + {
>> + QDict *opts = qdict_new();
>> + qdict_put_obj(opts, "driver",
>> + QOBJECT(qstring_from_str("blklogwrites")));
>
> qdict_put_str() makes this a bit simpler.
>
Ah, indeed. Thanks.
>> +
>> + qobject_ref(bs->file->bs->full_open_options);
>> + qdict_put_obj(opts, "raw", QOBJECT(bs->file->bs->full_open_options));
>> + qobject_ref(s->log_file->bs->full_open_options);
>> + qdict_put_obj(opts, "log",
>> + QOBJECT(s->log_file->bs->full_open_options));
>> +
>> + bs->full_open_options = opts;
>> + }
>> +
>> + if (bs->file->bs->exact_filename[0]
>> + && s->log_file->bs->exact_filename[0])
>> + {
>> + int ret = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
>> + "blklogwrites:%s:%s",
>
> I don't think a blklogwrites:X:Y syntax is actually supported, so we
> shouldn't generate it here.
>
Will remove, thanks.
>> + bs->file->bs->exact_filename,
>> + s->log_file->bs->exact_filename);
>> +
>> + if (ret >= sizeof(bs->exact_filename)) {
>> + /* An overflow makes the filename unusable, so do not report any */
>> + bs->exact_filename[0] = '\0';
>> + }
>> + }
>> +}
>> +
>> +static void blk_log_writes_child_perm(BlockDriverState *bs, BdrvChild *c,
>> + const BdrvChildRole *role,
>> + BlockReopenQueue *ro_q,
>> + uint64_t perm, uint64_t shrd,
>> + uint64_t *nperm, uint64_t *nshrd)
>> +{
>> + if (!c) {
>> + *nperm = perm & DEFAULT_PERM_PASSTHROUGH;
>> + *nshrd = (shrd & DEFAULT_PERM_PASSTHROUGH) | DEFAULT_PERM_UNCHANGED;
>> + return;
>> + }
>> +
>> + if (!strcmp(c->name, "log")) {
>> + bdrv_format_default_perms(bs, c, role, ro_q, perm, shrd, nperm, nshrd);
>> + } else {
>> + bdrv_filter_default_perms(bs, c, role, ro_q, perm, shrd, nperm, nshrd);
>> + }
>> +}
>> +
>> +static void blk_log_writes_refresh_limits(BlockDriverState *bs, Error **errp)
>> +{
>> + if (bs->bl.request_alignment < BDRV_SECTOR_SIZE) {
>> + bs->bl.request_alignment = BDRV_SECTOR_SIZE;
>> +
>> + if (bs->bl.pdiscard_alignment &&
>> + bs->bl.pdiscard_alignment < bs->bl.request_alignment)
>> + bs->bl.pdiscard_alignment = bs->bl.request_alignment;
>> + if (bs->bl.pwrite_zeroes_alignment &&
>> + bs->bl.pwrite_zeroes_alignment < bs->bl.request_alignment)
>> + bs->bl.pwrite_zeroes_alignment = bs->bl.request_alignment;
>
> You need braces in these if statements in the QEMU coding style.
>
Oops, had somehow missed this. Checkpatch didn't complain, either.
> These adjustments are probably not even actually necessary;
> request_alignment should already be enough to enforce the right
> alignment for discard/write_zeroes as well.
>
Hmm, I guess you're right. Thanks.
>> + }
>> +}
>> +
>> +static inline uint32_t blk_log_writes_log2(uint32_t value)
>> +{
>> + assert(value > 0);
>> + return 31 - clz32(value);
>> +}
>> +
>> +static void blk_log_writes_apply_blkconf(BlockDriverState *bs, BlockConf *conf)
>> +{
>> + BDRVBlkLogWritesState *s = bs->opaque;
>> + assert(bs && conf && conf->blk && s);
>> +
>> + s->sectorsize = conf->logical_block_size;
>> + s->sectorbits = blk_log_writes_log2(s->sectorsize);
>> + bs->bl.request_alignment = s->sectorsize;
>> + if (conf->discard_granularity != (uint32_t)-1) {
>> + bs->bl.pdiscard_alignment = conf->discard_granularity;
>> + }
>> +
>> + if (bs->bl.pdiscard_alignment &&
>> + bs->bl.pdiscard_alignment < bs->bl.request_alignment) {
>> + bs->bl.pdiscard_alignment = bs->bl.request_alignment;
>> + }
>> + if (bs->bl.pwrite_zeroes_alignment &&
>> + bs->bl.pwrite_zeroes_alignment < bs->bl.request_alignment) {
>> + bs->bl.pwrite_zeroes_alignment = bs->bl.request_alignment;
>> + }
>> +}
>> +
>> +static int coroutine_fn
>> +blk_log_writes_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>> + QEMUIOVector *qiov, int flags)
>> +{
>> + return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
>> +}
>> +
>> +typedef struct BlkLogWritesFileReq {
>> + BlockDriverState *bs;
>> + uint64_t offset;
>> + uint64_t bytes;
>> + int file_flags;
>> + QEMUIOVector *qiov;
>> + int (*func)(struct BlkLogWritesFileReq *r);
>> + int file_ret;
>> +} BlkLogWritesFileReq;
>> +
>> +typedef struct {
>> + BlockDriverState *bs;
>> + QEMUIOVector *qiov;
>> + struct log_write_entry entry;
>> + uint64_t zero_size;
>> + int log_ret;
>> +} BlkLogWritesLogReq;
>> +
>> +static void coroutine_fn blk_log_writes_co_do_log(BlkLogWritesLogReq *lr)
>> +{
>> + BDRVBlkLogWritesState *s = lr->bs->opaque;
>> + uint64_t cur_log_offset = s->cur_log_sector << s->sectorbits;
>> +
>> + s->nr_entries++;
>> + s->cur_log_sector +=
>> + ROUND_UP(lr->qiov->size, s->sectorsize) >> s->sectorbits;
>> +
>> + lr->log_ret = bdrv_co_pwritev(s->log_file, cur_log_offset, lr->qiov->size,
>> + lr->qiov, 0);
>> +
>> + /* Logging for the "write zeroes" operation */
>> + if (lr->log_ret == 0 && lr->zero_size) {
>> + cur_log_offset = s->cur_log_sector << s->sectorbits;
>> + s->cur_log_sector +=
>> + ROUND_UP(lr->zero_size, s->sectorsize) >> s->sectorbits;
>> +
>> + lr->log_ret = bdrv_co_pwrite_zeroes(s->log_file, cur_log_offset,
>> + lr->zero_size, 0);
>> + }
>> +
>> + /* Update super block on flush */
>> + if (lr->log_ret == 0 && lr->entry.flags & LOG_FLUSH_FLAG) {
>> + struct log_write_super super = {
>> + .magic = cpu_to_le64(WRITE_LOG_MAGIC),
>> + .version = cpu_to_le64(WRITE_LOG_VERSION),
>> + .nr_entries = cpu_to_le64(s->nr_entries),
>> + .sectorsize = cpu_to_le32(s->sectorsize),
>> + };
>> + void *zeroes = g_malloc0(s->sectorsize - sizeof(super));
>> + QEMUIOVector qiov;
>> +
>> + qemu_iovec_init(&qiov, 2);
>> + qemu_iovec_add(&qiov, &super, sizeof(super));
>> + qemu_iovec_add(&qiov, zeroes, s->sectorsize - sizeof(super));
>> +
>> + lr->log_ret =
>> + bdrv_co_pwritev(s->log_file, 0, s->sectorsize, &qiov, 0);
>> + if (lr->log_ret == 0) {
>> + lr->log_ret = bdrv_co_flush(s->log_file->bs);
>> + }
>> + qemu_iovec_destroy(&qiov);
>> + g_free(zeroes);
>> + }
>> +}
>> +
>> +static void coroutine_fn blk_log_writes_co_do_file(BlkLogWritesFileReq *fr)
>> +{
>> + fr->file_ret = fr->func(fr);
>> +}
>> +
>> +static int coroutine_fn
>> +blk_log_writes_co_log(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>> + QEMUIOVector *qiov, int flags,
>> + int (*file_func)(BlkLogWritesFileReq *r),
>> + uint64_t entry_flags, bool is_zero_write)
>> +{
>> + QEMUIOVector log_qiov;
>> + size_t niov = qiov ? qiov->niov : 0;
>> + size_t i;
>> + BDRVBlkLogWritesState *s = bs->opaque;
>> + BlkLogWritesFileReq fr = {
>> + .bs = bs,
>> + .offset = offset,
>> + .bytes = bytes,
>> + .file_flags = flags,
>> + .qiov = qiov,
>> + .func = file_func,
>> + };
>
> If you align values to the same column (which I appreciate), you should
> do it for all of them. :-)
>
Sure. Thanks.
>> + BlkLogWritesLogReq lr = {
>> + .bs = bs,
>> + .qiov = &log_qiov,
>> + .entry = {
>> + .sector = cpu_to_le64(offset >> s->sectorbits),
>> + .nr_sectors = cpu_to_le64(bytes >> s->sectorbits),
>> + .flags = cpu_to_le64(entry_flags),
>> + .data_len = 0,
>> + },
>> + .zero_size = is_zero_write ? bytes : 0,
>> + };
>> + void *zeroes = g_malloc0(s->sectorsize - sizeof(lr.entry));
>> +
>> + assert((1 << s->sectorbits) == s->sectorsize);
>> + assert(bs->bl.request_alignment == s->sectorsize);
>> + assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
>> + assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
>> +
>> + qemu_iovec_init(&log_qiov, niov + 2);
>> + qemu_iovec_add(&log_qiov, &lr.entry, sizeof(lr.entry));
>> + qemu_iovec_add(&log_qiov, zeroes, s->sectorsize - sizeof(lr.entry));
>> + for (i = 0; i < niov; ++i) {
>> + qemu_iovec_add(&log_qiov, qiov->iov[i].iov_base, qiov->iov[i].iov_len);
>> + }
>
> Maybe qemu_iovec_concat() would be a bit simpler?
>
Indeed. Thanks.
>> +
>> + blk_log_writes_co_do_file(&fr);
>> + blk_log_writes_co_do_log(&lr);
>> +
>> + qemu_iovec_destroy(&log_qiov);
>> + g_free(zeroes);
>> +
>> + if (lr.log_ret < 0) {
>> + return lr.log_ret;
>> + }
>> +
>> + return fr.file_ret;
>> +}
>> +
>> +static int coroutine_fn
>> +blk_log_writes_co_do_file_pwritev(BlkLogWritesFileReq *fr)
>> +{
>> + return bdrv_co_pwritev(fr->bs->file, fr->offset, fr->bytes,
>> + fr->qiov, fr->file_flags);
>> +}
>> +
>> +static int coroutine_fn
>> +blk_log_writes_co_do_file_pwrite_zeroes(BlkLogWritesFileReq *fr)
>> +{
>> + return bdrv_co_pwrite_zeroes(fr->bs->file, fr->offset, fr->bytes,
>> + fr->file_flags);
>> +}
>> +
>> +static int coroutine_fn blk_log_writes_co_do_file_flush(BlkLogWritesFileReq *fr)
>> +{
>> + return bdrv_co_flush(fr->bs->file->bs);
>> +}
>> +
>> +static int coroutine_fn
>> +blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr)
>> +{
>> + return bdrv_co_pdiscard(fr->bs->file->bs, fr->offset, fr->bytes);
>> +}
>> +
>> +static int coroutine_fn
>> +blk_log_writes_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
>> + QEMUIOVector *qiov, int flags)
>> +{
>> + return blk_log_writes_co_log(bs, offset, bytes, qiov, flags,
>> + blk_log_writes_co_do_file_pwritev, 0, false);
>> +}
>> +
>> +static int coroutine_fn
>> +blk_log_writes_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
>> + BdrvRequestFlags flags)
>> +{
>> + return blk_log_writes_co_log(bs, offset, bytes, NULL, flags,
>> + blk_log_writes_co_do_file_pwrite_zeroes, 0,
>> + true);
>> +}
>> +
>> +static int coroutine_fn blk_log_writes_co_flush_to_disk(BlockDriverState *bs)
>> +{
>> + return blk_log_writes_co_log(bs, 0, 0, NULL, 0,
>> + blk_log_writes_co_do_file_flush,
>> + LOG_FLUSH_FLAG, false);
>> +}
>> +
>> +static int coroutine_fn
>> +blk_log_writes_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
>> +{
>> + return blk_log_writes_co_log(bs, offset, count, NULL, 0,
>> + blk_log_writes_co_do_file_pdiscard,
>> + LOG_DISCARD_FLAG, false);
>> +}
>> +
>> +static BlockDriver bdrv_blk_log_writes = {
>> + .format_name = "blklogwrites",
>> + .protocol_name = "blklogwrites",
>
> This is for the blklogwrites:X:Y syntax, which is not supported, so it
> should be removed.
>
Just protocol_name, I assume? Will remove, thanks.
>> + .instance_size = sizeof(BDRVBlkLogWritesState),
>> +
>> + .bdrv_file_open = blk_log_writes_open,
>
> This should be .bdrv_open.
>
Ah. This is a remnant from the time we still had the old syntax. Will
change, thanks.
>> + .bdrv_close = blk_log_writes_close,
>> + .bdrv_getlength = blk_log_writes_getlength,
>> + .bdrv_refresh_filename = blk_log_writes_refresh_filename,
>> + .bdrv_child_perm = blk_log_writes_child_perm,
>> + .bdrv_refresh_limits = blk_log_writes_refresh_limits,
>> + .bdrv_apply_blkconf = blk_log_writes_apply_blkconf,
>> +
>> + .bdrv_co_preadv = blk_log_writes_co_preadv,
>> + .bdrv_co_pwritev = blk_log_writes_co_pwritev,
>> + .bdrv_co_pwrite_zeroes = blk_log_writes_co_pwrite_zeroes,
>> + .bdrv_co_flush_to_disk = blk_log_writes_co_flush_to_disk,
>> + .bdrv_co_pdiscard = blk_log_writes_co_pdiscard,
>> + .bdrv_co_block_status = bdrv_co_block_status_from_file,
>> +
>> + .is_filter = true,
>> +};
>
> Kevin
>
Thank you for the review,
Ari Sundholm
ari@tuxera.com
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Qemu-devel] [PATCH v5 8/8] block: Add blklogwrites
2018-06-29 16:02 ` Ari Sundholm
@ 2018-06-29 16:31 ` Kevin Wolf
0 siblings, 0 replies; 15+ messages in thread
From: Kevin Wolf @ 2018-06-29 16:31 UTC (permalink / raw)
To: Ari Sundholm
Cc: qemu-devel, Aapo Vienamo, Max Reitz, Markus Armbruster,
Eric Blake, open list:Block layer core
Am 29.06.2018 um 18:02 hat Ari Sundholm geschrieben:
> On 06/29/2018 03:05 PM, Kevin Wolf wrote:
> > Am 19.06.2018 um 15:43 hat Ari Sundholm geschrieben:
> > > + s->cur_log_sector = 1;
> > > + s->nr_entries = 0;
> >
> > Would it be useful to implement a mode that appends to the log?
> >
> > In that case, you'd obviously use the sector size from the existing
> > superblock instead of allowing the user to specify something else.
> >
>
> Such a mode may indeed be useful. Thank you for the idea. Would it be OK to
> introduce this feature as a separate patch a bit later?
Yes, of course.
> > > +static BlockDriver bdrv_blk_log_writes = {
> > > + .format_name = "blklogwrites",
> > > + .protocol_name = "blklogwrites",
> >
> > This is for the blklogwrites:X:Y syntax, which is not supported, so it
> > should be removed.
> >
>
> Just protocol_name, I assume? Will remove, thanks.
Right, just protocol_name. format_name is always required.
Kevin
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2018-06-29 16:31 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-06-19 13:43 [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 1/8] block: Move two block permission constants to the relevant enum Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 2/8] block: Add a mechanism for passing a block driver a block configuration Ari Sundholm
2018-06-29 10:07 ` Kevin Wolf
2018-06-29 15:36 ` Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 3/8] hw/scsi/scsi-disk: Always apply block configuration to block driver Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 4/8] hw/ide/qdev: " Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 5/8] hw/block/virtio-blk: " Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 6/8] hw/block/nvme: " Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 7/8] hw/block/fdc: " Ari Sundholm
2018-06-19 13:43 ` [Qemu-devel] [PATCH v5 8/8] block: Add blklogwrites Ari Sundholm
2018-06-29 12:05 ` Kevin Wolf
2018-06-29 16:02 ` Ari Sundholm
2018-06-29 16:31 ` Kevin Wolf
2018-06-26 18:06 ` [Qemu-devel] [PATCH v5 0/8] New block driver: blklogwrites Ari Sundholm
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).