* [Qemu-devel] [PATCH] add watermark reporting for block devices
@ 2014-07-08 14:49 Francesco Romani
2014-07-08 14:49 ` [Qemu-devel] [PATCH] block: add watermark event Francesco Romani
2014-07-08 14:51 ` [Qemu-devel] [RFC] add watermark reporting for block devices Francesco Romani
0 siblings, 2 replies; 9+ messages in thread
From: Francesco Romani @ 2014-07-08 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Francesco Romani, mdroth, stefanha, lcapitulino
Hello everyone
I'm one of the oVirt developers (http://www.ovirt.org);
oVirt is a virtualization management application built
around qemu/kvm, so it is nice to get in touch :)
We have begun a big scalability improvement effort, aiming to
support without problems hundreds of VMs per host, with plans
to support thousands in a not so distant future.
In doing so, we are reviewing our usage flows.
One of them is thin-provisioned storage, which is used
quite extensively, with block devices (ISCSI for example)
and COW images.
When using thin provisioning, oVirt tries hard to hide this
fact from the guest OS, and to do so watches closely
the usage of the device, and resize it when its usage exceeds
a configured threshold (the "high water mark"), in order
to avoid the guest OS to get paused for space exhausted.
To do the watching, we poll he devices using libvirt
(virDomainGetBlockInfo), which in turn uses query-blockstats.
This is suboptimal with just one VM, but with hundereds of them,
let alone thousands, it doesn't scale and it is quite a resource
hog.
Would be great to have this watermark concept supported into qemu,
with a new event to be raised when the limit is crossed.
To track this RFE I filed https://bugs.launchpad.net/qemu/+bug/1338957
Moreover, I had the chance to take a look at the QEMU sources
and come up with this tentative patch which I'd also like
to submit.
Comments and thoughts very welcome!
Thanks and best regards,
Francesco Romani (1):
block: add watermark event
block.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++
blockdev.c | 21 ++++++++++++++++++
include/block/block.h | 2 ++
include/block/block_int.h | 3 +++
qapi/block-core.json | 33 ++++++++++++++++++++++++++++
qmp-commands.hx | 24 ++++++++++++++++++++
6 files changed, 139 insertions(+)
--
1.9.3
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH] block: add watermark event
2014-07-08 14:49 [Qemu-devel] [PATCH] add watermark reporting for block devices Francesco Romani
@ 2014-07-08 14:49 ` Francesco Romani
2014-07-08 15:10 ` Eric Blake
2014-08-01 11:39 ` Stefan Hajnoczi
2014-07-08 14:51 ` [Qemu-devel] [RFC] add watermark reporting for block devices Francesco Romani
1 sibling, 2 replies; 9+ messages in thread
From: Francesco Romani @ 2014-07-08 14:49 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Francesco Romani, mdroth, stefanha, lcapitulino
Managing applications, like oVirt (http://www.ovirt.org), make extensive
use of thin-provisioned disk images.
In order to let the guest run flawlessly and be not unnecessarily
paused, oVirt sets a watermark based on the percentage occupation of the
device against the advertised size, and automatically resizes the image
once the watermark is reached or exceeded.
In order to detect the mark crossing, the managing application has no
choice than aggressively polling the QEMU monitor using the
query-blockstats command. This lead to unnecessary system
load, and is made even worse under scale: scenarios
with hundreds of VM are becoming not unusual.
To fix this, this patch adds:
* A new monitor command to set a mark for a given block device.
* A new event to report if a block device usage exceeds the threshold.
This will allow the managing application to drop the polling
alltogether and just wait for a watermark crossing event.
Signed-off-by: Francesco Romani <fromani@redhat.com>
---
block.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++
blockdev.c | 21 ++++++++++++++++++
include/block/block.h | 2 ++
include/block/block_int.h | 3 +++
qapi/block-core.json | 33 ++++++++++++++++++++++++++++
qmp-commands.hx | 24 ++++++++++++++++++++
6 files changed, 139 insertions(+)
diff --git a/block.c b/block.c
index 8800a6b..cf34b7f 100644
--- a/block.c
+++ b/block.c
@@ -1983,6 +1983,8 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
bs_dest->device_list = bs_src->device_list;
memcpy(bs_dest->op_blockers, bs_src->op_blockers,
sizeof(bs_dest->op_blockers));
+
+ bs_dest->wr_watermark_perc = bs_src->wr_watermark_perc;
}
/*
@@ -5813,3 +5815,57 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
bdrv_flush_io_queue(bs->file);
}
}
+
+static bool watermark_exceeded(BlockDriverState *bs,
+ int64_t sector_num,
+ int nb_sectors)
+{
+
+ if (bs->wr_watermark_perc > 0) {
+ int64_t watermark = (bs->total_sectors) / 100 * bs->wr_watermark_perc;
+ if (sector_num >= watermark) {
+ return true;
+ }
+ }
+ return false;
+}
+
+static int coroutine_fn watermark_before_write_notify(NotifierWithReturn *notifier,
+ void *opaque)
+{
+ BdrvTrackedRequest *req = opaque;
+ int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
+ int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
+
+/* FIXME: needed? */
+ assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+ assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+
+ if (watermark_exceeded(req->bs, sector_num, nb_sectors)) {
+ BlockDriverState *bs = req->bs;
+ qapi_event_send_block_watermark(
+ bdrv_get_device_name(bs),
+ sector_num,
+ bs->wr_highest_sector,
+ &error_abort);
+ }
+
+ return 0; /* should always let other notifiers run */
+}
+
+void bdrv_set_watermark_perc(BlockDriverState *bs,
+ int watermark_perc)
+{
+ NotifierWithReturn before_write = {
+ .notify = watermark_before_write_notify,
+ };
+
+ if (watermark_perc <= 0) {
+ return;
+ }
+
+ if (bs->wr_watermark_perc == 0) {
+ bdrv_add_before_write_notifier(bs, &before_write);
+ }
+ bs->wr_watermark_perc = watermark_perc;
+}
diff --git a/blockdev.c b/blockdev.c
index 48bd9a3..ede21d9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2546,6 +2546,27 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
return dummy.next;
}
+void qmp_block_set_watermark(const char *device, int64_t watermark,
+ Error **errp)
+{
+ BlockDriverState *bs;
+ AioContext *aio_context;
+
+ bs = bdrv_find(device);
+ if (!bs) {
+ error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+ return;
+ }
+
+ aio_context = bdrv_get_aio_context(bs);
+ aio_context_acquire(aio_context);
+
+ bdrv_set_watermark_perc(bs, watermark);
+
+ aio_context_release(aio_context);
+}
+
+
QemuOptsList qemu_common_drive_opts = {
.name = "drive",
.head = QTAILQ_HEAD_INITIALIZER(qemu_common_drive_opts.head),
diff --git a/include/block/block.h b/include/block/block.h
index 32d3676..ff92ef9 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -588,4 +588,6 @@ void bdrv_io_plug(BlockDriverState *bs);
void bdrv_io_unplug(BlockDriverState *bs);
void bdrv_flush_io_queue(BlockDriverState *bs);
+void bdrv_set_watermark_perc(BlockDriverState *bs, int watermark_perc);
+
#endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f6c3bef..666ea1d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -393,6 +393,9 @@ struct BlockDriverState {
/* The error object in use for blocking operations on backing_hd */
Error *backing_blocker;
+
+ /* watermark limit for writes, percentage of sectors */
+ int wr_watermark_perc;
};
int get_tmp_filename(char *filename, int size);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index e378653..58e3b05 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1643,3 +1643,36 @@
'len' : 'int',
'offset': 'int',
'speed' : 'int' } }
+
+##
+# @BLOCK_WATERMARK
+#
+# Emitted when a block device reaches or exceeds the watermark limit.
+#
+# @device: device name
+#
+# @sector-num: number of the sector exceeding the threshold
+#
+# @sector-highest: number of the last highest written sector
+#
+# Since: 2.1
+##
+{ 'event': 'BLOCK_WATERMARK',
+ 'data': { 'device': 'str', 'sector-num': 'int', 'sector-highest': 'int' } }
+
+##
+# @block_set_watermark
+#
+# Change watermark percentage for a block drive.
+#
+# @device: The name of the device
+#
+# @watermark: high water mark, percentage
+#
+# Returns: Nothing on success
+# If @device is not a valid block device, DeviceNotFound
+#
+# Since: 2.1
+##
+{ 'command': 'block_set_watermark',
+ 'data': { 'device': 'str', 'watermark': 'int' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 4be4765..89fee40 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3755,3 +3755,27 @@ Example:
<- { "return": {} }
EQMP
+
+ {
+ .name = "block_set_watermark",
+ .args_type = "device:B,watermark:l",
+ .mhandler.cmd_new = qmp_marshal_input_block_set_watermark,
+ },
+
+SQMP
+block_set_watermark
+------------
+
+Change the high water mark for a block drive.
+
+Arguments:
+
+- "device": device name (json-string)
+- "watermark": the high water mark in percentage (json-int)
+
+Example:
+
+-> { "execute": "block_set_watermark", "arguments": { "device": "virtio0", "watermark": 75 } }
+<- { "return": {} }
+
+EQMP
--
1.9.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [Qemu-devel] [RFC] add watermark reporting for block devices
2014-07-08 14:49 [Qemu-devel] [PATCH] add watermark reporting for block devices Francesco Romani
2014-07-08 14:49 ` [Qemu-devel] [PATCH] block: add watermark event Francesco Romani
@ 2014-07-08 14:51 ` Francesco Romani
1 sibling, 0 replies; 9+ messages in thread
From: Francesco Romani @ 2014-07-08 14:51 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, mdroth, stefanha, lcapitulino
Sorry, this is actually an RFC; patch was posted separately.
----- Original Message -----
> From: "Francesco Romani" <fromani@redhat.com>
> To: qemu-devel@nongnu.org
> Cc: kwolf@redhat.com, stefanha@redhat.com, lcapitulino@redhat.com, mdroth@linux.vnet.ibm.com, "Francesco Romani"
> <fromani@redhat.com>
> Sent: Tuesday, July 8, 2014 4:49:23 PM
> Subject: [PATCH] add watermark reporting for block devices
>
> Hello everyone
>
> I'm one of the oVirt developers (http://www.ovirt.org);
> oVirt is a virtualization management application built
> around qemu/kvm, so it is nice to get in touch :)
>
> We have begun a big scalability improvement effort, aiming to
> support without problems hundreds of VMs per host, with plans
> to support thousands in a not so distant future.
> In doing so, we are reviewing our usage flows.
>
> One of them is thin-provisioned storage, which is used
> quite extensively, with block devices (ISCSI for example)
> and COW images.
> When using thin provisioning, oVirt tries hard to hide this
> fact from the guest OS, and to do so watches closely
> the usage of the device, and resize it when its usage exceeds
> a configured threshold (the "high water mark"), in order
> to avoid the guest OS to get paused for space exhausted.
>
> To do the watching, we poll he devices using libvirt
> (virDomainGetBlockInfo), which in turn uses query-blockstats.
> This is suboptimal with just one VM, but with hundereds of them,
> let alone thousands, it doesn't scale and it is quite a resource
> hog.
>
> Would be great to have this watermark concept supported into qemu,
> with a new event to be raised when the limit is crossed.
>
> To track this RFE I filed https://bugs.launchpad.net/qemu/+bug/1338957
>
> Moreover, I had the chance to take a look at the QEMU sources
> and come up with this tentative patch which I'd also like
> to submit.
>
> Comments and thoughts very welcome!
>
> Thanks and best regards,
>
> Francesco Romani (1):
> block: add watermark event
>
> block.c | 56
> +++++++++++++++++++++++++++++++++++++++++++++++
> blockdev.c | 21 ++++++++++++++++++
> include/block/block.h | 2 ++
> include/block/block_int.h | 3 +++
> qapi/block-core.json | 33 ++++++++++++++++++++++++++++
> qmp-commands.hx | 24 ++++++++++++++++++++
> 6 files changed, 139 insertions(+)
>
> --
> 1.9.3
>
>
--
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] block: add watermark event
2014-07-08 14:49 ` [Qemu-devel] [PATCH] block: add watermark event Francesco Romani
@ 2014-07-08 15:10 ` Eric Blake
2014-08-01 11:39 ` Stefan Hajnoczi
1 sibling, 0 replies; 9+ messages in thread
From: Eric Blake @ 2014-07-08 15:10 UTC (permalink / raw)
To: Francesco Romani, qemu-devel; +Cc: kwolf, mdroth, stefanha, lcapitulino
[-- Attachment #1: Type: text/plain, Size: 3793 bytes --]
On 07/08/2014 08:49 AM, Francesco Romani wrote:
> Managing applications, like oVirt (http://www.ovirt.org), make extensive
> use of thin-provisioned disk images.
> In order to let the guest run flawlessly and be not unnecessarily
> paused, oVirt sets a watermark based on the percentage occupation of the
> device against the advertised size, and automatically resizes the image
> once the watermark is reached or exceeded.
>
> In order to detect the mark crossing, the managing application has no
> choice than aggressively polling the QEMU monitor using the
> query-blockstats command. This lead to unnecessary system
> load, and is made even worse under scale: scenarios
> with hundreds of VM are becoming not unusual.
>
> To fix this, this patch adds:
> * A new monitor command to set a mark for a given block device.
> * A new event to report if a block device usage exceeds the threshold.
>
> This will allow the managing application to drop the polling
> alltogether and just wait for a watermark crossing event.
s/alltogether/altogether/
>
> Signed-off-by: Francesco Romani <fromani@redhat.com>
> ---
> block.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++
> blockdev.c | 21 ++++++++++++++++++
> include/block/block.h | 2 ++
> include/block/block_int.h | 3 +++
> qapi/block-core.json | 33 ++++++++++++++++++++++++++++
> qmp-commands.hx | 24 ++++++++++++++++++++
> 6 files changed, 139 insertions(+)
>
> +void bdrv_set_watermark_perc(BlockDriverState *bs,
> + int watermark_perc)
> +{
> + NotifierWithReturn before_write = {
> + .notify = watermark_before_write_notify,
> + };
> +
> + if (watermark_perc <= 0) {
> + return;
> + }
Shouldn't you uninstall the write_notifier if someone is changing the
watermark_perc from non-zero back to zero?
> +++ b/include/block/block_int.h
> @@ -393,6 +393,9 @@ struct BlockDriverState {
>
> /* The error object in use for blocking operations on backing_hd */
> Error *backing_blocker;
> +
> + /* watermark limit for writes, percentage of sectors */
> + int wr_watermark_perc;
I didn't see any code that ensures that this limit is between 0 and 100;
since it is under user control, your code probably misbehaves for values
such as 101.
> +
> +##
> +# @BLOCK_WATERMARK
> +#
> +# Emitted when a block device reaches or exceeds the watermark limit.
> +#
> +# @device: device name
> +#
> +# @sector-num: number of the sector exceeding the threshold
> +#
> +# @sector-highest: number of the last highest written sector
> +#
> +# Since: 2.1
2.2. You've missed hard freeze for 2.1.
> +##
> +{ 'event': 'BLOCK_WATERMARK',
> + 'data': { 'device': 'str', 'sector-num': 'int', 'sector-highest': 'int' } }
> +
> +##
> +# @block_set_watermark
s/_/-/2 - new commands should use - in their name.
> +#
> +# Change watermark percentage for a block drive.
> +#
> +# @device: The name of the device
> +#
> +# @watermark: high water mark, percentage
Is percentage the right unit? On a 10G disk, that means you only have a
granularity down to 100M. Should the watermark limit instead be
expressed as a byte value instead of a percentage?
> +#
> +# Returns: Nothing on success
> +# If @device is not a valid block device, DeviceNotFound
> +#
> +# Since: 2.1
2.2.
> +##
> +{ 'command': 'block_set_watermark',
> + 'data': { 'device': 'str', 'watermark': 'int' } }
I hate write-only commands. Which query-* command are you modifying to
output the current watermark level?
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] block: add watermark event
2014-07-08 14:49 ` [Qemu-devel] [PATCH] block: add watermark event Francesco Romani
2014-07-08 15:10 ` Eric Blake
@ 2014-08-01 11:39 ` Stefan Hajnoczi
2014-08-05 8:47 ` Kevin Wolf
1 sibling, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-08-01 11:39 UTC (permalink / raw)
To: Francesco Romani; +Cc: kwolf, mdroth, qemu-devel, lcapitulino
[-- Attachment #1: Type: text/plain, Size: 2931 bytes --]
On Tue, Jul 08, 2014 at 04:49:24PM +0200, Francesco Romani wrote:
> @@ -5813,3 +5815,57 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
> bdrv_flush_io_queue(bs->file);
> }
> }
> +
> +static bool watermark_exceeded(BlockDriverState *bs,
> + int64_t sector_num,
> + int nb_sectors)
> +{
> +
> + if (bs->wr_watermark_perc > 0) {
> + int64_t watermark = (bs->total_sectors) / 100 * bs->wr_watermark_perc;
bs->total_sectors should not be used directly.
Have you considered making the watermark parameter take sector units
instead of a percentage?
I'm not sure whether a precentage makes sense because 25% of a 10GB
image is 2.5 GB so a 75% watermark might be reasonable. 25% of a 1 TB
image is 250 GB and that's probably not a reasonable watermark.
So let the block-set-watermark caller pass an absolute sector number
instead. It keeps things simple for both QEMU and thin provisioning
manager.
> + if (sector_num >= watermark) {
> + return true;
> + }
> + }
> + return false;
> +}
> +
> +static int coroutine_fn watermark_before_write_notify(NotifierWithReturn *notifier,
> + void *opaque)
> +{
> + BdrvTrackedRequest *req = opaque;
> + int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
> + int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
> +
> +/* FIXME: needed? */
> + assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> + assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
Not really needed here. Emulated storage controllers either get
requests in block units (i.e. they are automatically aligned) or check
them (like virtio-blk).
I guess there's no harm in checking, but I would drop it.
> +
> + if (watermark_exceeded(req->bs, sector_num, nb_sectors)) {
> + BlockDriverState *bs = req->bs;
> + qapi_event_send_block_watermark(
> + bdrv_get_device_name(bs),
> + sector_num,
> + bs->wr_highest_sector,
> + &error_abort);
How do you prevent flooding events if every write request exceeds the
watermark?
Perhaps the watermark should be disabled until block-set-watermark is
called again.
> + }
> +
> + return 0; /* should always let other notifiers run */
> +}
> +
> +void bdrv_set_watermark_perc(BlockDriverState *bs,
> + int watermark_perc)
> +{
> + NotifierWithReturn before_write = {
> + .notify = watermark_before_write_notify,
> + };
> +
> + if (watermark_perc <= 0) {
> + return;
> + }
> +
> + if (bs->wr_watermark_perc == 0) {
> + bdrv_add_before_write_notifier(bs, &before_write);
before_write must be a BlockDriverState field so it has the correct
lifetime. In this patch before_write is allocated on the stack and will
cause invalid memory accesses once we leave this function.
[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] block: add watermark event
2014-08-01 11:39 ` Stefan Hajnoczi
@ 2014-08-05 8:47 ` Kevin Wolf
2014-08-05 13:08 ` Stefan Hajnoczi
0 siblings, 1 reply; 9+ messages in thread
From: Kevin Wolf @ 2014-08-05 8:47 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: mdroth, Francesco Romani, qemu-devel, lcapitulino
[-- Attachment #1: Type: text/plain, Size: 3050 bytes --]
Am 01.08.2014 um 13:39 hat Stefan Hajnoczi geschrieben:
> On Tue, Jul 08, 2014 at 04:49:24PM +0200, Francesco Romani wrote:
> > @@ -5813,3 +5815,57 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
> > bdrv_flush_io_queue(bs->file);
> > }
> > }
> > +
> > +static bool watermark_exceeded(BlockDriverState *bs,
> > + int64_t sector_num,
> > + int nb_sectors)
> > +{
> > +
> > + if (bs->wr_watermark_perc > 0) {
> > + int64_t watermark = (bs->total_sectors) / 100 * bs->wr_watermark_perc;
>
> bs->total_sectors should not be used directly.
>
> Have you considered making the watermark parameter take sector units
> instead of a percentage?
>
> I'm not sure whether a precentage makes sense because 25% of a 10GB
> image is 2.5 GB so a 75% watermark might be reasonable. 25% of a 1 TB
> image is 250 GB and that's probably not a reasonable watermark.
>
> So let the block-set-watermark caller pass an absolute sector number
> instead. It keeps things simple for both QEMU and thin provisioning
> manager.
No sector numbers in external interfaces, please. These units of 512
bytes are completely arbitrary and don't make any sense. I hope to get
rid of BDRV_SECTOR_* eventually even internally.
So for external APIs, please use bytes instead.
> > + if (sector_num >= watermark) {
> > + return true;
> > + }
> > + }
> > + return false;
> > +}
> > +
> > +static int coroutine_fn watermark_before_write_notify(NotifierWithReturn *notifier,
> > + void *opaque)
> > +{
> > + BdrvTrackedRequest *req = opaque;
> > + int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
> > + int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
> > +
> > +/* FIXME: needed? */
> > + assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
> > + assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
>
> Not really needed here. Emulated storage controllers either get
> requests in block units (i.e. they are automatically aligned) or check
> them (like virtio-blk).
But we don't only get requests from emulated storage controllers. You
cannot reasonably assume anything here (the requests are aligned
according to the requirements of the backend, but that may just be 1).
Again, do your calculations in a byte granularity to be safe here.
> I guess there's no harm in checking, but I would drop it.
>
> > +
> > + if (watermark_exceeded(req->bs, sector_num, nb_sectors)) {
> > + BlockDriverState *bs = req->bs;
> > + qapi_event_send_block_watermark(
> > + bdrv_get_device_name(bs),
> > + sector_num,
> > + bs->wr_highest_sector,
> > + &error_abort);
>
> How do you prevent flooding events if every write request exceeds the
> watermark?
>
> Perhaps the watermark should be disabled until block-set-watermark is
> called again.
I agree.
Kevin
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] block: add watermark event
2014-08-05 8:47 ` Kevin Wolf
@ 2014-08-05 13:08 ` Stefan Hajnoczi
2014-08-08 8:01 ` Francesco Romani
0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-08-05 13:08 UTC (permalink / raw)
To: Kevin Wolf; +Cc: mdroth, Francesco Romani, qemu-devel, lcapitulino
[-- Attachment #1: Type: text/plain, Size: 1580 bytes --]
On Tue, Aug 05, 2014 at 10:47:57AM +0200, Kevin Wolf wrote:
> Am 01.08.2014 um 13:39 hat Stefan Hajnoczi geschrieben:
> > On Tue, Jul 08, 2014 at 04:49:24PM +0200, Francesco Romani wrote:
> > > @@ -5813,3 +5815,57 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
> > > bdrv_flush_io_queue(bs->file);
> > > }
> > > }
> > > +
> > > +static bool watermark_exceeded(BlockDriverState *bs,
> > > + int64_t sector_num,
> > > + int nb_sectors)
> > > +{
> > > +
> > > + if (bs->wr_watermark_perc > 0) {
> > > + int64_t watermark = (bs->total_sectors) / 100 * bs->wr_watermark_perc;
> >
> > bs->total_sectors should not be used directly.
> >
> > Have you considered making the watermark parameter take sector units
> > instead of a percentage?
> >
> > I'm not sure whether a precentage makes sense because 25% of a 10GB
> > image is 2.5 GB so a 75% watermark might be reasonable. 25% of a 1 TB
> > image is 250 GB and that's probably not a reasonable watermark.
> >
> > So let the block-set-watermark caller pass an absolute sector number
> > instead. It keeps things simple for both QEMU and thin provisioning
> > manager.
>
> No sector numbers in external interfaces, please. These units of 512
> bytes are completely arbitrary and don't make any sense. I hope to get
> rid of BDRV_SECTOR_* eventually even internally.
>
> So for external APIs, please use bytes instead.
I agree and forgot about that. Please use bytes instead of sectors or a
percentage.
[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] block: add watermark event
2014-08-05 13:08 ` Stefan Hajnoczi
@ 2014-08-08 8:01 ` Francesco Romani
2014-08-08 12:51 ` Eric Blake
0 siblings, 1 reply; 9+ messages in thread
From: Francesco Romani @ 2014-08-08 8:01 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Kevin Wolf, lcapitulino, mdroth, qemu-devel
----- Original Message -----
> From: "Stefan Hajnoczi" <stefanha@redhat.com>
> To: "Kevin Wolf" <kwolf@redhat.com>
> Cc: mdroth@linux.vnet.ibm.com, "Francesco Romani" <fromani@redhat.com>, qemu-devel@nongnu.org, lcapitulino@redhat.com
> Sent: Tuesday, August 5, 2014 3:08:46 PM
> Subject: Re: [Qemu-devel] [PATCH] block: add watermark event
>
> On Tue, Aug 05, 2014 at 10:47:57AM +0200, Kevin Wolf wrote:
> > Am 01.08.2014 um 13:39 hat Stefan Hajnoczi geschrieben:
> > > On Tue, Jul 08, 2014 at 04:49:24PM +0200, Francesco Romani wrote:
> > > > @@ -5813,3 +5815,57 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
> > > > bdrv_flush_io_queue(bs->file);
> > > > }
> > > > }
> > > > +
> > > > +static bool watermark_exceeded(BlockDriverState *bs,
> > > > + int64_t sector_num,
> > > > + int nb_sectors)
> > > > +{
> > > > +
> > > > + if (bs->wr_watermark_perc > 0) {
> > > > + int64_t watermark = (bs->total_sectors) / 100 *
> > > > bs->wr_watermark_perc;
> > >
> > > bs->total_sectors should not be used directly.
> > >
> > > Have you considered making the watermark parameter take sector units
> > > instead of a percentage?
> > >
> > > I'm not sure whether a precentage makes sense because 25% of a 10GB
> > > image is 2.5 GB so a 75% watermark might be reasonable. 25% of a 1 TB
> > > image is 250 GB and that's probably not a reasonable watermark.
> > >
> > > So let the block-set-watermark caller pass an absolute sector number
> > > instead. It keeps things simple for both QEMU and thin provisioning
> > > manager.
> >
> > No sector numbers in external interfaces, please. These units of 512
> > bytes are completely arbitrary and don't make any sense. I hope to get
> > rid of BDRV_SECTOR_* eventually even internally.
> >
> > So for external APIs, please use bytes instead.
>
> I agree and forgot about that. Please use bytes instead of sectors or a
> percentage.
>
Thanks everyone for the great feedback received!
I'll post asap a new patch addressing all the comments.
I'll also change the name because 'watermark' may be misleading/wrong jargon :)
(http://en.wikipedia.org/wiki/Watermark)
Bests,
--
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] block: add watermark event
2014-08-08 8:01 ` Francesco Romani
@ 2014-08-08 12:51 ` Eric Blake
0 siblings, 0 replies; 9+ messages in thread
From: Eric Blake @ 2014-08-08 12:51 UTC (permalink / raw)
To: Francesco Romani, Stefan Hajnoczi
Cc: Kevin Wolf, qemu-devel, mdroth, lcapitulino
[-- Attachment #1: Type: text/plain, Size: 995 bytes --]
On 08/08/2014 02:01 AM, Francesco Romani wrote:
>>>> So let the block-set-watermark caller pass an absolute sector number
>>>> instead. It keeps things simple for both QEMU and thin provisioning
>>>> manager.
>>>
>>> No sector numbers in external interfaces, please. These units of 512
>>> bytes are completely arbitrary and don't make any sense. I hope to get
>>> rid of BDRV_SECTOR_* eventually even internally.
>>>
>>> So for external APIs, please use bytes instead.
>>
>> I agree and forgot about that. Please use bytes instead of sectors or a
>> percentage.
>>
>
> Thanks everyone for the great feedback received!
>
> I'll post asap a new patch addressing all the comments.
>
> I'll also change the name because 'watermark' may be misleading/wrong jargon :)
> (http://en.wikipedia.org/wiki/Watermark)
Is "threshold" a better name than "watermark"?
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 539 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-08-08 12:51 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-08 14:49 [Qemu-devel] [PATCH] add watermark reporting for block devices Francesco Romani
2014-07-08 14:49 ` [Qemu-devel] [PATCH] block: add watermark event Francesco Romani
2014-07-08 15:10 ` Eric Blake
2014-08-01 11:39 ` Stefan Hajnoczi
2014-08-05 8:47 ` Kevin Wolf
2014-08-05 13:08 ` Stefan Hajnoczi
2014-08-08 8:01 ` Francesco Romani
2014-08-08 12:51 ` Eric Blake
2014-07-08 14:51 ` [Qemu-devel] [RFC] add watermark reporting for block devices Francesco Romani
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).