From: Eric Blake <eblake@redhat.com>
To: Francesco Romani <fromani@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, mdroth@linux.vnet.ibm.com, stefanha@redhat.com,
lcapitulino@redhat.com
Subject: Re: [Qemu-devel] [PATCH] block: add watermark event
Date: Tue, 08 Jul 2014 09:10:57 -0600 [thread overview]
Message-ID: <53BC0A01.8020609@redhat.com> (raw)
In-Reply-To: <1404830964-10733-2-git-send-email-fromani@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 3793 bytes --]
On 07/08/2014 08:49 AM, Francesco Romani wrote:
> Managing applications, like oVirt (http://www.ovirt.org), make extensive
> use of thin-provisioned disk images.
> In order to let the guest run flawlessly and be not unnecessarily
> paused, oVirt sets a watermark based on the percentage occupation of the
> device against the advertised size, and automatically resizes the image
> once the watermark is reached or exceeded.
>
> In order to detect the mark crossing, the managing application has no
> choice than aggressively polling the QEMU monitor using the
> query-blockstats command. This lead to unnecessary system
> load, and is made even worse under scale: scenarios
> with hundreds of VM are becoming not unusual.
>
> To fix this, this patch adds:
> * A new monitor command to set a mark for a given block device.
> * A new event to report if a block device usage exceeds the threshold.
>
> This will allow the managing application to drop the polling
> alltogether and just wait for a watermark crossing event.
s/alltogether/altogether/
>
> Signed-off-by: Francesco Romani <fromani@redhat.com>
> ---
> block.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++
> blockdev.c | 21 ++++++++++++++++++
> include/block/block.h | 2 ++
> include/block/block_int.h | 3 +++
> qapi/block-core.json | 33 ++++++++++++++++++++++++++++
> qmp-commands.hx | 24 ++++++++++++++++++++
> 6 files changed, 139 insertions(+)
>
> +void bdrv_set_watermark_perc(BlockDriverState *bs,
> + int watermark_perc)
> +{
> + NotifierWithReturn before_write = {
> + .notify = watermark_before_write_notify,
> + };
> +
> + if (watermark_perc <= 0) {
> + return;
> + }
Shouldn't you uninstall the write_notifier if someone is changing the
watermark_perc from non-zero back to zero?
> +++ b/include/block/block_int.h
> @@ -393,6 +393,9 @@ struct BlockDriverState {
>
> /* The error object in use for blocking operations on backing_hd */
> Error *backing_blocker;
> +
> + /* watermark limit for writes, percentage of sectors */
> + int wr_watermark_perc;
I didn't see any code that ensures that this limit is between 0 and 100;
since it is under user control, your code probably misbehaves for values
such as 101.
> +
> +##
> +# @BLOCK_WATERMARK
> +#
> +# Emitted when a block device reaches or exceeds the watermark limit.
> +#
> +# @device: device name
> +#
> +# @sector-num: number of the sector exceeding the threshold
> +#
> +# @sector-highest: number of the last highest written sector
> +#
> +# Since: 2.1
2.2. You've missed hard freeze for 2.1.
> +##
> +{ 'event': 'BLOCK_WATERMARK',
> + 'data': { 'device': 'str', 'sector-num': 'int', 'sector-highest': 'int' } }
> +
> +##
> +# @block_set_watermark
s/_/-/2 - new commands should use - in their name.
> +#
> +# Change watermark percentage for a block drive.
> +#
> +# @device: The name of the device
> +#
> +# @watermark: high water mark, percentage
Is percentage the right unit? On a 10G disk, that means you only have a
granularity down to 100M. Should the watermark limit instead be
expressed as a byte value instead of a percentage?
> +#
> +# Returns: Nothing on success
> +# If @device is not a valid block device, DeviceNotFound
> +#
> +# Since: 2.1
2.2.
> +##
> +{ 'command': 'block_set_watermark',
> + 'data': { 'device': 'str', 'watermark': 'int' } }
I hate write-only commands. Which query-* command are you modifying to
output the current watermark level?
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
next prev parent reply other threads:[~2014-07-08 15:11 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-08 14:49 [Qemu-devel] [PATCH] add watermark reporting for block devices Francesco Romani
2014-07-08 14:49 ` [Qemu-devel] [PATCH] block: add watermark event Francesco Romani
2014-07-08 15:10 ` Eric Blake [this message]
2014-08-01 11:39 ` Stefan Hajnoczi
2014-08-05 8:47 ` Kevin Wolf
2014-08-05 13:08 ` Stefan Hajnoczi
2014-08-08 8:01 ` Francesco Romani
2014-08-08 12:51 ` Eric Blake
2014-07-08 14:51 ` [Qemu-devel] [RFC] add watermark reporting for block devices Francesco Romani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53BC0A01.8020609@redhat.com \
--to=eblake@redhat.com \
--cc=fromani@redhat.com \
--cc=kwolf@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=mdroth@linux.vnet.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).