qemu-devel.nongnu.org archive mirror
From: Anthony Liguori <anthony@codemonkey.ws>
To: Adam Litke <agl@us.ibm.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>,
	Luiz Capitulino <lcapitulino@redhat.com>,
	qemu-devel@nongnu.org, Avi Kivity <avi@redhat.com>
Subject: Re: [Qemu-devel] virtio: Add memory statistics reporting to the balloon driver (V8)
Date: Tue, 26 Jan 2010 18:08:02 -0600
Message-ID: <4B5F83E2.1060109@codemonkey.ws>
In-Reply-To: <1264537055.2890.40.camel@aglitke>

On 01/26/2010 02:17 PM, Adam Litke wrote:
> The changes in V8 of this patch are related to the monitor infrastructure.  No
> changes to the virtio interface core have been made since V4.  This is intended
> to apply on top of my API for asynchronous monitor commands patch.
>
> Changes since V7:
>   - Ported to the asynchronous monitor API
>
> Changes since V6:
>   - Integrated with virtio qdev feature bit changes
>     (specifically: Use VirtIODevice 'guest_features' to check if memory stats
>     is a negotiated feature)
>   - Track which monitor initiated the most recent stats request.  Now it does the
>     Right Thing(tm) with multiple monitors making parallel requests.
>
> Changes since V5:
>   - Asynchronous query-balloon mode for QMP
>   - Add timeout to prevent hanging the user monitor in synchronous mode
>
> Changes since V4:
>   - Virtio spec updated: http://ozlabs.org/~rusty/virtio-spec/virtio-spec-0.8.2.pdf
>   - Guest-side Linux implementation applied by Rusty
>   - Start using the QObject infrastructure
>   - All endian conversions done in the host
>   - Report stats that reference a quantity of memory in bytes
>
> Changes since V3:
>   - Increase stat field size to 64 bits
>   - Report all sizes in kb (not pages)
>   - Drop anon_pages stat
>
> Changes since V2:
>   - Use a virtqueue for communication instead of the device config space
>
> Changes since V1:
>   - In the monitor, print all stats on one line with less abbreviated names
>   - Coding style changes
>
> When using ballooning to manage overcommitted memory on a host, a system for
> guests to communicate their memory usage to the host can provide information
> that will minimize the impact of ballooning on the guests.  The current method
> employs a daemon running in each guest that communicates memory statistics to a
> host daemon at a specified time interval.  The host daemon aggregates this
> information and inflates and/or deflates balloons according to the level of
> host memory pressure.  This approach is effective but overly complex since a
> daemon must be installed inside each guest and coordinated to communicate with
> the host.  A simpler approach is to collect memory statistics in the virtio
> balloon driver and communicate them directly to the hypervisor.
>
> Signed-off-by: Adam Litke <agl@us.ibm.com>
> To: Anthony Liguori <aliguori@us.ibm.com>
> Cc: Avi Kivity <avi@redhat.com>
> Cc: Luiz Capitulino <lcapitulino@redhat.com>
> Cc: qemu-devel@nongnu.org
>    

Applied.  Thanks.

Regards,

Anthony Liguori
> diff --git a/balloon.h b/balloon.h
> index 60b4a5d..c3a1ad3 100644
> --- a/balloon.h
> +++ b/balloon.h
> @@ -16,12 +16,13 @@
>
>   #include "cpu-defs.h"
>
> -typedef ram_addr_t (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
> +typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target,
> +                                MonitorCompletion cb, void *cb_data);
>
>   void qemu_add_balloon_handler(QEMUBalloonEvent *func, void *opaque);
>
> -void qemu_balloon(ram_addr_t target);
> +int qemu_balloon(ram_addr_t target, MonitorCompletion cb, void *opaque);
>
> -ram_addr_t qemu_balloon_status(void);
> +int qemu_balloon_status(MonitorCompletion cb, void *opaque);
>
>   #endif
> diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
> index e17880f..086d9d1 100644
> --- a/hw/virtio-balloon.c
> +++ b/hw/virtio-balloon.c
> @@ -16,9 +16,13 @@
>   #include "pc.h"
>   #include "sysemu.h"
>   #include "cpu.h"
> +#include "monitor.h"
>   #include "balloon.h"
>   #include "virtio-balloon.h"
>   #include "kvm.h"
> +#include "qlist.h"
> +#include "qint.h"
> +#include "qstring.h"
>
>   #if defined(__linux__)
>   #include <sys/mman.h>
> @@ -27,9 +31,14 @@
>   typedef struct VirtIOBalloon
>   {
>       VirtIODevice vdev;
> -    VirtQueue *ivq, *dvq;
> +    VirtQueue *ivq, *dvq, *svq;
>       uint32_t num_pages;
>       uint32_t actual;
> +    uint64_t stats[VIRTIO_BALLOON_S_NR];
> +    VirtQueueElement stats_vq_elem;
> +    size_t stats_vq_offset;
> +    MonitorCompletion *stats_callback;
> +    void *stats_opaque_callback_data;
>   } VirtIOBalloon;
>
>   static VirtIOBalloon *to_virtio_balloon(VirtIODevice *vdev)
> @@ -46,6 +55,42 @@ static void balloon_page(void *addr, int deflate)
>   #endif
>   }
>
> +/*
> + * reset_stats - Mark all items in the stats array as unset
> + *
> + * This function needs to be called at device initialization and
> + * before updating to a set of newly-generated stats.  This will ensure that no
> + * stale values stick around in case the guest reports a subset of the supported
> + * statistics.
> + */
> +static inline void reset_stats(VirtIOBalloon *dev)
> +{
> +    int i;
> +    for (i = 0; i < VIRTIO_BALLOON_S_NR; dev->stats[i++] = -1);
> +}
> +
> +static void stat_put(QDict *dict, const char *label, uint64_t val)
> +{
> +    if (val != -1)
> +        qdict_put(dict, label, qint_from_int(val));
> +}
> +
> +static QObject *get_stats_qobject(VirtIOBalloon *dev)
> +{
> +    QDict *dict = qdict_new();
> +    uint32_t actual = ram_size - (dev->actual << VIRTIO_BALLOON_PFN_SHIFT);
> +
> +    stat_put(dict, "actual", actual);
> +    stat_put(dict, "mem_swapped_in", dev->stats[VIRTIO_BALLOON_S_SWAP_IN]);
> +    stat_put(dict, "mem_swapped_out", dev->stats[VIRTIO_BALLOON_S_SWAP_OUT]);
> +    stat_put(dict, "major_page_faults", dev->stats[VIRTIO_BALLOON_S_MAJFLT]);
> +    stat_put(dict, "minor_page_faults", dev->stats[VIRTIO_BALLOON_S_MINFLT]);
> +    stat_put(dict, "free_mem", dev->stats[VIRTIO_BALLOON_S_MEMFREE]);
> +    stat_put(dict, "total_mem", dev->stats[VIRTIO_BALLOON_S_MEMTOT]);
> +
> +    return QOBJECT(dict);
> +}
> +
>   /* FIXME: once we do a virtio refactoring, this will get subsumed into common
>    * code */
>   static size_t memcpy_from_iovector(void *data, size_t offset, size_t size,
> @@ -104,6 +149,51 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>       }
>   }
>
> +static void complete_stats_request(VirtIOBalloon *vb)
> +{
> +    QObject *stats;
> +
> +    if (!vb->stats_opaque_callback_data)
> +        return;
> +
> +    stats = get_stats_qobject(vb);
> +    vb->stats_callback(vb->stats_opaque_callback_data, stats);
> +    qobject_decref(stats);
> +    vb->stats_opaque_callback_data = NULL;
> +    vb->stats_callback = NULL;
> +}
> +
> +static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    VirtIOBalloon *s = DO_UPCAST(VirtIOBalloon, vdev, vdev);
> +    VirtQueueElement *elem = &s->stats_vq_elem;
> +    VirtIOBalloonStat stat;
> +    size_t offset = 0;
> +
> +    if (!virtqueue_pop(vq, elem)) {
> +        return;
> +    }
> +
> +    /* Initialize the stats to get rid of any stale values.  This is only
> +     * needed to handle the case where a guest supports fewer stats than it
> +     * used to (ie. it has booted into an old kernel).
> +     */
> +    reset_stats(s);
> +
> +    while (memcpy_from_iovector(&stat, offset, sizeof(stat), elem->out_sg,
> +                                elem->out_num) == sizeof(stat)) {
> +        uint16_t tag = tswap16(stat.tag);
> +        uint64_t val = tswap64(stat.val);
> +
> +        offset += sizeof(stat);
> +        if (tag < VIRTIO_BALLOON_S_NR)
> +            s->stats[tag] = val;
> +    }
> +    s->stats_vq_offset = offset;
> +
> +    complete_stats_request(s);
> +}
> +
>   static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>   {
>       VirtIOBalloon *dev = to_virtio_balloon(vdev);
> @@ -126,10 +216,12 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
>
>   static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f)
>   {
> +    f |= (1 << VIRTIO_BALLOON_F_STATS_VQ);
>       return f;
>   }
>
> -static ram_addr_t virtio_balloon_to_target(void *opaque, ram_addr_t target)
> +static void virtio_balloon_to_target(void *opaque, ram_addr_t target,
> +                                     MonitorCompletion cb, void *cb_data)
>   {
>       VirtIOBalloon *dev = opaque;
>
> @@ -139,9 +231,26 @@ static ram_addr_t virtio_balloon_to_target(void *opaque, ram_addr_t target)
>       if (target) {
>           dev->num_pages = (ram_size - target) >> VIRTIO_BALLOON_PFN_SHIFT;
>           virtio_notify_config(&dev->vdev);
> +    } else {
> +        /* For now, only allow one request at a time.  This restriction can be
> +         * removed later by queueing callback and data pairs.
> +         */
> +        if (dev->stats_callback != NULL) {
> +            return;
> +        }
> +        dev->stats_callback = cb;
> +        dev->stats_opaque_callback_data = cb_data;
> +        if (dev->vdev.guest_features & (1 << VIRTIO_BALLOON_F_STATS_VQ)) {
> +            virtqueue_push(dev->svq, &dev->stats_vq_elem, dev->stats_vq_offset);
> +            virtio_notify(&dev->vdev, dev->svq);
> +        } else {
> +            /* Stats are not supported.  Clear out any stale values that might
> +             * have been set by a more featureful guest kernel.
> +             */
> +            reset_stats(dev);
> +            complete_stats_request(dev);
> +        }
>       }
> -
> -    return ram_size - (dev->actual << VIRTIO_BALLOON_PFN_SHIFT);
>   }
>
>   static void virtio_balloon_save(QEMUFile *f, void *opaque)
> @@ -152,6 +261,10 @@ static void virtio_balloon_save(QEMUFile *f, void *opaque)
>
>       qemu_put_be32(f, s->num_pages);
>       qemu_put_be32(f, s->actual);
> +    qemu_put_buffer(f, (uint8_t *)&s->stats_vq_elem, sizeof(VirtQueueElement));
> +    qemu_put_buffer(f, (uint8_t *)&s->stats_vq_offset, sizeof(size_t));
> +    qemu_put_buffer(f, (uint8_t *)&s->stats_callback, sizeof(MonitorCompletion));
> +    qemu_put_buffer(f, (uint8_t *)&s->stats_opaque_callback_data, sizeof(void));
>   }
>
>   static int virtio_balloon_load(QEMUFile *f, void *opaque, int version_id)
> @@ -165,6 +278,10 @@ static int virtio_balloon_load(QEMUFile *f, void *opaque, int version_id)
>
>       s->num_pages = qemu_get_be32(f);
>       s->actual = qemu_get_be32(f);
> +    qemu_get_buffer(f, (uint8_t *)&s->stats_vq_elem, sizeof(VirtQueueElement));
> +    qemu_get_buffer(f, (uint8_t *)&s->stats_vq_offset, sizeof(size_t));
> +    qemu_get_buffer(f, (uint8_t *)&s->stats_callback, sizeof(MonitorCompletion));
> +    qemu_get_buffer(f, (uint8_t *)&s->stats_opaque_callback_data, sizeof(void));
>
>       return 0;
>   }
> @@ -183,7 +300,9 @@ VirtIODevice *virtio_balloon_init(DeviceState *dev)
>
>       s->ivq = virtio_add_queue(&s->vdev, 128, virtio_balloon_handle_output);
>       s->dvq = virtio_add_queue(&s->vdev, 128, virtio_balloon_handle_output);
> +    s->svq = virtio_add_queue(&s->vdev, 128, virtio_balloon_receive_stats);
>
> +    reset_stats(s);
>       qemu_add_balloon_handler(virtio_balloon_to_target, s);
>
>       register_savevm("virtio-balloon", -1, 1, virtio_balloon_save, virtio_balloon_load, s);
> diff --git a/hw/virtio-balloon.h b/hw/virtio-balloon.h
> index 9a0d119..e20cf6b 100644
> --- a/hw/virtio-balloon.h
> +++ b/hw/virtio-balloon.h
> @@ -25,6 +25,7 @@
>
>   /* The feature bitmap for virtio balloon */
>   #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
> +#define VIRTIO_BALLOON_F_STATS_VQ 1       /* Memory stats virtqueue */
>
>   /* Size of a PFN in the balloon interface. */
>   #define VIRTIO_BALLOON_PFN_SHIFT 12
> @@ -37,4 +38,18 @@ struct virtio_balloon_config
>       uint32_t actual;
>   };
>
> +/* Memory Statistics */
> +#define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
> +#define VIRTIO_BALLOON_S_SWAP_OUT 1   /* Amount of memory swapped out */
> +#define VIRTIO_BALLOON_S_MAJFLT   2   /* Number of major faults */
> +#define VIRTIO_BALLOON_S_MINFLT   3   /* Number of minor faults */
> +#define VIRTIO_BALLOON_S_MEMFREE  4   /* Total amount of free memory */
> +#define VIRTIO_BALLOON_S_MEMTOT   5   /* Total amount of memory */
> +#define VIRTIO_BALLOON_S_NR       6
> +
> +typedef struct VirtIOBalloonStat {
> +    uint16_t tag;
> +    uint64_t val;
> +} __attribute__((packed)) VirtIOBalloonStat;
> +
>   #endif
> diff --git a/monitor.c b/monitor.c
> index 58cd02c..43a88fd 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -2154,33 +2154,13 @@ static void do_info_status(Monitor *mon, QObject **ret_data)
>                                       vm_running, singlestep);
>   }
>
> -static ram_addr_t balloon_get_value(void)
> +static void print_balloon_stat(const char *key, QObject *obj, void *opaque)
>   {
> -    ram_addr_t actual;
> -
> -    if (kvm_enabled() && !kvm_has_sync_mmu()) {
> -        qemu_error_new(QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
> -        return 0;
> -    }
> -
> -    actual = qemu_balloon_status();
> -    if (actual == 0) {
> -        qemu_error_new(QERR_DEVICE_NOT_ACTIVE, "balloon");
> -        return 0;
> -    }
> -
> -    return actual;
> -}
> +    Monitor *mon = opaque;
>
> -/**
> - * do_balloon(): Request VM to change its memory allocation
> - */
> -static void do_balloon(Monitor *mon, const QDict *qdict, QObject **ret_data)
> -{
> -    if (balloon_get_value()) {
> -        /* ballooning is active */
> -        qemu_balloon(qdict_get_int(qdict, "value"));
> -    }
> +    if (strcmp(key, "actual"))
> +        monitor_printf(mon, ",%s=%" PRId64, key,
> +                       qint_get_int(qobject_to_qint(obj)));
>   }
>
>   static void monitor_print_balloon(Monitor *mon, const QObject *data)
> @@ -2188,31 +2168,74 @@ static void monitor_print_balloon(Monitor *mon, const QObject *data)
>       QDict *qdict;
>
>       qdict = qobject_to_qdict(data);
> +    if (!qdict_haskey(qdict, "actual"))
> +        return;
>
> -    monitor_printf(mon, "balloon: actual=%" PRId64 "\n",
> -                        qdict_get_int(qdict, "balloon") >> 20);
> +    monitor_printf(mon, "balloon: actual=%" PRId64,
> +                   qdict_get_int(qdict, "actual") >> 20);
> +    qdict_iter(qdict, print_balloon_stat, mon);
> +    monitor_printf(mon, "\n");
>   }
>
>   /**
>    * do_info_balloon(): Balloon information
>    *
> - * Return a QDict with the following information:
> + * Make an asynchronous request for balloon info.  When the request completes
> + * a QDict will be returned according to the following specification:
>    *
> - * - "balloon": current balloon value in bytes
> + * - "actual": current balloon value in bytes
> + * The following fields may or may not be present:
> + * - "mem_swapped_in": Amount of memory swapped in (bytes)
> + * - "mem_swapped_out": Amount of memory swapped out (bytes)
> + * - "major_page_faults": Number of major faults
> + * - "minor_page_faults": Number of minor faults
> + * - "free_mem": Total amount of free and unused memory (bytes)
> + * - "total_mem": Total amount of available memory (bytes)
>    *
>    * Example:
>    *
> - * { "balloon": 1073741824 }
> + * { "actual": 1073741824, "mem_swapped_in": 0, "mem_swapped_out": 0,
> + *   "major_page_faults": 142, "minor_page_faults": 239245,
> + *   "free_mem": 1014185984, "total_mem": 1044668416 }
> + */
> +static int do_info_balloon(Monitor *mon, MonitorCompletion cb, void *opaque)
> +{
> +    int ret;
> +
> +    if (kvm_enabled() && !kvm_has_sync_mmu()) {
> +        qemu_error_new(QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
> +        return -1;
> +    }
> +
> +    ret = qemu_balloon_status(cb, opaque);
> +    if (!ret) {
> +        qemu_error_new(QERR_DEVICE_NOT_ACTIVE, "balloon");
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +/**
> + * do_balloon(): Request VM to change its memory allocation
>    */
> -static void do_info_balloon(Monitor *mon, QObject **ret_data)
> +static int do_balloon(Monitor *mon, const QDict *params,
> +                       MonitorCompletion cb, void *opaque)
>   {
> -    ram_addr_t actual;
> +    int ret;
> +
> +    if (kvm_enabled() && !kvm_has_sync_mmu()) {
> +        qemu_error_new(QERR_KVM_MISSING_CAP, "synchronous MMU", "balloon");
> +        return -1;
> +    }
>
> -    actual = balloon_get_value();
> -    if (actual != 0) {
> -        *ret_data = qobject_from_jsonf("{ 'balloon': %" PRId64 "}",
> -                                       (int64_t) actual);
> +    ret = qemu_balloon(qdict_get_int(params, "value"), cb, opaque);
> +    if (ret == 0) {
> +        qemu_error_new(QERR_DEVICE_NOT_ACTIVE, "balloon");
> +        return -1;
>       }
> +
> +    return 0;
>   }
>
>   static qemu_acl *find_acl(Monitor *mon, const char *name)
> @@ -2697,7 +2720,8 @@ static const mon_cmd_t info_cmds[] = {
>           .params     = "",
>           .help       = "show balloon information",
>           .user_print = monitor_print_balloon,
> -        .mhandler.info_new = do_info_balloon,
> +        .mhandler.info_async = do_info_balloon,
> +        .async      = 1,
>       },
>       {
>           .name       = "qtree",
> diff --git a/qemu-monitor.hx b/qemu-monitor.hx
> index 1aa7818..62e6e81 100644
> --- a/qemu-monitor.hx
> +++ b/qemu-monitor.hx
> @@ -890,7 +890,8 @@ ETEXI
>           .params     = "target",
>           .help       = "request VM to change it's memory allocation (in MB)",
>           .user_print = monitor_user_noop,
> -        .mhandler.cmd_new = do_balloon,
> +        .mhandler.cmd_async = do_balloon,
> +        .async      = 1,
>       },
>
>   STEXI
> diff --git a/vl.c b/vl.c
> index e070ec9..5c5b991 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -364,17 +364,24 @@ void qemu_add_balloon_handler(QEMUBalloonEvent *func, void *opaque)
>       qemu_balloon_event_opaque = opaque;
>   }
>
> -void qemu_balloon(ram_addr_t target)
> +int qemu_balloon(ram_addr_t target, MonitorCompletion cb, void *opaque)
>   {
> -    if (qemu_balloon_event)
> -        qemu_balloon_event(qemu_balloon_event_opaque, target);
> +    if (qemu_balloon_event) {
> +        qemu_balloon_event(qemu_balloon_event_opaque, target, cb, opaque);
> +        return 1;
> +    } else {
> +        return 0;
> +    }
>   }
>
> -ram_addr_t qemu_balloon_status(void)
> +int qemu_balloon_status(MonitorCompletion cb, void *opaque)
>   {
> -    if (qemu_balloon_event)
> -        return qemu_balloon_event(qemu_balloon_event_opaque, 0);
> -    return 0;
> +    if (qemu_balloon_event) {
> +        qemu_balloon_event(qemu_balloon_event_opaque, 0, cb, opaque);
> +        return 1;
> +    } else {
> +        return 0;
> +    }
>   }
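
For readers who want to see the stats format in isolation: the quoted
patch receives statistics as a sequence of packed (16-bit tag, 64-bit
value) records and resets every slot before parsing, so that a guest
reporting only a subset leaves no stale values behind.  Below is a
minimal standalone sketch of that idea, not the code from the patch.
The names parse_stats, STAT_UNSET and STAT_RECORD_SIZE are made up for
illustration, the input is assumed to be one flat buffer rather than a
scatter-gather list, and the buffer is assumed to already be in host
byte order (the patch converts with tswap16()/tswap64() instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag numbering copied from the hw/virtio-balloon.h hunk of the patch. */
enum {
    STAT_SWAP_IN, STAT_SWAP_OUT, STAT_MAJFLT, STAT_MINFLT,
    STAT_MEMFREE, STAT_MEMTOT, STAT_NR
};

/* Hypothetical sentinel for "not reported by the guest".  The patch stores
 * -1 in a uint64_t, which is the same bit pattern. */
#define STAT_UNSET UINT64_MAX

/* One record: 16-bit tag followed by a 64-bit value, no padding. */
#define STAT_RECORD_SIZE (sizeof(uint16_t) + sizeof(uint64_t))

/*
 * parse_stats (hypothetical helper): fill stats[] from a flat buffer of
 * records and return the number of bytes consumed.  Every slot is reset
 * first; records with an unknown tag are skipped; a trailing partial
 * record is ignored.
 */
size_t parse_stats(const uint8_t *buf, size_t len, uint64_t stats[STAT_NR])
{
    size_t offset = 0;
    int i;

    assert(buf != NULL || len == 0);

    for (i = 0; i < STAT_NR; i++) {
        stats[i] = STAT_UNSET;
    }

    while (len - offset >= STAT_RECORD_SIZE) {
        uint16_t tag;
        uint64_t val;

        /* memcpy avoids unaligned reads from the packed layout. */
        memcpy(&tag, buf + offset, sizeof(tag));
        memcpy(&val, buf + offset + sizeof(tag), sizeof(val));
        offset += STAT_RECORD_SIZE;

        if (tag < STAT_NR) {
            stats[tag] = val;
        }
    }
    return offset;
}
```

A caller would then treat any slot still equal to STAT_UNSET as absent,
which is what stat_put() in the patch does when it skips values equal
to -1.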

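The quoted patch allows only one outstanding stats request at a time: it
remembers a single callback/data pair and drops a new request while one
is pending.  A rough standalone sketch of such a single-slot completion
scheme follows.  It is an illustration under assumptions, not the patch's
code: completion_fn stands in for MonitorCompletion, all names are
hypothetical, the slot is considered busy when the callback is non-NULL,
and, unlike the patch, a rejected request is reported to the caller
through the return value.

```c
#include <stddef.h>

/* Hypothetical stand-in for the patch's MonitorCompletion callback type. */
typedef void (*completion_fn)(void *opaque, long result);

/* Holds at most one outstanding request. */
struct pending_request {
    completion_fn cb;
    void *opaque;
};

/* Register a request.  Returns 1 if accepted, 0 if one is already pending. */
int request_begin(struct pending_request *p, completion_fn cb, void *opaque)
{
    if (p->cb != NULL) {
        return 0;
    }
    p->cb = cb;
    p->opaque = opaque;
    return 1;
}

/* Deliver the result to the waiting caller, if any, and free the slot. */
void request_complete(struct pending_request *p, long result)
{
    completion_fn cb = p->cb;
    void *opaque = p->opaque;

    if (cb == NULL) {
        return;
    }
    /* Clear the slot before calling out, so the callback itself may
     * start a new request. */
    p->cb = NULL;
    p->opaque = NULL;
    cb(opaque, result);
}

/* Example callback: store the result in the long that opaque points to. */
void store_result(void *opaque, long result)
{
    *(long *)opaque = result;
}
```

Replacing the single slot with a queue of callback/data pairs would lift
the one-request restriction, as the comment in the patch suggests.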