* [Qemu-devel] [RFC 0/3] auto-ballooning prototype (host part)
@ 2012-12-18 20:16 Luiz Capitulino
  2012-12-18 20:16 ` [Qemu-devel] [RFC 1/3] virtio-balloon: add guest_get_actual_ram() Luiz Capitulino
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Luiz Capitulino @ 2012-12-18 20:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: aquini, mst, anton.vorontsov, agl, amit.shah

Hi,

This series implements an early prototype of a new feature called
automatic ballooning. This is based on ideas by Rik van Riel and I also got
some help from Rafael Aquini (misconceptions and bugs are all mine, though).

The auto-ballooning feature automatically performs balloon inflate and
deflate based on host and guest memory pressure. This can help to avoid
swapping or worse in both host and guest.

This series implements the host part. Full details on the design and
implementation can be found in patch 3/3.

To test this you will also need the guest part, which I'll post shortly.

Any feedback is appreciated!

Luiz Capitulino (3):
  virtio-balloon: add guest_get_actual_ram()
  virtio-balloon: add virtio_balloon_conf skeleton
  virtio-balloon: add auto-ballooning support

 hw/virtio-balloon.c | 166 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/virtio-balloon.h |   8 +++
 hw/virtio-pci.c     |   7 ++-
 hw/virtio-pci.h     |   2 +
 hw/virtio.h         |   3 +-
 5 files changed, 181 insertions(+), 5 deletions(-)

-- 
1.8.0

^ permalink raw reply	[flat|nested] 10+ messages in thread
* [Qemu-devel] [RFC 1/3] virtio-balloon: add guest_get_actual_ram()
From: Luiz Capitulino @ 2012-12-18 20:16 UTC
  To: qemu-devel; +Cc: aquini, mst, anton.vorontsov, agl, amit.shah

A future commit will also want to use this.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 hw/virtio-balloon.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
index dd1a650..03248df 100644
--- a/hw/virtio-balloon.c
+++ b/hw/virtio-balloon.c
@@ -161,6 +161,11 @@ static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f)
     return f;
 }
 
+static ram_addr_t guest_get_actual_ram(const VirtIOBalloon *s)
+{
+    return ram_size - ((uint64_t) s->actual << VIRTIO_BALLOON_PFN_SHIFT);
+}
+
 static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
 {
     VirtIOBalloon *dev = opaque;
@@ -186,8 +191,7 @@ static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
      */
     reset_stats(dev);
 
-    info->actual = ram_size - ((uint64_t) dev->actual <<
-                               VIRTIO_BALLOON_PFN_SHIFT);
+    info->actual = guest_get_actual_ram(dev);
 }
 
 static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
-- 
1.8.0
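The helper factored out in patch 1/3 converts the balloon size, which the device tracks in 4 KiB pages, into bytes and subtracts it from the guest's configured RAM. A minimal standalone sketch of that arithmetic follows. It assumes plain uint64_t in place of QEMU's ram_addr_t, and the names actual_ram_bytes and BALLOON_PFN_SHIFT are hypothetical stand-ins, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as VIRTIO_BALLOON_PFN_SHIFT in hw/virtio-balloon.h: 4 KiB pages. */
#define BALLOON_PFN_SHIFT 12

/*
 * Hypothetical stand-in for guest_get_actual_ram(): ram_size is the guest's
 * configured RAM in bytes, balloon_pages is the number of pages the guest
 * reports as currently held by the balloon.
 */
static uint64_t actual_ram_bytes(uint64_t ram_size, uint32_t balloon_pages)
{
    /* Widen before shifting so large balloons do not overflow 32 bits. */
    uint64_t ballooned = (uint64_t)balloon_pages << BALLOON_PFN_SHIFT;

    assert(ballooned <= ram_size);   /* the balloon cannot exceed guest RAM */
    return ram_size - ballooned;
}
```

For a 1 GiB guest with 256 pages in the balloon, this yields 1 GiB minus 1 MiB.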
* [Qemu-devel] [RFC 2/3] virtio-balloon: add virtio_balloon_conf skeleton
From: Luiz Capitulino @ 2012-12-18 20:16 UTC
  To: qemu-devel; +Cc: aquini, mst, anton.vorontsov, agl, amit.shah

Next commit wants it.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 hw/virtio-balloon.c | 2 +-
 hw/virtio-balloon.h | 4 ++++
 hw/virtio-pci.c     | 2 +-
 hw/virtio-pci.h     | 2 ++
 hw/virtio.h         | 3 ++-
 5 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
index 03248df..97d49b1 100644
--- a/hw/virtio-balloon.c
+++ b/hw/virtio-balloon.c
@@ -235,7 +235,7 @@ static int virtio_balloon_load(QEMUFile *f, void *opaque, int version_id)
     return 0;
 }
 
-VirtIODevice *virtio_balloon_init(DeviceState *dev)
+VirtIODevice *virtio_balloon_init(DeviceState *dev, virtio_balloon_conf *conf)
 {
     VirtIOBalloon *s;
     int ret;
diff --git a/hw/virtio-balloon.h b/hw/virtio-balloon.h
index 73300dd..9d631d5 100644
--- a/hw/virtio-balloon.h
+++ b/hw/virtio-balloon.h
@@ -38,6 +38,10 @@ struct virtio_balloon_config
     uint32_t actual;
 };
 
+typedef struct virtio_balloon_conf
+{
+} virtio_balloon_conf;
+
 /* Memory Statistics */
 #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
 #define VIRTIO_BALLOON_S_SWAP_OUT 1   /* Amount of memory swapped out */
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 7684ac9..026222b 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -835,7 +835,7 @@ static int virtio_balloon_init_pci(PCIDevice *pci_dev)
         proxy->class_code = PCI_CLASS_OTHERS;
     }
 
-    vdev = virtio_balloon_init(&pci_dev->qdev);
+    vdev = virtio_balloon_init(&pci_dev->qdev, &proxy->balloon);
     if (!vdev) {
         return -1;
     }
diff --git a/hw/virtio-pci.h b/hw/virtio-pci.h
index b58d9a2..3e4ca0d 100644
--- a/hw/virtio-pci.h
+++ b/hw/virtio-pci.h
@@ -20,6 +20,7 @@
 #include "virtio-rng.h"
 #include "virtio-serial.h"
 #include "virtio-scsi.h"
+#include "virtio-balloon.h"
 
 /* Performance improves when virtqueue kick processing is decoupled from the
  * vcpu thread using ioeventfd for some devices. */
@@ -46,6 +47,7 @@ typedef struct {
 #endif
     virtio_serial_conf serial;
     virtio_net_conf net;
+    virtio_balloon_conf balloon;
     VirtIOSCSIConf scsi;
     VirtIORNGConf rng;
     bool ioeventfd_disabled;
diff --git a/hw/virtio.h b/hw/virtio.h
index 7c17f7b..9a85a41 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -201,7 +201,8 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf,
                               struct virtio_net_conf *net);
 typedef struct virtio_serial_conf virtio_serial_conf;
 VirtIODevice *virtio_serial_init(DeviceState *dev, virtio_serial_conf *serial);
-VirtIODevice *virtio_balloon_init(DeviceState *dev);
+typedef struct virtio_balloon_conf virtio_balloon_conf;
+VirtIODevice *virtio_balloon_init(DeviceState *dev, virtio_balloon_conf *conf);
 typedef struct VirtIOSCSIConf VirtIOSCSIConf;
 VirtIODevice *virtio_scsi_init(DeviceState *dev, VirtIOSCSIConf *conf);
 typedef struct VirtIORNGConf VirtIORNGConf;
-- 
1.8.0
* [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support
From: Luiz Capitulino @ 2012-12-18 20:16 UTC
  To: qemu-devel; +Cc: aquini, mst, anton.vorontsov, agl, amit.shah

The auto-ballooning feature automatically performs balloon inflate
or deflate based on host and guest memory pressure. This can help to
avoid swapping or worse in both host and guest.

Auto-ballooning has a host and a guest part. The host performs
automatic inflate by requesting the guest to inflate its balloon
when the host is facing memory pressure. The guest performs
automatic deflate when it's facing memory pressure itself. It's
expected that auto-inflate and auto-deflate will balance each
other over time.

This commit implements the host side of auto-ballooning.

To be notified of host memory pressure, this commit makes use of this
kernel API proposal being discussed upstream:

 http://marc.info/?l=linux-mm&m=135513372205134&w=2

Three new properties are added to the virtio-balloon device to activate
auto-ballooning:

 o auto-balloon-mempressure-path: this is the path for the kernel's
   mempressure cgroup notification dir, which must already be mounted
   (see link above for details on this)

 o auto-balloon-level: the memory pressure level to trigger auto-balloon.
   Valid values are:

    - low: the kernel is reclaiming memory for new allocations
    - medium: some swapping activity has already started
    - oom: the kernel will start playing Russian roulette real soon

 o auto-balloon-granularity: percentage of current guest memory by which
   the balloon should be inflated. For example, a value of 1 corresponds
   to 1% which means that a guest with 1G of memory will get its balloon
   inflated to 10485K.

To test this, you need a kernel with the mempressure API patch applied and
the guest side of auto-ballooning.

Then the feature can be enabled like:

 qemu [...] \
   -balloon virtio,auto-balloon-mempressure-path=/sys/fs/cgroup/mempressure/,auto-balloon-level=low,auto-balloon-granularity=1

FIXMEs:

 o rate-limit the event? Can receive several in a row
 o add auto-balloon-maximum to limit the inflate?
 o this shouldn't override balloon changes done by the user manually

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 hw/virtio-balloon.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/virtio-balloon.h |   4 ++
 hw/virtio-pci.c     |   5 ++
 3 files changed, 165 insertions(+)

diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
index 97d49b1..40a97e7 100644
--- a/hw/virtio-balloon.c
+++ b/hw/virtio-balloon.c
@@ -37,6 +37,13 @@ typedef struct VirtIOBalloon
     VirtQueueElement stats_vq_elem;
     size_t stats_vq_offset;
     DeviceState *qdev;
+
+    /* auto-balloon */
+    bool auto_balloon_enabled;
+    int cfd;
+    int lfd;
+    float granularity;
+    EventNotifier mempressure_ev;
 } VirtIOBalloon;
 
 static VirtIOBalloon *to_virtio_balloon(VirtIODevice *vdev)
@@ -157,7 +164,14 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
 
 static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f)
 {
+    VirtIOBalloon *s = to_virtio_balloon(vdev);
+
     f |= (1 << VIRTIO_BALLOON_F_STATS_VQ);
+
+    if (s->auto_balloon_enabled) {
+        f |= (1 << VIRTIO_BALLOON_F_AUTO_BALLOON);
+    }
+
     return f;
 }
 
@@ -166,6 +180,11 @@ static ram_addr_t guest_get_actual_ram(const VirtIOBalloon *s)
     return ram_size - ((uint64_t) s->actual << VIRTIO_BALLOON_PFN_SHIFT);
 }
 
+static bool guest_supports_auto_balloon(const VirtIOBalloon *s)
+{
+    return s->vdev.guest_features & (1 << VIRTIO_BALLOON_F_AUTO_BALLOON);
+}
+
 static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
 {
     VirtIOBalloon *dev = opaque;
@@ -235,6 +254,133 @@ static int virtio_balloon_load(QEMUFile *f, void *opaque, int version_id)
     return 0;
 }
 
+static int open_sysfile(const char *path, const char *file, mode_t mode)
+{
+    char *p;
+    int fd;
+
+    p = g_strjoin("/", path, file, NULL);
+    fd = qemu_open(p, mode);
+    if (fd < 0) {
+        error_report("balloon: can't open '%s': %s", p, strerror(errno));
+    }
+
+    g_free(p);
+    return fd;
+}
+
+static int balloon_ack_event(EventNotifier *ev)
+{
+    uint64_t res;
+    int ret, fd;
+
+    fd = event_notifier_get_fd(ev);
+
+    do {
+        ret = read(fd, &res, sizeof(res));
+    } while (ret == -1 && errno == EINTR);
+
+    return ret;
+}
+
+static void host_mempressure_cleanup(VirtIOBalloon *s);
+
+static void host_mempressure_cb(EventNotifier *ev)
+{
+    VirtIOBalloon *s = container_of(ev, VirtIOBalloon, mempressure_ev);
+    ram_addr_t target;
+    int ret;
+
+    ret = balloon_ack_event(&s->mempressure_ev);
+    if (ret < 0) {
+        fprintf(stderr, "balloon: failed to ack event: %s\n", strerror(errno));
+        return;
+    }
+
+    if (!guest_supports_auto_balloon(s)) {
+        fprintf(stderr,
+            "balloon: oops guest doesn't support auto-ballooning, disabling..\n");
+        host_mempressure_cleanup(s);
+        return;
+    }
+
+    target = guest_get_actual_ram(s) -
+                                (guest_get_actual_ram(s) * s->granularity);
+    virtio_balloon_to_target(s, target);
+}
+
+static int host_mempressure_init(VirtIOBalloon *s,
+                                 const virtio_balloon_conf *conf)
+{
+    char *line;
+    int ret, fd;
+
+    if (!conf->path || !conf->level) {
+        error_report("balloon: mempressure path or level missing");
+        return -1;
+    }
+
+    if (conf->granularity > 100) {
+        error_report("balloon: invalid granularity value (should be 0..100)");
+        return -1;
+    }
+
+    s->lfd = open_sysfile(conf->path, "mempressure.level", O_RDONLY);
+    if (s->lfd < 0) {
+        return -1;
+    }
+
+    s->cfd = open_sysfile(conf->path, "cgroup.event_control", O_WRONLY);
+    if (s->cfd < 0) {
+        close(s->lfd);
+        return -1;
+    }
+
+    ret = event_notifier_init(&s->mempressure_ev, false);
+    if (ret < 0) {
+        error_report("failed to create notifier: %s", strerror(-ret));
+        goto out_err;
+    }
+
+    fd = event_notifier_get_fd(&s->mempressure_ev);
+    line = g_strdup_printf("%d %d %s", fd, s->lfd, conf->level);
+
+    do {
+        ret = write(s->cfd, line, strlen(line));
+    } while (ret < 0 && errno == EINTR);
+
+    if (ret < 0) {
+        error_report("balloon: write failed: %s", strerror(errno));
+        g_free(line);
+        goto out_ev;
+    }
+
+    g_free(line);
+
+    s->auto_balloon_enabled = true;
+    s->granularity = conf->granularity / 100.0;
+    event_notifier_set_handler(&s->mempressure_ev, host_mempressure_cb);
+
+    return 0;
+
+out_ev:
+    event_notifier_cleanup(&s->mempressure_ev);
+out_err:
+    close(s->lfd);
+    close(s->cfd);
+    return -1;
+}
+
+static void host_mempressure_cleanup(VirtIOBalloon *s)
+{
+    if (s->auto_balloon_enabled) {
+        close(s->lfd);
+        close(s->cfd);
+        event_notifier_cleanup(&s->mempressure_ev);
+        s->auto_balloon_enabled = false;
+    }
+}
+
 VirtIODevice *virtio_balloon_init(DeviceState *dev, virtio_balloon_conf *conf)
 {
     VirtIOBalloon *s;
@@ -248,9 +394,18 @@ VirtIODevice *virtio_balloon_init(DeviceState *dev, virtio_balloon_conf *conf)
     s->vdev.set_config = virtio_balloon_set_config;
     s->vdev.get_features = virtio_balloon_get_features;
 
+    if (conf->path || conf->level || conf->granularity > 0) {
+        ret = host_mempressure_init(s, conf);
+        if (ret < 0) {
+            virtio_cleanup(&s->vdev);
+            return NULL;
+        }
+    }
+
     ret = qemu_add_balloon_handler(virtio_balloon_to_target,
                                    virtio_balloon_stat, s);
     if (ret < 0) {
+        host_mempressure_cleanup(s);
         virtio_cleanup(&s->vdev);
         return NULL;
     }
@@ -273,6 +428,7 @@ void virtio_balloon_exit(VirtIODevice *vdev)
     VirtIOBalloon *s = DO_UPCAST(VirtIOBalloon, vdev, vdev);
 
     qemu_remove_balloon_handler(s);
+    host_mempressure_cleanup(s);
     unregister_savevm(s->qdev, "virtio-balloon", s);
     virtio_cleanup(vdev);
 }
diff --git a/hw/virtio-balloon.h b/hw/virtio-balloon.h
index 9d631d5..fcf0e3c 100644
--- a/hw/virtio-balloon.h
+++ b/hw/virtio-balloon.h
@@ -26,6 +26,7 @@
 /* The feature bitmap for virtio balloon */
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ 1       /* Memory stats virtqueue */
+#define VIRTIO_BALLOON_F_AUTO_BALLOON 2   /* Automatic ballooning */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -40,6 +41,9 @@ struct virtio_balloon_config
 
 typedef struct virtio_balloon_conf
 {
+    char *path;
+    char *level;
+    uint32_t granularity;
 } virtio_balloon_conf;
 
 /* Memory Statistics */
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 026222b..487b7f2 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -991,6 +991,11 @@ static TypeInfo virtio_serial_info = {
 static Property virtio_balloon_properties[] = {
     DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
     DEFINE_PROP_HEX32("class", VirtIOPCIProxy, class_code, 0),
+#ifdef __linux__
+    DEFINE_PROP_STRING("auto-balloon-mempressure-path", VirtIOPCIProxy, balloon.path),
+    DEFINE_PROP_STRING("auto-balloon-level", VirtIOPCIProxy, balloon.level),
+    DEFINE_PROP_UINT32("auto-balloon-granularity", VirtIOPCIProxy, balloon.granularity, 0),
+#endif
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
1.8.0
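The target computed in host_mempressure_cb() in patch 3/3 is the guest's current memory minus auto-balloon-granularity percent of it. The patch does this through a float; the sketch below uses integer arithmetic instead, which is a choice made for this example and not what the patch does. The name inflate_target and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: new balloon target (guest memory in bytes) after one
 * auto-inflate step. actual is the guest's current memory in bytes and
 * granularity_pct is the auto-balloon-granularity property (0..100).
 */
static uint64_t inflate_target(uint64_t actual, uint32_t granularity_pct)
{
    uint64_t step;

    assert(granularity_pct <= 100);  /* mirrors the range check in the patch */

    /* Split the division so actual * pct cannot overflow for large guests. */
    step = actual / 100 * granularity_pct
         + actual % 100 * granularity_pct / 100;
    return actual - step;
}
```

With a 1 GiB guest and a granularity of 1, the step is 10737418 bytes, which is the 10485K figure from the commit message.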
* Re: [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support
From: Anton Vorontsov @ 2012-12-18 22:53 UTC
  To: Luiz Capitulino
  Cc: Michal Hocko, aquini, mst, linux-kernel, David Rientjes,
      qemu-devel, Glauber Costa, Pekka Enberg, linux-mm, John Stultz,
      Mel Gorman, agl, amit.shah, kirill, Andrew Morton

Hello Luiz,

On Tue, Dec 18, 2012 at 06:16:55PM -0200, Luiz Capitulino wrote:
> The auto-ballooning feature automatically performs balloon inflate
> or deflate based on host and guest memory pressure. This can help to
> avoid swapping or worse in both, host and guest.
>
> Auto-ballooning has a host and a guest part. The host performs
> automatic inflate by requesting the guest to inflate its balloon
> when the host is facing memory pressure. The guest performs
> automatic deflate when it's facing memory pressure itself. It's
> expected that auto-inflate and auto-deflate will balance each
> other over time.
>
> This commit implements the host side of auto-ballooning.
>
> To be notified of host memory pressure, this commit makes use of this
> kernel API proposal being discussed upstream:
>
>  http://marc.info/?l=linux-mm&m=135513372205134&w=2

Wow, you're fast! And I'm glad that it works for you, so we have two
full-featured mempressure cgroup users already.

Even though it is a qemu patch, I think we should Cc linux-mm folks on it,
just to let them know the great news.

Thanks!
* Re: [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support
From: Luiz Capitulino @ 2012-12-19 11:30 UTC
  To: Anton Vorontsov
  Cc: Michal Hocko, aquini, mst, linux-kernel, David Rientjes,
      qemu-devel, Glauber Costa, Pekka Enberg, linux-mm, John Stultz,
      Mel Gorman, agl, amit.shah, kirill, Andrew Morton

On Tue, 18 Dec 2012 14:53:30 -0800
Anton Vorontsov <anton.vorontsov@linaro.org> wrote:

> Hello Luiz,
>
> On Tue, Dec 18, 2012 at 06:16:55PM -0200, Luiz Capitulino wrote:
> > The auto-ballooning feature automatically performs balloon inflate
> > or deflate based on host and guest memory pressure. This can help to
> > avoid swapping or worse in both, host and guest.
> >
> > Auto-ballooning has a host and a guest part. The host performs
> > automatic inflate by requesting the guest to inflate its balloon
> > when the host is facing memory pressure. The guest performs
> > automatic deflate when it's facing memory pressure itself. It's
> > expected that auto-inflate and auto-deflate will balance each
> > other over time.
> >
> > This commit implements the host side of auto-ballooning.
> >
> > To be notified of host memory pressure, this commit makes use of this
> > kernel API proposal being discussed upstream:
> >
> >  http://marc.info/?l=linux-mm&m=135513372205134&w=2
>
> Wow, you're fast! And I'm glad that it works for you, so we have two
> full-featured mempressure cgroup users already.

Thanks, although I think we need more testing to be sure this does what
we want. I mean, the basic mechanics do work, but my testing has been very
light so far.

> Even though it is a qemu patch, I think we should Cc linux-mm folks on it,
> just to let them know the great news.

I'll do it next time.
* Re: [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support
From: Dietmar Maurer @ 2012-12-20 5:24 UTC
  To: Luiz Capitulino, Anton Vorontsov
  Cc: Pekka Enberg, aquini@redhat.com, mst@redhat.com, agl@us.ibm.com,
      linux-kernel@vger.kernel.org, qemu-devel@nongnu.org, Michal Hocko,
      linux-mm@kvack.org, John Stultz, Mel Gorman, David Rientjes,
      amit.shah@redhat.com, kirill@shutemov.name, Andrew Morton,
      Glauber Costa

> > Wow, you're fast! And I'm glad that it works for you, so we have two
> > full-featured mempressure cgroup users already.
>
> Thanks, although I think we need more testing to be sure this does what we
> want. I mean, the basic mechanics does work, but my testing has been very
> light so far.

Is it possible to assign different weights for different VMs, something
like the VMware 'shares' setting?
* Re: [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support
From: Luiz Capitulino @ 2012-12-22 21:45 UTC
  To: Dietmar Maurer
  Cc: Pekka Enberg, aquini@redhat.com, mst@redhat.com, agl@us.ibm.com,
      linux-kernel@vger.kernel.org, qemu-devel@nongnu.org, Michal Hocko,
      Anton Vorontsov, John Stultz, Mel Gorman, David Rientjes,
      amit.shah@redhat.com, kirill@shutemov.name, Andrew Morton,
      Glauber Costa, linux-mm@kvack.org

On Thu, 20 Dec 2012 05:24:12 +0000
Dietmar Maurer <dietmar@proxmox.com> wrote:

> > > Wow, you're fast! And I'm glad that it works for you, so we have two
> > > full-featured mempressure cgroup users already.
> >
> > Thanks, although I think we need more testing to be sure this does what we
> > want. I mean, the basic mechanics does work, but my testing has been very
> > light so far.
>
> Is it possible to assign different weights for different VMs, something like the vmware 'shares' setting?

This series doesn't have a "weight" concept; it has auto-balloon-level
and auto-balloon-granularity. The former lets you choose at which kernel
memory pressure level (low, medium or oom) auto-inflate triggers. The
latter lets you say by how much the balloon should grow (as a percentage
of the guest's current memory). Both of them are per VM.
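Because the level is per VM, each QEMU instance registers its own level with the kernel. In patch 3/3 that registration is a single line written to cgroup.event_control, laid out as the eventfd, the mempressure.level fd and the level string. The following is a sketch of building that line with a check against the three levels named in the commit message. The RFC itself does not validate the level string, and level_is_valid and build_event_line are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Levels named in the patch 3/3 commit message. */
static const char *const valid_levels[] = { "low", "medium", "oom" };

/* Hypothetical check; the RFC passes the string to the kernel unchecked. */
static bool level_is_valid(const char *level)
{
    size_t i;

    for (i = 0; i < sizeof(valid_levels) / sizeof(valid_levels[0]); i++) {
        if (strcmp(level, valid_levels[i]) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Format "<event fd> <level fd> <level>" into buf, the same layout the patch
 * writes to cgroup.event_control. Returns 0 on success, -1 if the level is
 * unknown or buf is too small.
 */
static int build_event_line(char *buf, size_t len, int efd, int lfd,
                            const char *level)
{
    int n;

    assert(buf != NULL && level != NULL);

    if (!level_is_valid(level)) {
        return -1;
    }
    n = snprintf(buf, len, "%d %d %s", efd, lfd, level);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Rejecting an unknown level before the write would let QEMU report a precise error instead of a generic write failure, though whether that is wanted is a design question for the series.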
* Re: [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support
  2012-12-18 20:16 ` [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support Luiz Capitulino
  2012-12-18 22:53   ` Anton Vorontsov
@ 2013-01-11 20:32   ` Amit Shah
  2013-01-14 11:58     ` Luiz Capitulino
  1 sibling, 1 reply; 10+ messages in thread
From: Amit Shah @ 2013-01-11 20:32 UTC
To: Luiz Capitulino; +Cc: aquini, mst, qemu-devel, anton.vorontsov, agl

Hi Luiz,

On (Tue) 18 Dec 2012 [18:16:55], Luiz Capitulino wrote:
> The auto-ballooning feature automatically performs balloon inflate
> or deflate based on host and guest memory pressure. This can help to
> avoid swapping or worse in both, host and guest.
>
> Auto-ballooning has a host and a guest part. The host performs
> automatic inflate by requesting the guest to inflate its balloon
> when the host is facing memory pressure. The guest performs
> automatic deflate when it's facing memory pressure itself. It's
> expected that auto-inflate and auto-deflate will balance each
> other over time.

What does this last line mean?

> This commit implements the host side of auto-ballooning.
>
> To be notified of host memory pressure, this commit makes use of this
> kernel API proposal being discussed upstream:
>
>  http://marc.info/?l=linux-mm&m=135513372205134&w=2

We should wait till these patches are upstream. Also, an error
message better than "can't open file ..." to indicate a newer kernel
is needed for this feature?

> Three new properties are added to the virtio-balloon device to activate
> auto-ballooning:
>
>   o auto-balloon-mempressure-path: this is the path for the kernel's
>     mempressure cgroup notification dir, which must be already mounted
>     (see link above for details on this)
>
>   o auto-balloon-level: the memory pressure level to trigger auto-balloon.
>     Valid values are:
>
>       - low:    the kernel is reclaiming memory for new allocations
>       - medium: some swapping activity has already started
>       - oom:    the kernel will start playing russian roulette real soon
>
>   o auto-balloon-granularity: percentage of current guest memory by which
>     the balloon should be inflated. For example, a value of 1 corresponds
>     to 1% which means that a guest with 1G of memory will get its balloon
>     inflated to 10485K.

This looks good. How about emitting a QMP message to notify
management of auto-ballooning?

> To test this, you need a kernel with the mempressure API patch applied and
> the guest side of auto-ballooning.
>
> Then the feature can be enabled like:
>
>  qemu [...] \
>    -balloon virtio,auto-balloon-mempressure-path=/sys/fs/cgroup/mempressure/,auto-balloon-level=low,auto-balloon-granularity=1
>
> FIXMEs:
>
>    o rate-limit the event? Can receive several in a row

For this, I'm thinking the highest severity level should be picked to
act upon: e.g. if the following events are received in succession:

    medium
    oom
    low

Then 'oom' is the highest level, and that should be acted upon
(i.e. we shouldn't deflate the balloon on getting the 'low'
notification above). The guest can always deflate the balloon when it
needs RAM.

Repeated 'low' notifications can be ignored, if one has been acted
upon already.

>    o add auto-balloon-maximum to limit the inflate?

Yes, makes sense to add this.

>    o this shouldn't override balloon changes done by the user manually

Can you think of examples here? If the user (host admin) has
ballooned an 8G guest down to 4G, auto-balloon will only further
shrink down the guest RAM, so there's no real 'overriding' happening
that I can think of. Of course, a guest can expand itself to, say,
5G, but that should be allowed as the guest might be under pressure.

Even in such a situation, the host has control by limiting the guest
using cgroups.

> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  hw/virtio-balloon.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/virtio-balloon.h |   4 ++
>  hw/virtio-pci.c     |   5 ++
>  3 files changed, 165 insertions(+)

Patch looks fine.

		Amit
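
The rate-limiting idea in the message above (act only on the highest
severity seen in a burst, and ignore repeats of a level that was already
handled) could be sketched roughly as follows. The enum, the function name
and the last_acted parameter are all assumptions made for illustration;
none of them appear in the posted patch. The sketch also assumes the caller
resets last_acted to LEVEL_NONE once pressure has subsided, otherwise a
later event of the same level would be ignored forever.

```c
#include <assert.h>
#include <stddef.h>

/* Pressure levels named in the patch description, ordered by severity. */
enum pressure_level { LEVEL_NONE = 0, LEVEL_LOW, LEVEL_MEDIUM, LEVEL_OOM };

/*
 * Hypothetical helper: given a burst of events received in a row, return
 * the single level to act upon, or LEVEL_NONE if nothing new needs doing.
 *
 * last_acted: the level already acted upon (LEVEL_NONE if none yet)
 */
static enum pressure_level coalesce_events(const enum pressure_level *events,
                                           size_t n,
                                           enum pressure_level last_acted)
{
    enum pressure_level highest = LEVEL_NONE;

    assert(events != NULL || n == 0);

    /* Find the most severe level in the burst. */
    for (size_t i = 0; i < n; i++) {
        if (events[i] > highest) {
            highest = events[i];
        }
    }

    /* A repeat of the level already handled, or anything milder than it,
     * is ignored. */
    return highest > last_acted ? highest : LEVEL_NONE;
}
```

For the burst medium, oom, low with nothing handled yet, this returns the
oom level; for two 'low' events after 'low' was already handled, it returns
LEVEL_NONE.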
* Re: [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support
  2013-01-11 20:32 ` Amit Shah
@ 2013-01-14 11:58 ` Luiz Capitulino
  0 siblings, 0 replies; 10+ messages in thread
From: Luiz Capitulino @ 2013-01-14 11:58 UTC
To: Amit Shah; +Cc: aquini, mst, qemu-devel, anton.vorontsov, agl

On Sat, 12 Jan 2013 02:02:32 +0530
Amit Shah <amit.shah@redhat.com> wrote:

> Hi Luiz,
>
> On (Tue) 18 Dec 2012 [18:16:55], Luiz Capitulino wrote:
> > The auto-ballooning feature automatically performs balloon inflate
> > or deflate based on host and guest memory pressure. This can help to
> > avoid swapping or worse in both, host and guest.
> >
> > Auto-ballooning has a host and a guest part. The host performs
> > automatic inflate by requesting the guest to inflate its balloon
> > when the host is facing memory pressure. The guest performs
> > automatic deflate when it's facing memory pressure itself. It's
> > expected that auto-inflate and auto-deflate will balance each
> > other over time.
>
> What does this last line mean?

When qemu does auto-inflate, the guest memory will be reduced. Then it's
expected that something will increase it again. That something is
auto-deflate. However, if we deflate too much, then the host may face
memory pressure, and auto-inflate will take place again. It's expected
that this sequence of auto-inflate and auto-deflate will reach a balance
at some point.

> > This commit implements the host side of auto-ballooning.
> >
> > To be notified of host memory pressure, this commit makes use of this
> > kernel API proposal being discussed upstream:
> >
> >  http://marc.info/?l=linux-mm&m=135513372205134&w=2
>
> We should wait till these patches are upstream.

Right.

> Also, an error
> message better than "can't open file ..." to indicate a newer kernel
> is needed for this feature?

Seems like a good idea.

> > Three new properties are added to the virtio-balloon device to activate
> > auto-ballooning:
> >
> >   o auto-balloon-mempressure-path: this is the path for the kernel's
> >     mempressure cgroup notification dir, which must be already mounted
> >     (see link above for details on this)
> >
> >   o auto-balloon-level: the memory pressure level to trigger auto-balloon.
> >     Valid values are:
> >
> >       - low:    the kernel is reclaiming memory for new allocations
> >       - medium: some swapping activity has already started
> >       - oom:    the kernel will start playing russian roulette real soon
> >
> >   o auto-balloon-granularity: percentage of current guest memory by which
> >     the balloon should be inflated. For example, a value of 1 corresponds
> >     to 1% which means that a guest with 1G of memory will get its balloon
> >     inflated to 10485K.
>
> This looks good.

Actually, for the next version I'll try the user-space shrinker API.

> How about emitting a QMP message to notify
> management of auto-ballooning?

I could think about that if they are interested.

> > To test this, you need a kernel with the mempressure API patch applied and
> > the guest side of auto-ballooning.
> >
> > Then the feature can be enabled like:
> >
> >  qemu [...] \
> >    -balloon virtio,auto-balloon-mempressure-path=/sys/fs/cgroup/mempressure/,auto-balloon-level=low,auto-balloon-granularity=1
> >
> > FIXMEs:
> >
> >    o rate-limit the event? Can receive several in a row
>
> For this, I'm thinking the highest severity level should be picked to
> act upon: e.g. if the following events are received in succession:
>
>     medium
>     oom
>     low
>
> Then 'oom' is the highest level, and that should be acted upon
> (i.e. we shouldn't deflate the balloon on getting the 'low'
> notification above). The guest can always deflate the balloon when it
> needs RAM.
>
> Repeated 'low' notifications can be ignored, if one has been acted
> upon already.

Makes sense. Although, as I said above, I'll try the user-space shrinker
API for the next version.

> >    o add auto-balloon-maximum to limit the inflate?
>
> Yes, makes sense to add this.
>
> >    o this shouldn't override balloon changes done by the user manually
>
> Can you think of examples here? If the user (host admin) has
> ballooned an 8G guest down to 4G, auto-balloon will only further
> shrink down the guest RAM, so there's no real 'overriding' happening
> that I can think of. Of course, a guest can expand itself to, say,
> 5G, but that should be allowed as the guest might be under pressure.

Yes, that's exactly my point above. I mean, taking your example, if the
user has ballooned an 8G guest down to 4G, should auto-balloon be allowed
to balloon it to 5G or even back to 8G?

> Even in such a situation, the host has control by limiting the guest
> using cgroups.
>
> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> > ---
> >  hw/virtio-balloon.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/virtio-balloon.h |   4 ++
> >  hw/virtio-pci.c     |   5 ++
> >  3 files changed, 165 insertions(+)
>
> Patch looks fine.
>
> 		Amit
>
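
The open question above (may auto-deflate undo a manual balloon change, and
should auto-inflate have an upper limit?) has no settled answer in this
thread. Purely as an illustration, the stricter of the two policies could be
sketched as a clamp on the balloon size. Everything here is hypothetical:
user_floor stands for the balloon size last set manually by the user, and
auto_max stands in for the proposed auto-balloon-maximum property. The
alternative argued for in the thread, letting the guest deflate freely,
would simply use a floor of 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical policy helper, not part of the posted patch. All values are
 * balloon sizes in bytes, i.e. memory taken away from the guest.
 *
 * requested:  balloon size asked for by auto-inflate or auto-deflate
 * user_floor: balloon size set manually; auto-deflate may not go below it
 * auto_max:   largest balloon auto-inflate may create
 */
static uint64_t clamp_balloon(uint64_t requested,
                              uint64_t user_floor,
                              uint64_t auto_max)
{
    /* The caller is assumed to keep the two limits consistent. */
    assert(user_floor <= auto_max);

    if (requested < user_floor) {
        return user_floor;   /* do not override the manual setting */
    }
    if (requested > auto_max) {
        return auto_max;     /* do not inflate past the configured cap */
    }
    return requested;
}
```

Taking the 8G guest ballooned down to 4G as the example: with a 4G floor, a
guest request to shrink the balloon to 3G (that is, grow to 5G of RAM) would
be held at 4G under this policy.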
Thread overview: 10+ messages
2012-12-18 20:16 [Qemu-devel] [RFC 0/3] auto-ballooning prototype (host part) Luiz Capitulino
2012-12-18 20:16 ` [Qemu-devel] [RFC 1/3] virtio-balloon: add guest_get_actual_ram() Luiz Capitulino
2012-12-18 20:16 ` [Qemu-devel] [RFC 2/3] virtio-balloon: add virtio_balloon_conf skeleton Luiz Capitulino
2012-12-18 20:16 ` [Qemu-devel] [RFC 3/3] virtio-balloon: add auto-ballooning support Luiz Capitulino
2012-12-18 22:53   ` Anton Vorontsov
2012-12-19 11:30     ` Luiz Capitulino
2012-12-20  5:24       ` Dietmar Maurer
2012-12-22 21:45         ` Luiz Capitulino
2013-01-11 20:32   ` Amit Shah
2013-01-14 11:58     ` Luiz Capitulino