* [PATCH v18 QEMU 0/3] virtio-balloon: add support for providing free page reporting @ 2020-04-08 22:55 Alexander Duyck 2020-04-08 22:55 ` [PATCH v18 QEMU 1/3] virtio-balloon: Implement support for page poison tracking feature Alexander Duyck ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Alexander Duyck @ 2020-04-08 22:55 UTC (permalink / raw) To: david, mst; +Cc: virtio-dev, qemu-devel This series provides an asynchronous means of reporting free guest pages to QEMU through virtio-balloon so that the memory associated with those pages can be dropped and reused by other processes and/or guests on the host. Using this it is possible to avoid unnecessary I/O to disk and greatly improve performance in the case of memory overcommit on the host. I originally submitted this patch series[1] back on February 11th 2020, but at that time I was focused primarily on the kernel portion of this patch set. However as of April 7th those patches are now included in Linus's kernel tree[2] and so I thought I should resubmit the QEMU side of this for inclusion so that the feature will be available in QEMU hopefully by the time the 5.7 kernel tree is released. [1]: https://lore.kernel.org/lkml/20200211224416.29318.44077.stgit@localhost.localdomain/ [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b0c504f154718904ae49349147e3b7e6ae91ffdc Changes from v17: Fixed typo in patch 1 title Addressed white-space issues reported via checkpatch Added braces {} for two if statements to match expected coding style --- Alexander Duyck (3): virtio-balloon: Implement support for page poison tracking feature virtio-balloon: Add support for providing free page reports to host virtio-balloon: Provide a interface for free page reporting hw/virtio/virtio-balloon.c | 70 ++++++++++++++++++++++- include/hw/virtio/virtio-balloon.h | 3 + include/standard-headers/linux/virtio_balloon.h | 1 3 files changed, 69 insertions(+), 5 deletions(-) -- ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v18 QEMU 1/3] virtio-balloon: Implement support for page poison tracking feature 2020-04-08 22:55 [PATCH v18 QEMU 0/3] virtio-balloon: add support for providing free page reporting Alexander Duyck @ 2020-04-08 22:55 ` Alexander Duyck 2020-04-08 22:55 ` [PATCH v18 QEMU 2/3] virtio-balloon: Add support for providing free page reports to host Alexander Duyck 2020-04-08 22:55 ` [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting Alexander Duyck 2 siblings, 0 replies; 11+ messages in thread From: Alexander Duyck @ 2020-04-08 22:55 UTC (permalink / raw) To: david, mst; +Cc: virtio-dev, qemu-devel From: Alexander Duyck <alexander.h.duyck@linux.intel.com> We need to make certain to advertise support for page poison tracking if we want to actually get data on if the guest will be poisoning pages. So if free page hinting is active we should add page poisoning support and let the guest disable it if it isn't using it. Page poisoning will result in a page being dirtied on free. As such we cannot really avoid having to copy the page at least one more time since we will need to write the poison value to the destination. As such we can just ignore free page hinting if page poisoning is enabled as it will actually reduce the work we have to do. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> --- hw/virtio/virtio-balloon.c | 26 ++++++++++++++++++++++---- include/hw/virtio/virtio-balloon.h | 1 + 2 files changed, 23 insertions(+), 4 deletions(-) diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c index a4729f7fc930..1c6d36a29a04 100644 --- a/hw/virtio/virtio-balloon.c +++ b/hw/virtio/virtio-balloon.c @@ -531,6 +531,15 @@ static void virtio_balloon_free_page_start(VirtIOBalloon *s) return; } + /* + * If page poisoning is enabled then we probably shouldn't bother with + * the hinting since the poisoning will dirty the page and invalidate + * the work we are doing anyway. + */ + if (virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)) { + return; + } + if (s->free_page_report_cmd_id == UINT_MAX) { s->free_page_report_cmd_id = VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN; @@ -618,12 +627,10 @@ static size_t virtio_balloon_config_size(VirtIOBalloon *s) if (s->qemu_4_0_config_size) { return sizeof(struct virtio_balloon_config); } - if (virtio_has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON)) { + if (virtio_has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON) || + virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { return sizeof(struct virtio_balloon_config); } - if (virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { - return offsetof(struct virtio_balloon_config, poison_val); - } return offsetof(struct virtio_balloon_config, free_page_report_cmd_id); } @@ -634,6 +641,7 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) config.num_pages = cpu_to_le32(dev->num_pages); config.actual = cpu_to_le32(dev->actual); + config.poison_val = cpu_to_le32(dev->poison_val); if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) { config.free_page_report_cmd_id = @@ -697,6 +705,9 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, qapi_event_send_balloon_change(vm_ram_size - ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT)); } + dev->poison_val = virtio_vdev_has_feature(vdev, + VIRTIO_BALLOON_F_PAGE_POISON) ? + le32_to_cpu(config.poison_val) : 0; trace_virtio_balloon_set_config(dev->actual, oldactual); } @@ -706,6 +717,9 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f, VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); f |= dev->host_features; virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ); + if (virtio_has_feature(f, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { + virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON); + } return f; } @@ -854,6 +868,8 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev) g_free(s->stats_vq_elem); s->stats_vq_elem = NULL; } + + s->poison_val = 0; } static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status) @@ -916,6 +932,8 @@ static Property virtio_balloon_properties[] = { VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false), DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features, VIRTIO_BALLOON_F_FREE_PAGE_HINT, false), + DEFINE_PROP_BIT("x-page-poison", VirtIOBalloon, host_features, + VIRTIO_BALLOON_F_PAGE_POISON, false), /* QEMU 4.0 accidentally changed the config size even when free-page-hint * is disabled, resulting in QEMU 3.1 migration incompatibility. This * property retains this quirk for QEMU 4.1 machine types. diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h index d1c968d2376e..7fe78e5c14d7 100644 --- a/include/hw/virtio/virtio-balloon.h +++ b/include/hw/virtio/virtio-balloon.h @@ -70,6 +70,7 @@ typedef struct VirtIOBalloon { uint32_t host_features; bool qemu_4_0_config_size; + uint32_t poison_val; } VirtIOBalloon; #endif ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v18 QEMU 2/3] virtio-balloon: Add support for providing free page reports to host 2020-04-08 22:55 [PATCH v18 QEMU 0/3] virtio-balloon: add support for providing free page reporting Alexander Duyck 2020-04-08 22:55 ` [PATCH v18 QEMU 1/3] virtio-balloon: Implement support for page poison tracking feature Alexander Duyck @ 2020-04-08 22:55 ` Alexander Duyck 2020-04-09 7:35 ` David Hildenbrand 2020-04-08 22:55 ` [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting Alexander Duyck 2 siblings, 1 reply; 11+ messages in thread From: Alexander Duyck @ 2020-04-08 22:55 UTC (permalink / raw) To: david, mst; +Cc: virtio-dev, qemu-devel From: Alexander Duyck <alexander.h.duyck@linux.intel.com> Add support for the page reporting feature provided by virtio-balloon. Reporting differs from the regular balloon functionality in that is is much less durable than a standard memory balloon. Instead of creating a list of pages that cannot be accessed the pages are only inaccessible while they are being indicated to the virtio interface. Once the interface has acknowledged them they are placed back into their respective free lists and are once again accessible by the guest system. Unlike a standard balloon we don't inflate and deflate the pages. Instead we perform the reporting, and once the reporting is completed it is assumed that the page has been dropped from the guest and will be faulted back in the next time the page is accessed. This patch is a subset of the UAPI patch that was submitted for the Linux kernel. The original patch can be found at: https://lore.kernel.org/lkml/20200211224657.29318.68624.stgit@localhost.localdomain/ Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> --- include/standard-headers/linux/virtio_balloon.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h index 9375ca2a70de..1c5f6d6f2de6 100644 --- a/include/standard-headers/linux/virtio_balloon.h +++ b/include/standard-headers/linux/virtio_balloon.h @@ -36,6 +36,7 @@ #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ #define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */ #define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */ +#define VIRTIO_BALLOON_F_REPORTING 5 /* Page reporting virtqueue */ /* Size of a PFN in the balloon interface. */ #define VIRTIO_BALLOON_PFN_SHIFT 12 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v18 QEMU 2/3] virtio-balloon: Add support for providing free page reports to host 2020-04-08 22:55 ` [PATCH v18 QEMU 2/3] virtio-balloon: Add support for providing free page reports to host Alexander Duyck @ 2020-04-09 7:35 ` David Hildenbrand 2020-04-09 14:41 ` Alexander Duyck 0 siblings, 1 reply; 11+ messages in thread From: David Hildenbrand @ 2020-04-09 7:35 UTC (permalink / raw) To: Alexander Duyck, mst; +Cc: virtio-dev, qemu-devel On 09.04.20 00:55, Alexander Duyck wrote: > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > Add support for the page reporting feature provided by virtio-balloon. > Reporting differs from the regular balloon functionality in that is is > much less durable than a standard memory balloon. Instead of creating a > list of pages that cannot be accessed the pages are only inaccessible > while they are being indicated to the virtio interface. Once the > interface has acknowledged them they are placed back into their respective > free lists and are once again accessible by the guest system. > > Unlike a standard balloon we don't inflate and deflate the pages. Instead > we perform the reporting, and once the reporting is completed it is > assumed that the page has been dropped from the guest and will be faulted > back in the next time the page is accessed. > > This patch is a subset of the UAPI patch that was submitted for the Linux > kernel. The original patch can be found at: > https://lore.kernel.org/lkml/20200211224657.29318.68624.stgit@localhost.localdomain/ You don't need all these comments. Usually we do "linux-headers: update to contain virito-balloon free page reporting Let's sync the latest upstream changes for free page reporting. To be replaced by a full linux header sync. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> " mst will replace this by a full header sync (if necessary) when sending it upstream -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v18 QEMU 2/3] virtio-balloon: Add support for providing free page reports to host 2020-04-09 7:35 ` David Hildenbrand @ 2020-04-09 14:41 ` Alexander Duyck 2020-04-09 14:43 ` David Hildenbrand 0 siblings, 1 reply; 11+ messages in thread From: Alexander Duyck @ 2020-04-09 14:41 UTC (permalink / raw) To: David Hildenbrand; +Cc: virtio-dev, qemu-devel, Michael S. Tsirkin On Thu, Apr 9, 2020 at 12:36 AM David Hildenbrand <david@redhat.com> wrote: > > On 09.04.20 00:55, Alexander Duyck wrote: > > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > > > Add support for the page reporting feature provided by virtio-balloon. > > Reporting differs from the regular balloon functionality in that is is > > much less durable than a standard memory balloon. Instead of creating a > > list of pages that cannot be accessed the pages are only inaccessible > > while they are being indicated to the virtio interface. Once the > > interface has acknowledged them they are placed back into their respective > > free lists and are once again accessible by the guest system. > > > > Unlike a standard balloon we don't inflate and deflate the pages. Instead > > we perform the reporting, and once the reporting is completed it is > > assumed that the page has been dropped from the guest and will be faulted > > back in the next time the page is accessed. > > > > This patch is a subset of the UAPI patch that was submitted for the Linux > > kernel. The original patch can be found at: > > https://lore.kernel.org/lkml/20200211224657.29318.68624.stgit@localhost.localdomain/ > > You don't need all these comments. Sorry about that. Those are basically the same comments from the original upstream patch. I just wasn't aware of the process so I just copied that over and added the comment/link to the original patch. > Usually we do > > "linux-headers: update to contain virito-balloon free page reporting > > Let's sync the latest upstream changes for free page reporting. To be > replaced by a full linux header sync. > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> > " > > mst will replace this by a full header sync (if necessary) when sending > it upstream I will update the patch description. Thanks. - Alex ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v18 QEMU 2/3] virtio-balloon: Add support for providing free page reports to host 2020-04-09 14:41 ` Alexander Duyck @ 2020-04-09 14:43 ` David Hildenbrand 0 siblings, 0 replies; 11+ messages in thread From: David Hildenbrand @ 2020-04-09 14:43 UTC (permalink / raw) To: Alexander Duyck; +Cc: virtio-dev, qemu-devel, Michael S. Tsirkin On 09.04.20 16:41, Alexander Duyck wrote: > On Thu, Apr 9, 2020 at 12:36 AM David Hildenbrand <david@redhat.com> wrote: >> >> On 09.04.20 00:55, Alexander Duyck wrote: >>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com> >>> >>> Add support for the page reporting feature provided by virtio-balloon. >>> Reporting differs from the regular balloon functionality in that is is >>> much less durable than a standard memory balloon. Instead of creating a >>> list of pages that cannot be accessed the pages are only inaccessible >>> while they are being indicated to the virtio interface. Once the >>> interface has acknowledged them they are placed back into their respective >>> free lists and are once again accessible by the guest system. >>> >>> Unlike a standard balloon we don't inflate and deflate the pages. Instead >>> we perform the reporting, and once the reporting is completed it is >>> assumed that the page has been dropped from the guest and will be faulted >>> back in the next time the page is accessed. >>> >>> This patch is a subset of the UAPI patch that was submitted for the Linux >>> kernel. The original patch can be found at: >>> https://lore.kernel.org/lkml/20200211224657.29318.68624.stgit@localhost.localdomain/ >> >> You don't need all these comments. > > Sorry about that. Those are basically the same comments from the > original upstream patch. I just wasn't aware of the process so I just > copied that over and added the comment/link to the original patch. No worries, it just felt like "oh, he spent a lot of time writing this, but after all it wasn't necessary" :) -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting 2020-04-08 22:55 [PATCH v18 QEMU 0/3] virtio-balloon: add support for providing free page reporting Alexander Duyck 2020-04-08 22:55 ` [PATCH v18 QEMU 1/3] virtio-balloon: Implement support for page poison tracking feature Alexander Duyck 2020-04-08 22:55 ` [PATCH v18 QEMU 2/3] virtio-balloon: Add support for providing free page reports to host Alexander Duyck @ 2020-04-08 22:55 ` Alexander Duyck 2020-04-09 7:44 ` David Hildenbrand 2 siblings, 1 reply; 11+ messages in thread From: Alexander Duyck @ 2020-04-08 22:55 UTC (permalink / raw) To: david, mst; +Cc: virtio-dev, qemu-devel From: Alexander Duyck <alexander.h.duyck@linux.intel.com> Add support for what I am referring to as "free page reporting". Basically the idea is to function very similar to how the balloon works in that we basically end up madvising the page as not being used. However we don't really need to bother with any deflate type logic since the page will be faulted back into the guest when it is read or written to. This is meant to be a simplification of the existing balloon interface to use for providing hints to what memory needs to be freed. I am assuming this is safe to do as the deflate logic does not actually appear to do very much other than tracking what subpages have been released and which ones haven't. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> --- hw/virtio/virtio-balloon.c | 48 +++++++++++++++++++++++++++++++++++- include/hw/virtio/virtio-balloon.h | 2 +- 2 files changed, 47 insertions(+), 3 deletions(-) diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c index 1c6d36a29a04..297b267198ac 100644 --- a/hw/virtio/virtio-balloon.c +++ b/hw/virtio/virtio-balloon.c @@ -321,6 +321,42 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v, balloon_stats_change_timer(s, 0); } +static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq) +{ + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); + VirtQueueElement *elem; + + while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) { + unsigned int i; + + for (i = 0; i < elem->in_num; i++) { + void *addr = elem->in_sg[i].iov_base; + size_t size = elem->in_sg[i].iov_len; + ram_addr_t ram_offset; + size_t rb_page_size; + RAMBlock *rb; + + if (qemu_balloon_is_inhibited() || dev->poison_val) { + continue; + } + + rb = qemu_ram_block_from_host(addr, false, &ram_offset); + rb_page_size = qemu_ram_pagesize(rb); + + /* For now we will simply ignore unaligned memory regions */ + if ((ram_offset | size) & (rb_page_size - 1)) { + continue; + } + + ram_block_discard_range(rb, ram_offset, size); + } + + virtqueue_push(vq, elem, 0); + virtio_notify(vdev, vq); + g_free(elem); + } +} + static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq) { VirtIOBalloon *s = VIRTIO_BALLOON(vdev); @@ -628,7 +664,8 @@ static size_t virtio_balloon_config_size(VirtIOBalloon *s) return sizeof(struct virtio_balloon_config); } if (virtio_has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON) || - virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { + virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT) || + virtio_has_feature(features, VIRTIO_BALLOON_F_REPORTING)) { return sizeof(struct virtio_balloon_config); } return offsetof(struct virtio_balloon_config, free_page_report_cmd_id); @@ -717,7 +754,8 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f, VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); f |= dev->host_features; virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ); - if (virtio_has_feature(f, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { + if (virtio_has_feature(f, VIRTIO_BALLOON_F_FREE_PAGE_HINT) || + virtio_has_feature(f, VIRTIO_BALLOON_F_REPORTING)) { virtio_add_feature(&f, VIRTIO_BALLOON_F_PAGE_POISON); } @@ -807,6 +845,10 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp) s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output); s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats); + if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_REPORTING)) { + s->rvq = virtio_add_queue(vdev, 32, virtio_balloon_handle_report); + } + if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, @@ -940,6 +982,8 @@ static Property virtio_balloon_properties[] = { */ DEFINE_PROP_BOOL("qemu-4-0-config-size", VirtIOBalloon, qemu_4_0_config_size, false), + DEFINE_PROP_BIT("unused-page-reporting", VirtIOBalloon, host_features, + VIRTIO_BALLOON_F_REPORTING, true), DEFINE_PROP_LINK("iothread", VirtIOBalloon, iothread, TYPE_IOTHREAD, IOThread *), DEFINE_PROP_END_OF_LIST(), diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h index 7fe78e5c14d7..db5bf7127112 100644 --- a/include/hw/virtio/virtio-balloon.h +++ b/include/hw/virtio/virtio-balloon.h @@ -42,7 +42,7 @@ enum virtio_balloon_free_page_report_status { typedef struct VirtIOBalloon { VirtIODevice parent_obj; - VirtQueue *ivq, *dvq, *svq, *free_page_vq; + VirtQueue *ivq, *dvq, *svq, *free_page_vq, *rvq; uint32_t free_page_report_status; uint32_t num_pages; uint32_t actual; ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting 2020-04-08 22:55 ` [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting Alexander Duyck @ 2020-04-09 7:44 ` David Hildenbrand 2020-04-09 17:34 ` Alexander Duyck 0 siblings, 1 reply; 11+ messages in thread From: David Hildenbrand @ 2020-04-09 7:44 UTC (permalink / raw) To: Alexander Duyck, mst; +Cc: virtio-dev, qemu-devel On 09.04.20 00:55, Alexander Duyck wrote: > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > Add support for what I am referring to as "free page reporting". "Add support for "free page reporting". > Basically the idea is to function very similar to how the balloon works > in that we basically end up madvising the page as not being used. However I'd get rid of one "basically". > we don't really need to bother with any deflate type logic since the page > will be faulted back into the guest when it is read or written to. > > This is meant to be a simplification of the existing balloon interface > to use for providing hints to what memory needs to be freed. I am assuming It's not really a simplification, it's something new. It's a new way of letting the guest automatically report free pages to the hypervisor, so the hypervisor can reuse them. In contrast to inflate/deflate, that's triggered via the hypervisor explicitly. > this is safe to do as the deflate logic does not actually appear to do very > much other than tracking what subpages have been released and which ones > haven't. "I assume this is safe" does not sound very confident. Can we just drop the last sentence? > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> > --- > hw/virtio/virtio-balloon.c | 48 +++++++++++++++++++++++++++++++++++- > include/hw/virtio/virtio-balloon.h | 2 +- > 2 files changed, 47 insertions(+), 3 deletions(-) > > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c > index 1c6d36a29a04..297b267198ac 100644 > --- a/hw/virtio/virtio-balloon.c > +++ b/hw/virtio/virtio-balloon.c > @@ -321,6 +321,42 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v, > balloon_stats_change_timer(s, 0); > } > > +static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq) > +{ > + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > + VirtQueueElement *elem; > + > + while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) { > + unsigned int i; > + > + for (i = 0; i < elem->in_num; i++) { > + void *addr = elem->in_sg[i].iov_base; > + size_t size = elem->in_sg[i].iov_len; > + ram_addr_t ram_offset; > + size_t rb_page_size; > + RAMBlock *rb; > + > + if (qemu_balloon_is_inhibited() || dev->poison_val) { > + continue; > + } These checks are not sufficient. See virtio_balloon_handle_output(), where we e.g., check that somebody doesn't try to discard the bios. Maybe we can factor our/unify the handling in both code paths. > + > + rb = qemu_ram_block_from_host(addr, false, &ram_offset); > + rb_page_size = qemu_ram_pagesize(rb); > + > + /* For now we will simply ignore unaligned memory regions */ > + if ((ram_offset | size) & (rb_page_size - 1)) { "!QEMU_IS_ALIGNED()" please to make this easier to read. > + continue; > + } > + > + ram_block_discard_range(rb, ram_offset, size); > + } > + > + virtqueue_push(vq, elem, 0); > + virtio_notify(vdev, vq); > + g_free(elem); > + } > +} > + [...] > if (virtio_has_feature(s->host_features, > VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { > s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, > @@ -940,6 +982,8 @@ static Property virtio_balloon_properties[] = { > */ > DEFINE_PROP_BOOL("qemu-4-0-config-size", VirtIOBalloon, > qemu_4_0_config_size, false), > + DEFINE_PROP_BIT("unused-page-reporting", VirtIOBalloon, host_features, "free-page-reporting" -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting 2020-04-09 7:44 ` David Hildenbrand @ 2020-04-09 17:34 ` Alexander Duyck 2020-04-09 17:35 ` David Hildenbrand 0 siblings, 1 reply; 11+ messages in thread From: Alexander Duyck @ 2020-04-09 17:34 UTC (permalink / raw) To: David Hildenbrand; +Cc: virtio-dev, qemu-devel, Michael S. Tsirkin On Thu, Apr 9, 2020 at 12:44 AM David Hildenbrand <david@redhat.com> wrote: > > On 09.04.20 00:55, Alexander Duyck wrote: > > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > > > Add support for what I am referring to as "free page reporting". > > "Add support for "free page reporting". > > > Basically the idea is to function very similar to how the balloon works > > in that we basically end up madvising the page as not being used. However > > I'd get rid of one "basically". > > > we don't really need to bother with any deflate type logic since the page > > will be faulted back into the guest when it is read or written to. > > > > This is meant to be a simplification of the existing balloon interface > > to use for providing hints to what memory needs to be freed. I am assuming > > It's not really a simplification, it's something new. It's a new way of > letting the guest automatically report free pages to the hypervisor, so > the hypervisor can reuse them. In contrast to inflate/deflate, that's > triggered via the hypervisor explicitly. > > > this is safe to do as the deflate logic does not actually appear to do very > > much other than tracking what subpages have been released and which ones > > haven't. > > "I assume this is safe" does not sound very confident. Can we just drop > the last sentence? Agreed. I will make the requested changes to the patch description. > > > > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > --- > > hw/virtio/virtio-balloon.c | 48 +++++++++++++++++++++++++++++++++++- > > include/hw/virtio/virtio-balloon.h | 2 +- > > 2 files changed, 47 insertions(+), 3 deletions(-) > > > > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c > > index 1c6d36a29a04..297b267198ac 100644 > > --- a/hw/virtio/virtio-balloon.c > > +++ b/hw/virtio/virtio-balloon.c > > @@ -321,6 +321,42 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v, > > balloon_stats_change_timer(s, 0); > > } > > > > +static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq) > > +{ > > + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > > + VirtQueueElement *elem; > > + > > + while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) { > > + unsigned int i; > > + > > + for (i = 0; i < elem->in_num; i++) { > > + void *addr = elem->in_sg[i].iov_base; > > + size_t size = elem->in_sg[i].iov_len; > > + ram_addr_t ram_offset; > > + size_t rb_page_size; > > + RAMBlock *rb; > > + > > + if (qemu_balloon_is_inhibited() || dev->poison_val) { > > + continue; > > + } > > These checks are not sufficient. See virtio_balloon_handle_output(), > where we e.g., check that somebody doesn't try to discard the bios. > > Maybe we can factor our/unify the handling in both code paths. I am going to have to look at this further. If I understand correctly you are asking me to add all of the memory section checks that are in virtio_balloon_handle_output correct? I'm not sure it makes sense with the scatterlist approach I have taken here. All of the mappings were created as a scatterlist of writable DMA mappings unlike the regular balloon which is just stuffing PFN numbers into an array and then sending the array across. As such I would think there are already such protections in place. For instance, you wouldn't want to let virtio-net map the BIOS and request data be written into that either, correct? So I am assuming there is something there to prevent that already isn't there? > > + > > + rb = qemu_ram_block_from_host(addr, false, &ram_offset); > > + rb_page_size = qemu_ram_pagesize(rb); > > + > > + /* For now we will simply ignore unaligned memory regions */ > > + if ((ram_offset | size) & (rb_page_size - 1)) { > > "!QEMU_IS_ALIGNED()" please to make this easier to read. done. > > + continue; > > + } > > + > > + ram_block_discard_range(rb, ram_offset, size); > > + } > > + > > + virtqueue_push(vq, elem, 0); > > + virtio_notify(vdev, vq); > > + g_free(elem); > > + } > > +} > > + > > [...] > > > if (virtio_has_feature(s->host_features, > > VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { > > s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, > > @@ -940,6 +982,8 @@ static Property virtio_balloon_properties[] = { > > */ > > DEFINE_PROP_BOOL("qemu-4-0-config-size", VirtIOBalloon, > > qemu_4_0_config_size, false), > > + DEFINE_PROP_BIT("unused-page-reporting", VirtIOBalloon, host_features, > > "free-page-reporting" Thanks for catching that. It has been a while since that was renamed and I must have let it slip through the cracks. - Alex ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting 2020-04-09 17:34 ` Alexander Duyck @ 2020-04-09 17:35 ` David Hildenbrand 2020-04-10 3:36 ` Alexander Duyck 0 siblings, 1 reply; 11+ messages in thread From: David Hildenbrand @ 2020-04-09 17:35 UTC (permalink / raw) To: Alexander Duyck; +Cc: virtio-dev, qemu-devel, Michael S. Tsirkin On 09.04.20 19:34, Alexander Duyck wrote: > On Thu, Apr 9, 2020 at 12:44 AM David Hildenbrand <david@redhat.com> wrote: >> >> On 09.04.20 00:55, Alexander Duyck wrote: >>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com> >>> >>> Add support for what I am referring to as "free page reporting". >> >> "Add support for "free page reporting". >> >>> Basically the idea is to function very similar to how the balloon works >>> in that we basically end up madvising the page as not being used. However >> >> I'd get rid of one "basically". >> >>> we don't really need to bother with any deflate type logic since the page >>> will be faulted back into the guest when it is read or written to. >>> >>> This is meant to be a simplification of the existing balloon interface >>> to use for providing hints to what memory needs to be freed. I am assuming >> >> It's not really a simplification, it's something new. It's a new way of >> letting the guest automatically report free pages to the hypervisor, so >> the hypervisor can reuse them. In contrast to inflate/deflate, that's >> triggered via the hypervisor explicitly. >> >>> this is safe to do as the deflate logic does not actually appear to do very >>> much other than tracking what subpages have been released and which ones >>> haven't. >> >> "I assume this is safe" does not sound very confident. Can we just drop >> the last sentence? > > Agreed. I will make the requested changes to the patch description. > >> >>> >>> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> >>> --- >>> hw/virtio/virtio-balloon.c | 48 +++++++++++++++++++++++++++++++++++- >>> include/hw/virtio/virtio-balloon.h | 2 +- >>> 2 files changed, 47 insertions(+), 3 deletions(-) >>> >>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c >>> index 1c6d36a29a04..297b267198ac 100644 >>> --- a/hw/virtio/virtio-balloon.c >>> +++ b/hw/virtio/virtio-balloon.c >>> @@ -321,6 +321,42 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v, >>> balloon_stats_change_timer(s, 0); >>> } >>> >>> +static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq) >>> +{ >>> + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); >>> + VirtQueueElement *elem; >>> + >>> + while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) { >>> + unsigned int i; >>> + >>> + for (i = 0; i < elem->in_num; i++) { >>> + void *addr = elem->in_sg[i].iov_base; >>> + size_t size = elem->in_sg[i].iov_len; >>> + ram_addr_t ram_offset; >>> + size_t rb_page_size; >>> + RAMBlock *rb; >>> + >>> + if (qemu_balloon_is_inhibited() || dev->poison_val) { >>> + continue; >>> + } >> >> These checks are not sufficient. See virtio_balloon_handle_output(), >> where we e.g., check that somebody doesn't try to discard the bios. >> >> Maybe we can factor our/unify the handling in both code paths. > > I am going to have to look at this further. If I understand correctly > you are asking me to add all of the memory section checks that are in > virtio_balloon_handle_output correct? > > I'm not sure it makes sense with the scatterlist approach I have taken > here. All of the mappings were created as a scatterlist of writable > DMA mappings unlike the regular balloon which is just stuffing PFN > numbers into an array and then sending the array across. As such I > would think there are already such protections in place. For instance, > you wouldn't want to let virtio-net map the BIOS and request data be > written into that either, correct? So I am assuming there is something > there to prevent that already isn't there? Good point, let's find out :) -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting 2020-04-09 17:35 ` David Hildenbrand @ 2020-04-10 3:36 ` Alexander Duyck 0 siblings, 0 replies; 11+ messages in thread From: Alexander Duyck @ 2020-04-10 3:36 UTC (permalink / raw) To: David Hildenbrand, Paolo Bonzini Cc: virtio-dev, qemu-devel, Michael S. Tsirkin On Thu, Apr 9, 2020 at 10:35 AM David Hildenbrand <david@redhat.com> wrote: > On 09.04.20 19:34, Alexander Duyck wrote: > >>> hw/virtio/virtio-balloon.c | 48 +++++++++++++++++++++++++++++++++++- > >>> include/hw/virtio/virtio-balloon.h | 2 +- > >>> 2 files changed, 47 insertions(+), 3 deletions(-) > >>> > >>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c > >>> index 1c6d36a29a04..297b267198ac 100644 > >>> --- a/hw/virtio/virtio-balloon.c > >>> +++ b/hw/virtio/virtio-balloon.c > >>> @@ -321,6 +321,42 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v, > >>> balloon_stats_change_timer(s, 0); > >>> } > >>> > >>> +static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq) > >>> +{ > >>> + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > >>> + VirtQueueElement *elem; > >>> + > >>> + while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) { > >>> + unsigned int i; > >>> + > >>> + for (i = 0; i < elem->in_num; i++) { > >>> + void *addr = elem->in_sg[i].iov_base; > >>> + size_t size = elem->in_sg[i].iov_len; > >>> + ram_addr_t ram_offset; > >>> + size_t rb_page_size; > >>> + RAMBlock *rb; > >>> + > >>> + if (qemu_balloon_is_inhibited() || dev->poison_val) { > >>> + continue; > >>> + } > >> > >> These checks are not sufficient. See virtio_balloon_handle_output(), > >> where we e.g., check that somebody doesn't try to discard the bios. > >> > >> Maybe we can factor our/unify the handling in both code paths. > > > > I am going to have to look at this further. If I understand correctly > > you are asking me to add all of the memory section checks that are in > > virtio_balloon_handle_output correct? > > > > I'm not sure it makes sense with the scatterlist approach I have taken > > here. All of the mappings were created as a scatterlist of writable > > DMA mappings unlike the regular balloon which is just stuffing PFN > > numbers into an array and then sending the array across. As such I > > would think there are already such protections in place. For instance, > > you wouldn't want to let virtio-net map the BIOS and request data be > > written into that either, correct? So I am assuming there is something > > there to prevent that already isn't there? > > Good point, let's find out :) Okay, so I believe I figured it out. From what I can tell there is a call in address_space_map that will determine if we can directly write to the page or not. However it looks like there might be one minor issue as it is assuming it can write directly to ROM devices and that isn't correct. I will add a patch to my set to address that. Other than that the behavior seems to be that a bounce buffer will be allocated on the first instance of a page backed by ROM or anything other than RAM, and after that it will return NULL until the bounce buffer is freed. If we start getting NULLs the mapping will be aborted and we wouldn't even get into this code path. After we unmap the region it will attempt to use address_space_write which should fail for any region that isn't meant to be written to, or it will copy zeros out of the bounce buffer into the region if it is writable via address_space_write. So the DMA mapping API in virtqueue_pop/virtqueue_push will take care of doing the right stuff for the mappings in the case that the guest is trying to do something really stupid, at least after I address the direct write access to rom_device regions. - Alex ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2020-04-10 3:37 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-04-08 22:55 [PATCH v18 QEMU 0/3] virtio-balloon: add support for providing free page reporting Alexander Duyck 2020-04-08 22:55 ` [PATCH v18 QEMU 1/3] virtio-balloon: Implement support for page poison tracking feature Alexander Duyck 2020-04-08 22:55 ` [PATCH v18 QEMU 2/3] virtio-balloon: Add support for providing free page reports to host Alexander Duyck 2020-04-09 7:35 ` David Hildenbrand 2020-04-09 14:41 ` Alexander Duyck 2020-04-09 14:43 ` David Hildenbrand 2020-04-08 22:55 ` [PATCH v18 QEMU 3/3] virtio-balloon: Provide a interface for free page reporting Alexander Duyck 2020-04-09 7:44 ` David Hildenbrand 2020-04-09 17:34 ` Alexander Duyck 2020-04-09 17:35 ` David Hildenbrand 2020-04-10 3:36 ` Alexander Duyck
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).