From: Tony Krowiak <akrowiak@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>, qemu-devel@nongnu.org
Cc: Eric Farman <farman@linux.ibm.com>,
Alex Williamson <alex.williamson@redhat.com>,
Eduardo Habkost <ehabkost@redhat.com>,
kvm@vger.kernel.org, "Michael S . Tsirkin" <mst@redhat.com>,
Pierre Morel <pmorel@linux.ibm.com>,
Cornelia Huck <cohuck@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
Halil Pasic <pasic@linux.ibm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
qemu-s390x@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <rth@twiddle.net>
Subject: Re: [PATCH v4 02/21] vfio: Convert to ram_block_discard_disable()
Date: Wed, 10 Jun 2020 09:04:51 -0400 [thread overview]
Message-ID: <8c71bfda-e958-56f8-ddaf-6a831fff2bc6@linux.ibm.com> (raw)
In-Reply-To: <20200610115419.51688-3-david@redhat.com>
On 6/10/20 7:54 AM, David Hildenbrand wrote:
> VFIO is (except devices without a physical IOMMU or some mediated devices)
> incompatible with discarding of RAM. The kernel will pin basically all VM
> memory. Let's convert to ram_block_discard_disable(), which can now
> fail, in contrast to qemu_balloon_inhibit().
>
> Leave "x-balloon-allowed" named as it is for now.
>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Tony Krowiak <akrowiak@linux.ibm.com>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Pierre Morel <pmorel@linux.ibm.com>
> Cc: Eric Farman <farman@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
See my two minor comments, other than that:
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
> hw/vfio/ap.c | 10 +++----
> hw/vfio/ccw.c | 11 ++++----
> hw/vfio/common.c | 53 +++++++++++++++++++----------------
> hw/vfio/pci.c | 6 ++--
> include/hw/vfio/vfio-common.h | 4 +--
> 5 files changed, 45 insertions(+), 39 deletions(-)
>
> diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
> index 95564c17ed..d0b1bc7581 100644
> --- a/hw/vfio/ap.c
> +++ b/hw/vfio/ap.c
> @@ -105,12 +105,12 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
> vapdev->vdev.dev = dev;
>
> /*
> - * vfio-ap devices operate in a way compatible with
> - * memory ballooning, as no pages are pinned in the host.
> - * This needs to be set before vfio_get_device() for vfio common to
> - * handle the balloon inhibitor.
> + * vfio-ap devices operate in a way compatible discarding of memory in
s/compatible discarding/compatible with discarding/?
> + * RAM blocks, as no pages are pinned in the host. This needs to be
> + * set before vfio_get_device() for vfio common to handle
> + * ram_block_discard_disable().
> */
> - vapdev->vdev.balloon_allowed = true;
> + vapdev->vdev.ram_block_discard_allowed = true;
>
> ret = vfio_get_device(vfio_group, mdevid, &vapdev->vdev, errp);
> if (ret) {
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> index 63406184d2..82857f1615 100644
> --- a/hw/vfio/ccw.c
> +++ b/hw/vfio/ccw.c
> @@ -418,12 +418,13 @@ static void vfio_ccw_get_device(VFIOGroup *group, VFIOCCWDevice *vcdev,
>
> /*
> * All vfio-ccw devices are believed to operate in a way compatible with
> - * memory ballooning, ie. pages pinned in the host are in the current
> - * working set of the guest driver and therefore never overlap with pages
> - * available to the guest balloon driver. This needs to be set before
> - * vfio_get_device() for vfio common to handle the balloon inhibitor.
> + * discarding of memory in RAM blocks, ie. pages pinned in the host are
> + * in the current working set of the guest driver and therefore never
> + * overlap e.g., with pages available to the guest balloon driver. This
> + * needs to be set before vfio_get_device() for vfio common to handle
> + * ram_block_discard_disable().
> */
> - vcdev->vdev.balloon_allowed = true;
> + vcdev->vdev.ram_block_discard_allowed = true;
>
> if (vfio_get_device(group, vcdev->cdev.mdevid, &vcdev->vdev, errp)) {
> goto out_err;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 0b3593b3c0..33357140b8 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -33,7 +33,6 @@
> #include "qemu/error-report.h"
> #include "qemu/main-loop.h"
> #include "qemu/range.h"
> -#include "sysemu/balloon.h"
> #include "sysemu/kvm.h"
> #include "sysemu/reset.h"
> #include "trace.h"
> @@ -1215,31 +1214,36 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
> space = vfio_get_address_space(as);
>
> /*
> - * VFIO is currently incompatible with memory ballooning insofar as the
> + * VFIO is currently incompatible with discarding of RAM insofar as the
> * madvise to purge (zap) the page from QEMU's address space does not
> * interact with the memory API and therefore leaves stale virtual to
> * physical mappings in the IOMMU if the page was previously pinned. We
> - * therefore add a balloon inhibit for each group added to a container,
> + * therefore set discarding broken for each group added to a container,
> * whether the container is used individually or shared. This provides
> * us with options to allow devices within a group to opt-in and allow
> - * ballooning, so long as it is done consistently for a group (for instance
> + * discarding, so long as it is done consistently for a group (for instance
> * if the device is an mdev device where it is known that the host vendor
> * driver will never pin pages outside of the working set of the guest
> - * driver, which would thus not be ballooning candidates).
> + * driver, which would thus not be discarding candidates).
> *
> * The first opportunity to induce pinning occurs here where we attempt to
> * attach the group to existing containers within the AddressSpace. If any
> - * pages are already zapped from the virtual address space, such as from a
> - * previous ballooning opt-in, new pinning will cause valid mappings to be
> + * pages are already zapped from the virtual address space, such as from
> + * previous discards, new pinning will cause valid mappings to be
> * re-established. Likewise, when the overall MemoryListener for a new
> * container is registered, a replay of mappings within the AddressSpace
> * will occur, re-establishing any previously zapped pages as well.
> *
> - * NB. Balloon inhibiting does not currently block operation of the
> - * balloon driver or revoke previously pinned pages, it only prevents
> - * calling madvise to modify the virtual mapping of ballooned pages.
> + * Especially virtio-balloon is currently only prevented from discarding
> + * new memory, it will not yet set ram_block_discard_set_required() and
> + * therefore, neither stops us here or deals with the sudden memory
> + * consumption of inflated memory.
> */
> - qemu_balloon_inhibit(true);
> + ret = ram_block_discard_disable(true);
> + if (ret) {
> + error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
> + return ret;
> + }
>
> QLIST_FOREACH(container, &space->containers, next) {
> if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
> @@ -1405,7 +1409,7 @@ close_fd_exit:
> close(fd);
>
> put_space_exit:
> - qemu_balloon_inhibit(false);
> + ram_block_discard_disable(false);
> vfio_put_address_space(space);
>
> return ret;
> @@ -1526,8 +1530,8 @@ void vfio_put_group(VFIOGroup *group)
> return;
> }
>
> - if (!group->balloon_allowed) {
> - qemu_balloon_inhibit(false);
> + if (!group->ram_block_discard_allowed) {
> + ram_block_discard_disable(false);
> }
> vfio_kvm_device_del_group(group);
> vfio_disconnect_container(group);
> @@ -1565,22 +1569,23 @@ int vfio_get_device(VFIOGroup *group, const char *name,
> }
>
> /*
> - * Clear the balloon inhibitor for this group if the driver knows the
> - * device operates compatibly with ballooning. Setting must be consistent
> - * per group, but since compatibility is really only possible with mdev
> - * currently, we expect singleton groups.
> + * Set discarding of RAM as not broken for this group if the driver knows
> + * the device operates compatibly with discarding. Setting must be
> + * consistent per group, but since compatibility is really only possible
> + * with mdev currently, we expect singleton groups.
> */
> - if (vbasedev->balloon_allowed != group->balloon_allowed) {
> + if (vbasedev->ram_block_discard_allowed !=
> + group->ram_block_discard_allowed) {
> if (!QLIST_EMPTY(&group->device_list)) {
> - error_setg(errp,
> - "Inconsistent device balloon setting within group");
> + error_setg(errp, "Inconsistent setting of support for discarding "
> + "RAM (e.g., balloon) within group");
> close(fd);
> return -1;
> }
>
> - if (!group->balloon_allowed) {
> - group->balloon_allowed = true;
> - qemu_balloon_inhibit(false);
> + if (!group->ram_block_discard_allowed) {
> + group->ram_block_discard_allowed = true;
> + ram_block_discard_disable(false);
> }
> }
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 342dd6b912..c33c11b7e4 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2796,7 +2796,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
> }
>
> /*
> - * Mediated devices *might* operate compatibly with memory ballooning, but
> + * Mediated devices *might* operate compatibly with discarding of RAM, but
> * we cannot know for certain, it depends on whether the mdev vendor driver
> * stays in sync with the active working set of the guest driver. Prevent
> * the x-balloon-allowed option unless this is minimally an mdev device.
> @@ -2809,7 +2809,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>
> trace_vfio_mdev(vdev->vbasedev.name, is_mdev);
>
> - if (vdev->vbasedev.balloon_allowed && !is_mdev) {
> + if (vdev->vbasedev.ram_block_discard_allowed && !is_mdev) {
> error_setg(errp, "x-balloon-allowed only potentially compatible "
> "with mdev devices");
Should this error message be changed?
> vfio_put_group(group);
> @@ -3163,7 +3163,7 @@ static Property vfio_pci_dev_properties[] = {
> VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
> DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
> DEFINE_PROP_BOOL("x-balloon-allowed", VFIOPCIDevice,
> - vbasedev.balloon_allowed, false),
> + vbasedev.ram_block_discard_allowed, false),
> DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
> DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
> DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index fd564209ac..c78f3ff559 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -108,7 +108,7 @@ typedef struct VFIODevice {
> bool reset_works;
> bool needs_reset;
> bool no_mmap;
> - bool balloon_allowed;
> + bool ram_block_discard_allowed;
> VFIODeviceOps *ops;
> unsigned int num_irqs;
> unsigned int num_regions;
> @@ -128,7 +128,7 @@ typedef struct VFIOGroup {
> QLIST_HEAD(, VFIODevice) device_list;
> QLIST_ENTRY(VFIOGroup) next;
> QLIST_ENTRY(VFIOGroup) container_next;
> - bool balloon_allowed;
> + bool ram_block_discard_allowed;
> } VFIOGroup;
>
> typedef struct VFIODMABuf {
next prev parent reply other threads:[~2020-06-10 13:06 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-10 11:53 [PATCH v4 00/21] virtio-mem: Paravirtualized memory hot(un)plug David Hildenbrand
2020-06-10 11:53 ` [PATCH v4 01/21] exec: Introduce ram_block_discard_(disable|require)() David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 02/21] vfio: Convert to ram_block_discard_disable() David Hildenbrand
2020-06-10 13:04 ` Tony Krowiak [this message]
2020-06-10 14:13 ` David Hildenbrand
2020-06-16 11:15 ` Cornelia Huck
2020-06-10 11:54 ` [PATCH v4 03/21] accel/kvm: " David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 04/21] s390x/pv: " David Hildenbrand
2020-06-16 11:17 ` Cornelia Huck
2020-06-10 11:54 ` [PATCH v4 05/21] virtio-balloon: Rip out qemu_balloon_inhibit() David Hildenbrand
2020-06-16 10:56 ` Dr. David Alan Gilbert
2020-06-24 15:32 ` Michael S. Tsirkin
2020-06-10 11:54 ` [PATCH v4 06/21] target/i386: sev: Use ram_block_discard_disable() David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 07/21] migration/rdma: " David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 08/21] migration/colo: " David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 09/21] linux-headers: update to contain virtio-mem David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 10/21] virtio-mem: Paravirtualized memory hot(un)plug David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 11/21] virtio-pci: Proxy for virtio-mem David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 12/21] MAINTAINERS: Add myself as virtio-mem maintainer David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 13/21] hmp: Handle virtio-mem when printing memory device info David Hildenbrand
2020-06-17 17:53 ` Dr. David Alan Gilbert
2020-06-17 17:54 ` Dr. David Alan Gilbert
2020-06-10 11:54 ` [PATCH v4 14/21] numa: Handle virtio-mem in NUMA stats David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 15/21] pc: Support for virtio-mem-pci David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 16/21] virtio-mem: Allow notifiers for size changes David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 17/21] virtio-pci: Send qapi events when the virtio-mem " David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 18/21] virtio-mem: Migration sanity checks David Hildenbrand
2020-06-17 17:59 ` Dr. David Alan Gilbert
2020-06-18 10:39 ` David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 19/21] virtio-mem: Add trace events David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 20/21] virtio-mem: Exclude unplugged memory during migration David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 21/21] numa: Auto-enable NUMA when any memory devices are possible David Hildenbrand
2020-06-24 15:33 ` [PATCH v4 00/21] virtio-mem: Paravirtualized memory hot(un)plug Michael S. Tsirkin
2020-06-24 15:40 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8c71bfda-e958-56f8-ddaf-6a831fff2bc6@linux.ibm.com \
--to=akrowiak@linux.ibm.com \
--cc=alex.williamson@redhat.com \
--cc=borntraeger@de.ibm.com \
--cc=cohuck@redhat.com \
--cc=david@redhat.com \
--cc=dgilbert@redhat.com \
--cc=ehabkost@redhat.com \
--cc=farman@linux.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=mst@redhat.com \
--cc=pasic@linux.ibm.com \
--cc=pbonzini@redhat.com \
--cc=pmorel@linux.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-s390x@nongnu.org \
--cc=rth@twiddle.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).