qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Tony Krowiak <akrowiak@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>, qemu-devel@nongnu.org
Cc: Eric Farman <farman@linux.ibm.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	kvm@vger.kernel.org, "Michael S . Tsirkin" <mst@redhat.com>,
	Pierre Morel <pmorel@linux.ibm.com>,
	Cornelia Huck <cohuck@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	qemu-s390x@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>
Subject: Re: [PATCH v4 02/21] vfio: Convert to ram_block_discard_disable()
Date: Wed, 10 Jun 2020 09:04:51 -0400	[thread overview]
Message-ID: <8c71bfda-e958-56f8-ddaf-6a831fff2bc6@linux.ibm.com> (raw)
In-Reply-To: <20200610115419.51688-3-david@redhat.com>



On 6/10/20 7:54 AM, David Hildenbrand wrote:
> VFIO is (except devices without a physical IOMMU or some mediated devices)
> incompatible with discarding of RAM. The kernel will pin basically all VM
> memory. Let's convert to ram_block_discard_disable(), which can now
> fail, in contrast to qemu_balloon_inhibit().
>
> Leave "x-balloon-allowed" named as it is for now.
>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Tony Krowiak <akrowiak@linux.ibm.com>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Pierre Morel <pmorel@linux.ibm.com>
> Cc: Eric Farman <farman@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

See my two minor comments, other than that:
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>

> ---
>   hw/vfio/ap.c                  | 10 +++----
>   hw/vfio/ccw.c                 | 11 ++++----
>   hw/vfio/common.c              | 53 +++++++++++++++++++----------------
>   hw/vfio/pci.c                 |  6 ++--
>   include/hw/vfio/vfio-common.h |  4 +--
>   5 files changed, 45 insertions(+), 39 deletions(-)
>
> diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
> index 95564c17ed..d0b1bc7581 100644
> --- a/hw/vfio/ap.c
> +++ b/hw/vfio/ap.c
> @@ -105,12 +105,12 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
>       vapdev->vdev.dev = dev;
>   
>       /*
> -     * vfio-ap devices operate in a way compatible with
> -     * memory ballooning, as no pages are pinned in the host.
> -     * This needs to be set before vfio_get_device() for vfio common to
> -     * handle the balloon inhibitor.
> +     * vfio-ap devices operate in a way compatible discarding of memory in

s/compatible discarding/compatible with discarding/?

> +     * RAM blocks, as no pages are pinned in the host. This needs to be
> +     * set before vfio_get_device() for vfio common to handle
> +     * ram_block_discard_disable().
>        */
> -    vapdev->vdev.balloon_allowed = true;
> +    vapdev->vdev.ram_block_discard_allowed = true;
>   
>       ret = vfio_get_device(vfio_group, mdevid, &vapdev->vdev, errp);
>       if (ret) {
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> index 63406184d2..82857f1615 100644
> --- a/hw/vfio/ccw.c
> +++ b/hw/vfio/ccw.c
> @@ -418,12 +418,13 @@ static void vfio_ccw_get_device(VFIOGroup *group, VFIOCCWDevice *vcdev,
>   
>       /*
>        * All vfio-ccw devices are believed to operate in a way compatible with
> -     * memory ballooning, ie. pages pinned in the host are in the current
> -     * working set of the guest driver and therefore never overlap with pages
> -     * available to the guest balloon driver.  This needs to be set before
> -     * vfio_get_device() for vfio common to handle the balloon inhibitor.
> +     * discarding of memory in RAM blocks, ie. pages pinned in the host are
> +     * in the current working set of the guest driver and therefore never
> +     * overlap e.g., with pages available to the guest balloon driver.  This
> +     * needs to be set before vfio_get_device() for vfio common to handle
> +     * ram_block_discard_disable().
>        */
> -    vcdev->vdev.balloon_allowed = true;
> +    vcdev->vdev.ram_block_discard_allowed = true;
>   
>       if (vfio_get_device(group, vcdev->cdev.mdevid, &vcdev->vdev, errp)) {
>           goto out_err;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 0b3593b3c0..33357140b8 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -33,7 +33,6 @@
>   #include "qemu/error-report.h"
>   #include "qemu/main-loop.h"
>   #include "qemu/range.h"
> -#include "sysemu/balloon.h"
>   #include "sysemu/kvm.h"
>   #include "sysemu/reset.h"
>   #include "trace.h"
> @@ -1215,31 +1214,36 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>       space = vfio_get_address_space(as);
>   
>       /*
> -     * VFIO is currently incompatible with memory ballooning insofar as the
> +     * VFIO is currently incompatible with discarding of RAM insofar as the
>        * madvise to purge (zap) the page from QEMU's address space does not
>        * interact with the memory API and therefore leaves stale virtual to
>        * physical mappings in the IOMMU if the page was previously pinned.  We
> -     * therefore add a balloon inhibit for each group added to a container,
> +     * therefore set discarding broken for each group added to a container,
>        * whether the container is used individually or shared.  This provides
>        * us with options to allow devices within a group to opt-in and allow
> -     * ballooning, so long as it is done consistently for a group (for instance
> +     * discarding, so long as it is done consistently for a group (for instance
>        * if the device is an mdev device where it is known that the host vendor
>        * driver will never pin pages outside of the working set of the guest
> -     * driver, which would thus not be ballooning candidates).
> +     * driver, which would thus not be discarding candidates).
>        *
>        * The first opportunity to induce pinning occurs here where we attempt to
>        * attach the group to existing containers within the AddressSpace.  If any
> -     * pages are already zapped from the virtual address space, such as from a
> -     * previous ballooning opt-in, new pinning will cause valid mappings to be
> +     * pages are already zapped from the virtual address space, such as from
> +     * previous discards, new pinning will cause valid mappings to be
>        * re-established.  Likewise, when the overall MemoryListener for a new
>        * container is registered, a replay of mappings within the AddressSpace
>        * will occur, re-establishing any previously zapped pages as well.
>        *
> -     * NB. Balloon inhibiting does not currently block operation of the
> -     * balloon driver or revoke previously pinned pages, it only prevents
> -     * calling madvise to modify the virtual mapping of ballooned pages.
> +     * Especially virtio-balloon is currently only prevented from discarding
> +     * new memory, it will not yet set ram_block_discard_set_required() and
> +     * therefore, neither stops us here or deals with the sudden memory
> +     * consumption of inflated memory.
>        */
> -    qemu_balloon_inhibit(true);
> +    ret = ram_block_discard_disable(true);
> +    if (ret) {
> +        error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
> +        return ret;
> +    }
>   
>       QLIST_FOREACH(container, &space->containers, next) {
>           if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
> @@ -1405,7 +1409,7 @@ close_fd_exit:
>       close(fd);
>   
>   put_space_exit:
> -    qemu_balloon_inhibit(false);
> +    ram_block_discard_disable(false);
>       vfio_put_address_space(space);
>   
>       return ret;
> @@ -1526,8 +1530,8 @@ void vfio_put_group(VFIOGroup *group)
>           return;
>       }
>   
> -    if (!group->balloon_allowed) {
> -        qemu_balloon_inhibit(false);
> +    if (!group->ram_block_discard_allowed) {
> +        ram_block_discard_disable(false);
>       }
>       vfio_kvm_device_del_group(group);
>       vfio_disconnect_container(group);
> @@ -1565,22 +1569,23 @@ int vfio_get_device(VFIOGroup *group, const char *name,
>       }
>   
>       /*
> -     * Clear the balloon inhibitor for this group if the driver knows the
> -     * device operates compatibly with ballooning.  Setting must be consistent
> -     * per group, but since compatibility is really only possible with mdev
> -     * currently, we expect singleton groups.
> +     * Set discarding of RAM as not broken for this group if the driver knows
> +     * the device operates compatibly with discarding.  Setting must be
> +     * consistent per group, but since compatibility is really only possible
> +     * with mdev currently, we expect singleton groups.
>        */
> -    if (vbasedev->balloon_allowed != group->balloon_allowed) {
> +    if (vbasedev->ram_block_discard_allowed !=
> +        group->ram_block_discard_allowed) {
>           if (!QLIST_EMPTY(&group->device_list)) {
> -            error_setg(errp,
> -                       "Inconsistent device balloon setting within group");
> +            error_setg(errp, "Inconsistent setting of support for discarding "
> +                       "RAM (e.g., balloon) within group");
>               close(fd);
>               return -1;
>           }
>   
> -        if (!group->balloon_allowed) {
> -            group->balloon_allowed = true;
> -            qemu_balloon_inhibit(false);
> +        if (!group->ram_block_discard_allowed) {
> +            group->ram_block_discard_allowed = true;
> +            ram_block_discard_disable(false);
>           }
>       }
>   
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 342dd6b912..c33c11b7e4 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2796,7 +2796,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>       }
>   
>       /*
> -     * Mediated devices *might* operate compatibly with memory ballooning, but
> +     * Mediated devices *might* operate compatibly with discarding of RAM, but
>        * we cannot know for certain, it depends on whether the mdev vendor driver
>        * stays in sync with the active working set of the guest driver.  Prevent
>        * the x-balloon-allowed option unless this is minimally an mdev device.
> @@ -2809,7 +2809,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>   
>       trace_vfio_mdev(vdev->vbasedev.name, is_mdev);
>   
> -    if (vdev->vbasedev.balloon_allowed && !is_mdev) {
> +    if (vdev->vbasedev.ram_block_discard_allowed && !is_mdev) {
>           error_setg(errp, "x-balloon-allowed only potentially compatible "
>                      "with mdev devices");

Should this error message be changed?

>           vfio_put_group(group);
> @@ -3163,7 +3163,7 @@ static Property vfio_pci_dev_properties[] = {
>                       VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
>       DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
>       DEFINE_PROP_BOOL("x-balloon-allowed", VFIOPCIDevice,
> -                     vbasedev.balloon_allowed, false),
> +                     vbasedev.ram_block_discard_allowed, false),
>       DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
>       DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
>       DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index fd564209ac..c78f3ff559 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -108,7 +108,7 @@ typedef struct VFIODevice {
>       bool reset_works;
>       bool needs_reset;
>       bool no_mmap;
> -    bool balloon_allowed;
> +    bool ram_block_discard_allowed;
>       VFIODeviceOps *ops;
>       unsigned int num_irqs;
>       unsigned int num_regions;
> @@ -128,7 +128,7 @@ typedef struct VFIOGroup {
>       QLIST_HEAD(, VFIODevice) device_list;
>       QLIST_ENTRY(VFIOGroup) next;
>       QLIST_ENTRY(VFIOGroup) container_next;
> -    bool balloon_allowed;
> +    bool ram_block_discard_allowed;
>   } VFIOGroup;
>   
>   typedef struct VFIODMABuf {



  reply	other threads:[~2020-06-10 13:06 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-10 11:53 [PATCH v4 00/21] virtio-mem: Paravirtualized memory hot(un)plug David Hildenbrand
2020-06-10 11:53 ` [PATCH v4 01/21] exec: Introduce ram_block_discard_(disable|require)() David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 02/21] vfio: Convert to ram_block_discard_disable() David Hildenbrand
2020-06-10 13:04   ` Tony Krowiak [this message]
2020-06-10 14:13     ` David Hildenbrand
2020-06-16 11:15   ` Cornelia Huck
2020-06-10 11:54 ` [PATCH v4 03/21] accel/kvm: " David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 04/21] s390x/pv: " David Hildenbrand
2020-06-16 11:17   ` Cornelia Huck
2020-06-10 11:54 ` [PATCH v4 05/21] virtio-balloon: Rip out qemu_balloon_inhibit() David Hildenbrand
2020-06-16 10:56   ` Dr. David Alan Gilbert
2020-06-24 15:32   ` Michael S. Tsirkin
2020-06-10 11:54 ` [PATCH v4 06/21] target/i386: sev: Use ram_block_discard_disable() David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 07/21] migration/rdma: " David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 08/21] migration/colo: " David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 09/21] linux-headers: update to contain virtio-mem David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 10/21] virtio-mem: Paravirtualized memory hot(un)plug David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 11/21] virtio-pci: Proxy for virtio-mem David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 12/21] MAINTAINERS: Add myself as virtio-mem maintainer David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 13/21] hmp: Handle virtio-mem when printing memory device info David Hildenbrand
2020-06-17 17:53   ` Dr. David Alan Gilbert
2020-06-17 17:54   ` Dr. David Alan Gilbert
2020-06-10 11:54 ` [PATCH v4 14/21] numa: Handle virtio-mem in NUMA stats David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 15/21] pc: Support for virtio-mem-pci David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 16/21] virtio-mem: Allow notifiers for size changes David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 17/21] virtio-pci: Send qapi events when the virtio-mem " David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 18/21] virtio-mem: Migration sanity checks David Hildenbrand
2020-06-17 17:59   ` Dr. David Alan Gilbert
2020-06-18 10:39     ` David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 19/21] virtio-mem: Add trace events David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 20/21] virtio-mem: Exclude unplugged memory during migration David Hildenbrand
2020-06-10 11:54 ` [PATCH v4 21/21] numa: Auto-enable NUMA when any memory devices are possible David Hildenbrand
2020-06-24 15:33 ` [PATCH v4 00/21] virtio-mem: Paravirtualized memory hot(un)plug Michael S. Tsirkin
2020-06-24 15:40   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8c71bfda-e958-56f8-ddaf-6a831fff2bc6@linux.ibm.com \
    --to=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=pmorel@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-s390x@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).