qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michal Prívozník" <mprivozn@redhat.com>
To: David Hildenbrand <david@redhat.com>, qemu-devel@nongnu.org
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	Gavin Shan <gshan@redhat.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Richard Henderson <richard.henderson@linaro.org>,
	Sebastien Boeuf <sebastien.boeuf@intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Hui Zhu <teawater@gmail.com>
Subject: Re: [PATCH v1 2/3] virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
Date: Tue, 30 Nov 2021 16:34:12 +0100	[thread overview]
Message-ID: <ebafe902-b345-1b07-e77f-56289816a542@redhat.com> (raw)
In-Reply-To: <20211130092838.24224-3-david@redhat.com>

On 11/30/21 10:28, David Hildenbrand wrote:
> With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we signal the VM that reading
> unplugged memory is not supported. We have to fail feature negotiation
> in case the guest does not support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.
> 
> First, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE is required to properly handle
> memory backends (or architectures) without support for the shared zeropage
> in the hypervisor cleanly. Without the shared zeropage, even reading an
> unpopulated virtual memory location can populate real memory and
> consequently consume memory in the hypervisor. We have a guaranteed shared
> zeropage only on MAP_PRIVATE anonymous memory.
> 
> Second, we want VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE to be the default
> long-term as even populating the shared zeropage can be problematic: for
> example, without THP support (possible) or without support for the shared
> huge zeropage with THP (unlikely), the PTE page tables to hold the shared
> zeropage entries can consume quite some memory that cannot be reclaimed
> easily.
> 
> Third, there are other optimizations+features (e.g., protection of
> unplugged memory, reducing the total memory slot size and bitmap sizes)
> that will require VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.
> 
> We really only support x86 targets with virtio-mem for now (and
> Linux similarly only support x86), but that might change soon, so prepare
> for different targets already.
> 
> Add a new "unplugged-inaccessible" tristate property for x86 targets:
> - "off" will keep VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE unset and legacy
>   guests working.
> - "on" will set VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE and stop legacy guests
>   from using the device.
> - "auto" selects the default based on support for the shared zeropage.
> 
> Warn in case the property is set to "off" and we don't have support for the
> shared zeropage.
> 
> For existing compat machines, the property will default to "off", to
> not change the behavior but eventually warn about a problematic setup.
> Short-term, we'll set the property default to "auto" for new QEMU machines.
> Mid-term, we'll set the property default to "on" for new QEMU machines.
> Long-term, we'll deprecate the parameter and disallow legacy
> guests completely.
> 
> The property has to match on the migration source and destination. "auto"
> will result in the same VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE setting as long
> as the qemu command line (esp. memdev) match -- so "auto" is good enough
> for migration purposes and the parameter doesn't have to be migrated
> explicitly.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/virtio/virtio-mem.c         | 63 ++++++++++++++++++++++++++++++++++
>  include/hw/virtio/virtio-mem.h |  8 +++++
>  2 files changed, 71 insertions(+)
> 
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index d5a578142b..1e57156e81 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -32,6 +32,14 @@
>  #include CONFIG_DEVICES
>  #include "trace.h"
>  
> +/*
> + * We only had legacy x86 guests that did not support
> + * VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE. Other targets don't have legacy guests.
> + */
> +#if defined(TARGET_X86_64) || defined(TARGET_I386)
> +#define VIRTIO_MEM_HAS_LEGACY_GUESTS
> +#endif
> +
>  /*
>   * Let's not allow blocks smaller than 1 MiB, for example, to keep the tracking
>   * bitmap small.
> @@ -110,6 +118,19 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
>      return MAX(page_size, VIRTIO_MEM_MIN_BLOCK_SIZE);
>  }
>  
> +#if defined(VIRTIO_MEM_HAS_LEGACY_GUESTS)
> +static bool virtio_mem_has_shared_zeropage(RAMBlock *rb)
> +{
> +    /*
> +     * We only have a guaranteed shared zeropage on ordinary MAP_PRIVATE
> +     * anonymous RAM. In any other case, reading unplugged *can* populate a
> +     * fresh page, consuming actual memory.
> +     */
> +    return !qemu_ram_is_shared(rb) && rb->fd < 0 &&
> +           qemu_ram_pagesize(rb) == qemu_real_host_page_size;
> +}
> +#endif /* VIRTIO_MEM_HAS_LEGACY_GUESTS */
> +
>  /*
>   * Size the usable region bigger than the requested size if possible. Esp.
>   * Linux guests will only add (aligned) memory blocks in case they fully
> @@ -653,15 +674,29 @@ static uint64_t virtio_mem_get_features(VirtIODevice *vdev, uint64_t features,
>                                          Error **errp)
>  {
>      MachineState *ms = MACHINE(qdev_get_machine());
> +    VirtIOMEM *vmem = VIRTIO_MEM(vdev);
>  
>      if (ms->numa_state) {
>  #if defined(CONFIG_ACPI)
>          virtio_add_feature(&features, VIRTIO_MEM_F_ACPI_PXM);
>  #endif
>      }
> +    assert(vmem->unplugged_inaccessible != ON_OFF_AUTO_AUTO);
> +    if (vmem->unplugged_inaccessible == ON_OFF_AUTO_ON) {
> +        virtio_add_feature(&features, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE);
> +    }
>      return features;
>  }
>  
> +static int virtio_mem_validate_features(VirtIODevice *vdev)
> +{
> +    if (virtio_host_has_feature(vdev, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE) &&
> +        !virtio_vdev_has_feature(vdev, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE)) {
> +        return -EFAULT;
> +    }
> +    return 0;
> +}
> +
>  static void virtio_mem_system_reset(void *opaque)
>  {
>      VirtIOMEM *vmem = VIRTIO_MEM(opaque);
> @@ -716,6 +751,29 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
>      rb = vmem->memdev->mr.ram_block;
>      page_size = qemu_ram_pagesize(rb);
>  
> +#if defined(VIRTIO_MEM_HAS_LEGACY_GUESTS)
> +    switch (vmem->unplugged_inaccessible) {
> +    case ON_OFF_AUTO_AUTO:
> +        if (virtio_mem_has_shared_zeropage(rb)) {
> +            vmem->unplugged_inaccessible = ON_OFF_AUTO_OFF;
> +        } else {
> +            vmem->unplugged_inaccessible = ON_OFF_AUTO_ON;
> +        }
> +        break;
> +    case ON_OFF_AUTO_OFF:
> +        if (!virtio_mem_has_shared_zeropage(rb)) {
> +            warn_report("'%s' property set to 'off' with a memdev that does"
> +                        " not support the shared zeropage.",
> +                        VIRTIO_MEM_UNPLUGGED_INACCESSIBLE_PROP);
> +        }
> +        break;
> +    default:
> +        break;
> +    }
> +#else /* VIRTIO_MEM_HAS_LEGACY_GUESTS */
> +    vmem->unplugged_inaccessible = ON_OFF_AUTO_ON;
> +#endif /* VIRTIO_MEM_HAS_LEGACY_GUESTS */
> +
>      /*
>       * If the block size wasn't configured by the user, use a sane default. This
>       * allows using hugetlbfs backends of any page size without manual
> @@ -1109,6 +1167,10 @@ static Property virtio_mem_properties[] = {
>      DEFINE_PROP_UINT32(VIRTIO_MEM_NODE_PROP, VirtIOMEM, node, 0),
>      DEFINE_PROP_LINK(VIRTIO_MEM_MEMDEV_PROP, VirtIOMEM, memdev,
>                       TYPE_MEMORY_BACKEND, HostMemoryBackend *),
> +#if defined(VIRTIO_MEM_HAS_LEGACY_GUESTS)
> +    DEFINE_PROP_ON_OFF_AUTO(VIRTIO_MEM_UNPLUGGED_INACCESSIBLE_PROP, VirtIOMEM,
> +                            unplugged_inaccessible, ON_OFF_AUTO_OFF),

In theory, libvirt likes properties to be arch agnostic, but that boat
sailed away already. And honestly, the default is reasonable so I don't
think we will need to query for this capability.

> +#endif
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

Michal



  reply	other threads:[~2021-11-30 15:35 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-30  9:28 [PATCH v1 0/3] virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE David Hildenbrand
2021-11-30  9:28 ` [PATCH v1 1/3] linux-headers: sync VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE David Hildenbrand
2021-11-30 15:34   ` Michal Prívozník
2021-11-30 23:45   ` Michael S. Tsirkin
2021-11-30  9:28 ` [PATCH v1 2/3] virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE David Hildenbrand
2021-11-30 15:34   ` Michal Prívozník [this message]
2021-11-30  9:28 ` [PATCH v1 3/3] virtio-mem: Set "unplugged-inaccessible=auto" for the 6.2 machine on x86 David Hildenbrand
2021-11-30 15:34   ` Michal Prívozník
2021-11-30 15:53     ` David Hildenbrand
2021-12-08  8:11   ` Pankaj Gupta
2021-11-30 23:45 ` [PATCH v1 0/3] virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE Michael S. Tsirkin
2021-12-01  8:23   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ebafe902-b345-1b07-e77f-56289816a542@redhat.com \
    --to=mprivozn@redhat.com \
    --cc=david@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=gshan@redhat.com \
    --cc=mst@redhat.com \
    --cc=pankaj.gupta.linux@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=sebastien.boeuf@intel.com \
    --cc=teawater@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).