From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: Eduardo Habkost <ehabkost@redhat.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
Stefan Weil <sw@weilnetz.de>,
David Hildenbrand <david@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
Igor Mammedov <imammedo@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <rth@twiddle.net>
Subject: [PATCH v2 fixed 16/16] exec: Ram blocks with resizable anonymous allocations under POSIX
Date: Wed, 12 Feb 2020 14:42:54 +0100
Message-ID: <20200212134254.11073-17-david@redhat.com>
In-Reply-To: <20200212134254.11073-1-david@redhat.com>
We can now use resizable anonymous allocations to implement actually
resizable ram blocks. Resizable anonymous allocations are not implemented
under WIN32 yet and are not available when using alternative allocators;
in those cases, fall back to the existing handling.

We also have to fall back to the existing handling if any ram block
notifier does not support resizing yet (esp. AMD SEV, HAX). Remember
in RAM_RESIZEABLE_ALLOC whether we are using resizable anonymous
allocations.

As the mmap()-hackery will invalidate some madvise settings, we have to
re-apply them after resizing. After resizing, notify the ram block
notifiers.

Try to grow early, as that can easily fail if out of memory. Shrink late
and ignore errors (nothing will actually break); only warn.
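
For illustration, the grow-early/shrink-late ordering can be sketched as
below. This is a simplified model and not the code in this patch: struct
block, resize_fn, backend_ok() and backend_fail() are hypothetical names
that stand in for RAMBlock and qemu_anon_ram_resize().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a ram block: only the two sizes matter here. */
struct block {
    size_t used_length;
    size_t max_length;
};

/* Hypothetical backend hook: returns true if the allocation was resized. */
typedef bool (*resize_fn)(struct block *b, size_t old_size, size_t new_size);

static bool backend_ok(struct block *b, size_t old_size, size_t new_size)
{
    (void)b; (void)old_size; (void)new_size;
    return true;
}

static bool backend_fail(struct block *b, size_t old_size, size_t new_size)
{
    (void)b; (void)old_size; (void)new_size;
    return false;
}

static int block_resize(struct block *b, size_t new_size, resize_fn backend)
{
    assert(b && backend);
    const size_t old_size = b->used_length;

    if (new_size > b->max_length) {
        return -EINVAL;
    }
    if (new_size == old_size) {
        return 0;
    }

    /* Grow early: this can fail, and nothing has been changed yet. */
    if (new_size > old_size && !backend(b, old_size, new_size)) {
        return -ENOMEM;
    }

    b->used_length = new_size;
    /* ... users of the block would be notified here ... */

    /* Shrink late: a failure only wastes memory, so warn and carry on. */
    if (new_size < old_size && !backend(b, old_size, new_size)) {
        fprintf(stderr, "warning: shrinking memory allocation failed\n");
    }
    return 0;
}
```

A failed grow leaves used_length untouched and is reported to the caller,
while a failed shrink still updates used_length and returns success.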

The benefit of actually resizable ram blocks is that, e.g., under Linux,
only the actual size will be reserved (even if
"/proc/sys/vm/overcommit_memory" is set to "never"). Additional memory will
be reserved when trying to resize, which allows ram blocks that start small
but can theoretically grow very large.
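
The reserve-then-populate idea behind this can be sketched with plain
mmap() as below. This is a minimal sketch and not the implementation from
this series: resizable_alloc(), resizable_resize() and page_align() are
hypothetical helpers, and it assumes that the host does not charge a
PROT_NONE private anonymous mapping against its commit limit, which is my
understanding of Linux but is not verified here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_NORESERVE
#define MAP_NORESERVE 0 /* not available everywhere; treat as optional */
#endif

/* Round n up to a multiple of the host page size. */
static size_t page_align(size_t n)
{
    const size_t ps = (size_t)sysconf(_SC_PAGESIZE);
    return (n + ps - 1) / ps * ps;
}

/* Reserve max_size of address space, but make only the first size usable. */
static void *resizable_alloc(size_t size, size_t max_size)
{
    size = page_align(size);
    max_size = page_align(max_size);
    if (size > max_size || max_size == 0) {
        return NULL;
    }

    /* Inaccessible reservation: address space only, no usable memory. */
    void *base = mmap(NULL, max_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }
    /* Populate the initial part by mapping over the reservation. */
    if (size && mmap(base, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
                     -1, 0) == MAP_FAILED) {
        munmap(base, max_size);
        return NULL;
    }
    return base;
}

/* Grow or shrink the usable part in place; returns 0 on success. */
static int resizable_resize(void *base, size_t old_size, size_t new_size)
{
    assert(base != NULL);
    char *p = base;
    void *ret = base;

    old_size = page_align(old_size);
    new_size = page_align(new_size);
    if (new_size > old_size) {
        /* Growing needs new memory and can fail (e.g., with ENOMEM). */
        ret = mmap(p + old_size, new_size - old_size,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    } else if (new_size < old_size) {
        /* Shrinking turns the tail back into an inaccessible reservation. */
        ret = mmap(p + new_size, old_size - new_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_NORESERVE,
                   -1, 0);
    }
    return ret == MAP_FAILED ? -1 : 0;
}
```

Mapping over part of the reservation with MAP_FIXED replaces only that
range, so data in the untouched part of the block survives a resize. The
fresh mappings do not inherit earlier madvise() settings, which is why such
settings have to be re-applied after resizing.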

Note1: We are not able to create resizable ram blocks with pre-allocated
       memory yet, so prealloc is not affected.
Note2: mlock should work as it used to, as os_mlock() does a
       mlockall(MCL_CURRENT | MCL_FUTURE), which includes future
       mappings.
Note3: Nobody should access memory beyond used_length. Memory notifiers
       already properly take care of this; only ram block notifiers
       violate this constraint and, therefore, have to be special-cased.
       In particular, any ram block notifier that might dynamically
       register at runtime (e.g., vfio) has to support resizes. Add an
       assert for that. Both HAX and SEV register early, so they are
       fine.
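
The notifier rule from Note3 can be modelled as below. This is only a
sketch under assumed names: struct notifier, notifiers_support_resize()
and may_register() are hypothetical and merely mirror the intent of
ram_block_notifiers_support_resize() and the new assert in
ram_block_notify_add_single().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical notifier: the "resized" callback is optional. */
struct notifier {
    void (*added)(void *host, size_t size, size_t max_size);
    void (*resized)(void *host, size_t old_size, size_t new_size);
    struct notifier *next;
};

static void noop_added(void *host, size_t size, size_t max_size)
{
    (void)host; (void)size; (void)max_size;
}

static void noop_resized(void *host, size_t old_size, size_t new_size)
{
    (void)host; (void)old_size; (void)new_size;
}

/* Allocation time: use a resizable allocation only if all can cope. */
static bool notifiers_support_resize(const struct notifier *head)
{
    for (const struct notifier *n = head; n; n = n->next) {
        if (!n->resized) {
            return false;
        }
    }
    return true;
}

/* Registration time: a late notifier without "resized" is a bug once a
 * block with an actually resizable allocation exists. */
static bool may_register(const struct notifier *n, bool resizable_alloc)
{
    assert(n != NULL);
    return !resizable_alloc || n->resized != NULL;
}
```

Notifiers that register early, before any ram block is allocated, make the
first check fail and force the fallback; only notifiers that register late
can trip the second check.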
Cc: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Stefan Weil <sw@weilnetz.de>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
exec.c | 60 ++++++++++++++++++++++++++++++++++++---
hw/core/numa.c | 7 +++++
include/exec/cpu-common.h | 2 ++
include/exec/memory.h | 8 ++++++
4 files changed, 73 insertions(+), 4 deletions(-)
diff --git a/exec.c b/exec.c
index f2d30479b8..71e32dcc11 100644
--- a/exec.c
+++ b/exec.c
@@ -2053,6 +2053,16 @@ void qemu_ram_unset_migratable(RAMBlock *rb)
rb->flags &= ~RAM_MIGRATABLE;
}
+bool qemu_ram_is_resizable(RAMBlock *rb)
+{
+ return rb->flags & RAM_RESIZEABLE;
+}
+
+bool qemu_ram_is_resizable_alloc(RAMBlock *rb)
+{
+ return rb->flags & RAM_RESIZEABLE_ALLOC;
+}
+
/* Called with iothread lock held. */
void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
{
@@ -2139,6 +2149,7 @@ static void qemu_ram_apply_settings(void *host, size_t length)
*/
int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
{
+ const bool shared = block->flags & RAM_SHARED;
const ram_addr_t oldsize = block->used_length;
assert(block);
@@ -2149,7 +2160,7 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
return 0;
}
- if (!(block->flags & RAM_RESIZEABLE)) {
+ if (!qemu_ram_is_resizable(block)) {
error_setg_errno(errp, EINVAL,
"Length mismatch: %s: 0x" RAM_ADDR_FMT
" in != 0x" RAM_ADDR_FMT, block->idstr,
@@ -2165,6 +2176,12 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
return -EINVAL;
}
+ if (oldsize < newsize && qemu_ram_is_resizable_alloc(block) &&
+ !qemu_anon_ram_resize(block->host, oldsize, newsize, shared)) {
+ error_setg_errno(errp, ENOMEM, "Cannot allocate enough memory.");
+ return -ENOMEM;
+ }
+
cpu_physical_memory_clear_dirty_range(block->offset, block->used_length);
block->used_length = newsize;
cpu_physical_memory_set_dirty_range(block->offset, block->used_length,
@@ -2178,6 +2195,21 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
if (block->resized) {
block->resized(block->idstr, newsize, block->host);
}
+
+ /*
+ * Shrinking will only fail in rare scenarios (e.g., maximum number of
+ * mappings reached), and can be ignored. Warn only.
+ */
+ if (newsize < oldsize && qemu_ram_is_resizable_alloc(block) &&
+ !qemu_anon_ram_resize(block->host, oldsize, newsize, shared)) {
+ warn_report("Shrinking memory allocation failed.");
+ }
+
+ if (block->host && qemu_ram_is_resizable_alloc(block)) {
+ /* re-apply settings that might have been overridden by the resize */
+ qemu_ram_apply_settings(block->host, block->max_length);
+ }
+
return 0;
}
@@ -2256,6 +2288,28 @@ static void dirty_memory_extend(ram_addr_t old_ram_size,
}
}
+static void ram_block_alloc_ram(RAMBlock *rb)
+{
+ const bool shared = qemu_ram_is_shared(rb);
+
+ /*
+ * If we can, try to allocate actually resizable ram. Will also fail
+ * if qemu_anon_ram_alloc_resizable() is not implemented.
+ */
+ if (phys_mem_alloc == qemu_anon_ram_alloc &&
+ qemu_ram_is_resizable(rb) &&
+ ram_block_notifiers_support_resize()) {
+ rb->host = qemu_anon_ram_alloc_resizable(rb->used_length,
+ rb->max_length, &rb->mr->align,
+ shared);
+ if (rb->host) {
+ rb->flags |= RAM_RESIZEABLE_ALLOC;
+ return;
+ }
+ }
+ rb->host = phys_mem_alloc(rb->max_length, &rb->mr->align, shared);
+}
+
static void ram_block_add(RAMBlock *new_block, Error **errp)
{
RAMBlock *block;
@@ -2278,9 +2332,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
return;
}
} else {
- new_block->host = phys_mem_alloc(new_block->max_length,
- &new_block->mr->align,
- qemu_ram_is_shared(new_block));
+ ram_block_alloc_ram(new_block);
if (!new_block->host) {
error_setg_errno(errp, errno,
"cannot set up guest memory '%s'",
diff --git a/hw/core/numa.c b/hw/core/numa.c
index 5b20dc726d..601cf9f603 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -907,6 +907,13 @@ static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
RAMBlockNotifier *notifier = opaque;
if (host) {
+ /*
+ * Dynamically adding notifiers that don't support resizes is forbidden
+ * when dealing with resizable ram blocks that have actually resizable
+ * allocations.
+ */
+ g_assert(!qemu_ram_is_resizable_alloc(rb) ||
+ notifier->ram_block_resized);
notifier->ram_block_added(notifier, host, size, max_size);
}
return 0;
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 9760ac9068..a9c76bd5ef 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -66,6 +66,8 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb);
bool qemu_ram_is_migratable(RAMBlock *rb);
void qemu_ram_set_migratable(RAMBlock *rb);
void qemu_ram_unset_migratable(RAMBlock *rb);
+bool qemu_ram_is_resizable(RAMBlock *rb);
+bool qemu_ram_is_resizable_alloc(RAMBlock *rb);
size_t qemu_ram_pagesize(RAMBlock *block);
size_t qemu_ram_pagesize_largest(void);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index e85b7de99a..19417943a2 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -129,6 +129,14 @@ typedef struct IOMMUNotifier IOMMUNotifier;
/* RAM is a persistent kind memory */
#define RAM_PMEM (1 << 5)
+/*
+ * Implies RAM_RESIZEABLE. Memory beyond the used_length is inaccessible
+ * (esp. initially and after resizing). For such memory blocks, only the
+ * used_length is reserved in the OS - resizing might fail. Will only be
+ * used with host OS support and if all ram block notifiers support resizing.
+ */
+#define RAM_RESIZEABLE_ALLOC (1 << 6)
+
static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
IOMMUNotifierFlag flags,
hwaddr start, hwaddr end,
--
2.24.1