From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
qemu-devel@nongnu.org, Greg Kurz <groug@kaod.org>,
Alex Williamson <alex.williamson@redhat.com>,
Murilo Opsfelder Araujo <muriloo@linux.ibm.com>,
Paolo Bonzini <pbonzini@redhat.com>, Stefan Weil <sw@weilnetz.de>,
Richard Henderson <rth@twiddle.net>
Subject: Re: [PATCH v1 00/13] Ram blocks with resizable anonymous allocations under POSIX
Date: Fri, 7 Feb 2020 15:28:47 +0000
Message-ID: <20200207152847.GG3302@work-vm>
In-Reply-To: <13585E49-B84C-41D8-8825-F96841F475D0@redhat.com>

* David Hildenbrand (david@redhat.com) wrote:
>
>
> > On 06.02.2020 at 21:11, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >
> > * David Hildenbrand (david@redhat.com) wrote:
> >> We already allow resizable ram blocks for anonymous memory; however, they
> >> are not actually resized. All memory is mmap()ed R/W, including the memory
> >> exceeding the used_length, up to the max_length.
> >>
> >> When resizing, effectively only the boundary is moved. Implement actually
> >> resizable anonymous allocations and make use of them in resizable ram
> >> blocks when possible. Memory exceeding the used_length will be
> >> inaccessible. Especially ram block notifiers require care.
> >>
> >> Having actually resizable anonymous allocations (via mmap-hackery) allows
> >> reserving a big region in virtual address space and growing the
> >> accessible/usable part on demand. Even if "/proc/sys/vm/overcommit_memory"
> >> is set to "never" under Linux, huge reservations will succeed. If there is
> >> not enough memory when resizing (to populate parts of the reserved region),
> >> trying to resize will fail. Only the actually used size is reserved in the
> >> OS.
> >>
> >> E.g., virtio-mem [1] wants to reserve big resizable memory regions and
> >> grow the usable part on demand. I think this change is worth sending out
> >> individually. Accompanied by a bunch of minor fixes and cleanups.
> >>
> >> [1] https://lore.kernel.org/kvm/20191212171137.13872-1-david@redhat.com/
> >
> > There are a few bits I've not understood from skimming the patches:
> >
>
> Thanks for having a look!
>
> > a) Am I correct in thinking you PROT_NONE the extra space so you can
> > grow/shrink it?
>
> Yes!
>
> > b) What does kvm see - does it have a slot for the whole space or for
> > only the used space?
>
> Only the used space. Resizing triggers a resize of the memory region. That triggers memory notifiers, which remove the old kvm memslot and re-add the new kvm memslot. (That's existing handling, so nothing new.)
>
> So KVM will not see PROT_NONE when creating a slot.

OK, that's easy then.

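For anyone following along, the reserve-then-grow idea discussed above can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the series: region_reserve() and region_resize() are hypothetical names, the flags are a plausible Linux choice rather than what the patches necessarily use, and all sizes are assumed to be multiples of the page size.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve max_size bytes of virtual address space.
 * The range is PROT_NONE, so any access faults until it is populated.
 */
static void *region_reserve(size_t max_size)
{
    void *p = mmap(NULL, max_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: move the usable boundary from old_size to new_size.
 * Growing maps fresh R/W anonymous memory over the reserved tail;
 * shrinking replaces the tail with an inaccessible reservation again.
 * Returns 0 on success, -1 on failure (e.g. not enough memory to grow).
 */
static int region_resize(void *base, size_t old_size, size_t new_size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *start = (char *)base;
    void *p;

    assert(base != NULL);
    assert(old_size % page == 0 && new_size % page == 0);

    if (new_size > old_size) {
        p = mmap(start + old_size, new_size - old_size,
                 PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        return p == MAP_FAILED ? -1 : 0;
    }
    if (new_size < old_size) {
        p = mmap(start + new_size, old_size - new_size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
                 -1, 0);
        return p == MAP_FAILED ? -1 : 0;
    }
    return 0;
}
```

In this sketch the base address never changes across a resize, and only the range below the current boundary is ever accessible, which is why a kvm slot covering just the used space would not see PROT_NONE.
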
> > I ask because we found with virtiofs/DAX experiments that on Power,
> > kvm gets upset if you give it a mapping with PROT_NONE.
> > (That may be less of an issue if you change the mapping after the
> > slot is created).
>
> That should work as expected. Resizing *while* kvm is running is tricky, but that's not part of this series and a different story :) Right now, resizing is only valid on reboot/incoming migration.

Hmm, 'when' exactly during an incoming migration? I ask because of the
userfaultfd setup for postcopy. Also note that those things can combine,
i.e. a reboot that happens during a migration (we've already got a pile
of related bugs).

> >
> > c) It's interesting this is keyed off the RAMBlock notifiers - do
> > memory listeners on the address space the block is mapped into get
> > triggered? I'm wondering how vhost (and vhost-user) in particular
> > see this.
>
> Yes, memory listeners get triggered. Old region is removed, new one is added. Nothing changed on that front.
>
> The issue with ram block notifiers is that they did not do a "remove old, add new" on resizes. They only added the full ram block. Bad. E.g., vfio wants to pin all memory - which would fail on PROT_NONE.
>
> E.g., for HAX, there is no kernel ioctl to remove a ram block ... for SEV there is, but I am not sure about the implications when converting back and forth between encrypted/unencrypted. So SEV and HAX require legacy handling.

I guess for a memory listener it just sees a new layout after the commit
and then can figure out what changed.

Dave

> Cheers!
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK