qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org, Greg Kurz <groug@kaod.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Murilo Opsfelder Araujo <muriloo@linux.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Stefan Weil <sw@weilnetz.de>,
	Richard Henderson <rth@twiddle.net>
Subject: Re: [PATCH v1 00/13] Ram blocks with resizable anonymous allocations under POSIX
Date: Fri, 7 Feb 2020 15:28:47 +0000	[thread overview]
Message-ID: <20200207152847.GG3302@work-vm> (raw)
In-Reply-To: <13585E49-B84C-41D8-8825-F96841F475D0@redhat.com>

* David Hildenbrand (david@redhat.com) wrote:
> 
> 
> > Am 06.02.2020 um 21:11 schrieb Dr. David Alan Gilbert <dgilbert@redhat.com>:
> > 
> > * David Hildenbrand (david@redhat.com) wrote:
> >> We already allow resizable ram blocks for anonymous memory, however, they
> >> are not actually resized. All memory is mmaped() R/W, including the memory
> >> exceeding the used_length, up to the max_length.
> >> 
> >> When resizing, effectively only the boundary is moved. Implement actually
> >> resizable anonymous allocations and make use of them in resizable ram
> >> blocks when possible. Memory exceeding the used_length will be
> >> inaccessible. Especially ram block notifiers require care.
> >> 
> >> Having actually resizable anonymous allocations (via mmap-hackery) allows
> >> to reserve a big region in virtual address space and grow the
> >> accessible/usable part on demand. Even if "/proc/sys/vm/overcommit_memory"
> >> is set to "never" under Linux, huge reservations will succeed. If there is
> >> not enough memory when resizing (to populate parts of the reserved region),
> >> trying to resize will fail. Only the actually used size is reserved in the
> >> OS.
> >> 
> >> E.g., virtio-mem [1] wants to reserve big resizable memory regions and
> >> grow the usable part on demand. I think this change is worth sending out
> >> individually. Accompanied by a bunch of minor fixes and cleanups.
> >> 
> >> [1] https://lore.kernel.org/kvm/20191212171137.13872-1-david@redhat.com/
> > 
> > There's a few bits I've not understood from skimming the patches:
> > 
> 
> Thanks for having a look!
> 
> >  a) Am I correct in thinking you PROT_NONE the extra space so you can
> > gkrow/shrink it?
> 
> Yes!
> 
> >  b) What does kvm see - does it have a slot for the whole space or for
> > only the used space?
> 
> Only the used space. Resizing triggers a resize of the memory region. That triggers memory notifiers, which remove the old kvm memslot and re-add the new kvm memslot. (That‘s existing handling, so nothing new).
> 
> So KVM will not see PROT_NONE when creating a slot.

OK, that's easy then.

> >     I ask because we found with virtiofs/DAX experiments that on Power,
> > kvm gets upset if you give it a mapping with PROT_NONE.
> >     (That maybe less of an issue if you change the mapping after the
> > slot is created).
> 
> That should work as expected. Resizing *while* kvm is running is tricky, but that‘s not part of this series and a different story :) right now, resizing is only valid on reboot/incoming migration.

Hmm 'when' during an incoming migration; I ask because of userfaultfd
setup for postcopy.  Also note those things can combine - i.e. a reboot
that happens during a migration (we've already got a pile of related
bugs).

> > 
> >  c) It's interesting this is keyed off the RAMBlock notifiers - do
> >     memory_listener's on the address space the block is mapped into get
> >    triggered?  I'm wondering how vhost (and vhost-user) in particular
> >    see this.
> 
> Yes, memory listeners get triggered. Old region is removed, new one is added. Nothing changed on that front.
> 
> The issue with ram block notifiers is that they did not do a „remove old, add new“ on resizes. They only added the full ram block. Bad. E.g., vfio wants to pin all memory - which would fail on PROT_NONE.
> 
> E.g., for HAX, there is no kernel ioctl to remove a ram block ... for SEV there is, but I am not sure about the implications when converting back and forth between encrypted/unencrypted. So SEV and HAX require legacy handling.

I guess for a memory listener it just sees a new layout after the commit
and then can figure out what changed.

Dave

> Cheers!
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



  reply	other threads:[~2020-02-07 15:29 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-03 18:31 [PATCH v1 00/13] Ram blocks with resizable anonymous allocations under POSIX David Hildenbrand
2020-02-03 18:31 ` [PATCH v1 01/13] util: vfio-helpers: Factor out and fix processing of existings ram blocks David Hildenbrand
2020-02-03 18:31 ` [PATCH v1 02/13] exec: Factor out setting ram settings (madvise ...) into qemu_ram_apply_settings() David Hildenbrand
2020-02-06 11:42   ` Richard Henderson
2020-02-03 18:31 ` [PATCH v1 03/13] exec: Reuse qemu_ram_apply_settings() in qemu_ram_remap() David Hildenbrand
2020-02-06 11:43   ` Richard Henderson
2020-02-03 18:31 ` [PATCH v1 04/13] exec: Drop "shared" parameter from ram_block_add() David Hildenbrand
2020-02-06 11:44   ` Richard Henderson
2020-02-03 18:31 ` [PATCH v1 05/13] util/mmap-alloc: Factor out calculation of pagesize to mmap_pagesize() David Hildenbrand
2020-02-05 19:37   ` Murilo Opsfelder Araújo
2020-02-06 11:46   ` Richard Henderson
2020-02-03 18:31 ` [PATCH v1 06/13] util/mmap-alloc: Factor out reserving of a memory region to mmap_reserve() David Hildenbrand
2020-02-05 19:40   ` Murilo Opsfelder Araújo
2020-02-06 11:55   ` Richard Henderson
2020-02-06 13:16     ` David Hildenbrand
2020-02-03 18:31 ` [PATCH v1 07/13] util/mmap-alloc: Factor out populating of memory to mmap_populate() David Hildenbrand
2020-02-05 19:56   ` Murilo Opsfelder Araújo
2020-02-06  9:26     ` David Hildenbrand
2020-02-06 11:59   ` Richard Henderson
2020-02-03 18:31 ` [PATCH v1 08/13] util/mmap-alloc: Prepare for resizable mmaps David Hildenbrand
2020-02-05 23:00   ` Murilo Opsfelder Araújo
2020-02-06  8:52     ` David Hildenbrand
2020-02-06 12:31       ` Murilo Opsfelder Araújo
2020-02-06 13:16         ` David Hildenbrand
2020-02-06 15:13     ` David Hildenbrand
2020-02-06 12:02   ` Richard Henderson
2020-02-03 18:31 ` [PATCH v1 09/13] util/mmap-alloc: Implement " David Hildenbrand
2020-02-06 12:08   ` Richard Henderson
2020-02-06 13:22     ` David Hildenbrand
2020-02-06 15:27       ` David Hildenbrand
2020-02-07  0:29   ` Murilo Opsfelder Araújo
2020-02-10  9:39     ` David Hildenbrand
2020-02-03 18:31 ` [PATCH v1 10/13] numa: Introduce ram_block_notify_resized() and ram_block_notifiers_support_resize() David Hildenbrand
2020-02-03 18:31 ` [PATCH v1 11/13] util: vfio-helpers: Implement ram_block_resized() David Hildenbrand
2020-02-10 13:41   ` David Hildenbrand
2020-02-03 18:31 ` [PATCH v1 12/13] util: oslib: Resizable anonymous allocations under POSIX David Hildenbrand
2020-02-03 18:31 ` [PATCH v1 13/13] exec: Ram blocks with resizable " David Hildenbrand
2020-02-10 10:12   ` David Hildenbrand
2020-02-06  9:27 ` [PATCH v1 00/13] " Michael S. Tsirkin
2020-02-06  9:45   ` David Hildenbrand
2020-02-06 20:11 ` Dr. David Alan Gilbert
2020-02-06 20:31   ` David Hildenbrand
2020-02-07 15:28     ` Dr. David Alan Gilbert [this message]
2020-02-10  9:47       ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200207152847.GG3302@work-vm \
    --to=dgilbert@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=armbru@redhat.com \
    --cc=david@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=groug@kaod.org \
    --cc=mst@redhat.com \
    --cc=muriloo@linux.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=sw@weilnetz.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).