From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org, Peter Xu <peterx@redhat.com>,
Luiz Capitulino <lcapitulino@redhat.com>,
Auger Eric <eric.auger@redhat.com>,
Alex Williamson <alex.williamson@redhat.com>,
Wei Yang <richardw.yang@linux.intel.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Igor Mammedov <imammedo@redhat.com>
Subject: Re: [PATCH PROTOTYPE 0/6] virtio-mem: vfio support
Date: Tue, 29 Sep 2020 18:02:38 +0100
Message-ID: <20200929170238.GN2826@work-vm>
In-Reply-To: <20200924160423.106747-1-david@redhat.com>

* David Hildenbrand (david@redhat.com) wrote:
> This is a quick and dirty (1.5 days of hacking) prototype to make
> vfio and virtio-mem play together. The basic idea was the result of Alex
> brainstorming with me on how to tackle this.
>
> A virtio-mem device manages a memory region in guest physical address
> space, represented as a single (currently large) memory region in QEMU.
> Before the guest is allowed to use memory blocks, it must coordinate with
> the hypervisor (plug blocks). After a reboot, all memory is usually
> unplugged - when the guest comes up, it detects the virtio-mem device and
> selects memory blocks to plug (based on requests from the hypervisor).
>
> Memory hot(un)plug consists of (un)plugging memory blocks via a virtio-mem
> device (triggered by the guest). When unplugging blocks, we discard the
> memory. In contrast to memory ballooning, we always know which memory
> blocks a guest may use - especially during a reboot, after a crash, or
> after kexec.
>
> The issue with vfio is that it cannot deal with random discards - for this
> reason, virtio-mem and vfio are currently mutually exclusive.
> In particular, vfio would currently map the whole memory region (with
> possibly only few or no plugged blocks), resulting in all pages getting
> pinned and therefore in higher memory consumption than expected (rendering
> virtio-mem basically useless in these environments).
>
> To make vfio work nicely with virtio-mem, we have to map only the plugged
> blocks, and map/unmap properly when plugging/unplugging blocks (including
> discarding of RAM when unplugging). We achieve that by using a new notifier
> mechanism that communicates changes.
>
> It's important to map memory in the granularity in which we could see
> unmaps again (-> virtio-mem block size) - so when, e.g., plugging
> 100 MB of consecutive memory with a block size of 2 MB, we need 50 mappings.
> When unmapping, we can use a single vfio_unmap call for the applicable range.
> We expect that the block size of virtio-mem devices will be fairly large
> in the future (to not run out of mappings and to improve hot(un)plug
> performance), configured by the user, when used with vfio (e.g., 128 MB,
> 1 GB, ...) - Linux guests will still have to be optimized for that.

This seems pretty painful for mappings of a few TB.
Also, the calls themselves seem pretty painful; maybe it'll be possible to
have calls that are optimised for making multiple consecutive mappings.

Dave

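To put rough numbers on the concern about TB-sized regions, here is a
back-of-the-envelope sketch in Python. It assumes one VFIO DMA mapping per
plugged virtio-mem block, as the cover letter describes, and compares the
count against the default of 65535 for the vfio type1 IOMMU driver's
dma_entry_limit module parameter. The helper name mappings_needed and the
example sizes are illustrative only, binary units are assumed, and every
other mapping the VM needs is ignored.

```python
MiB = 1 << 20
GiB = 1 << 30
TiB = 1 << 40

# Default of the vfio_iommu_type1 "dma_entry_limit" module parameter.
DEFAULT_DMA_ENTRY_LIMIT = 65535


def mappings_needed(plugged_bytes: int, block_size: int) -> int:
    """Return the number of mappings when each block is mapped on its own.

    Mapping per block is what allows any single block to be unmapped
    again later when the guest unplugs it.
    """
    if block_size <= 0 or plugged_bytes < 0:
        raise ValueError("sizes must be positive")
    if plugged_bytes % block_size:
        raise ValueError("plugged size must be a multiple of the block size")
    return plugged_bytes // block_size


if __name__ == "__main__":
    examples = [
        (100 * MiB, 2 * MiB),   # the example from the cover letter
        (1 * TiB, 2 * MiB),     # a TB-sized region with small blocks
        (1 * TiB, 128 * MiB),   # the same region with larger blocks
        (4 * TiB, 1 * GiB),
    ]
    for plugged, block in examples:
        count = mappings_needed(plugged, block)
        verdict = "fits" if count <= DEFAULT_DMA_ENTRY_LIMIT else "exceeds"
        print(f"{plugged // MiB} MiB plugged, {block // MiB} MiB blocks: "
              f"{count} mappings ({verdict} the default limit)")
```

With 2 MiB blocks, a single fully plugged TiB already needs far more
mappings than the default limit allows, which is why the cover letter
expects much larger block sizes when vfio is in use.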
> We try to handle errors when plugging memory (mapping in VFIO) gracefully
> - especially to cope with too many mappings in VFIO.
>
>
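One way to picture the graceful error handling is a plug operation that maps
block by block and rolls back with a single unmap if a mapping fails. The
Python sketch below is only a model under assumptions: FakeIommu, plug_range,
map_block and unmap_range are hypothetical names invented for illustration,
the mapping limit is simulated, and none of this is taken from the patches.

```python
import errno
from typing import Callable, Dict


class FakeIommu:
    """Toy stand-in for an IOMMU backend with a limited number of mappings."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.mappings: Dict[int, int] = {}  # start address -> size

    def map_block(self, start: int, size: int) -> None:
        if len(self.mappings) >= self.limit:
            raise OSError(errno.ENOSPC, "simulated mapping limit reached")
        self.mappings[start] = size

    def unmap_range(self, start: int, size: int) -> None:
        # One call removes every mapping fully contained in the range.
        end = start + size
        for addr in [a for a, s in self.mappings.items()
                     if a >= start and a + s <= end]:
            del self.mappings[addr]


def plug_range(start: int, size: int, block_size: int,
               map_block: Callable[[int, int], None],
               unmap_range: Callable[[int, int], None]) -> bool:
    """Map [start, start + size) one block at a time.

    On failure, undo the partial work with a single unmap call and report
    the failure so the plug request can be rejected.
    """
    if block_size <= 0 or size % block_size:
        raise ValueError("size must be a multiple of the block size")
    mapped = 0
    while mapped < size:
        try:
            map_block(start + mapped, block_size)
        except OSError:
            if mapped:
                unmap_range(start, mapped)
            return False
        mapped += block_size
    return True
```

For example, with FakeIommu(limit=3), plugging four 2 MiB blocks fails on the
fourth mapping and leaves no mappings behind, so a failed plug does not leave
part of the range pinned.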
> As I basically have no experience with vfio, all I did for testing was
> pass through a secondary GPU (NVIDIA GK208B) via vfio-pci to my guest
> and see it pop up in dmesg. I did *not* actually try to use it (I know
> ...), so there might still be plenty of bugs regarding the actual mappings
> in the code. When I resize virtio-mem devices (resulting in
> memory hot(un)plug), I can see the memory consumption of my host adjusting
> accordingly - in contrast to before, where my machine would always
> consume the maximum size of my VM, as if all memory provided by
> virtio-mem devices were fully plugged.
>
> I even tested it with 2MB huge pages (sadly for the first time with
> virtio-mem ever) - and it worked like a charm on the hypervisor side as
> well. The number of free hugepages adjusted accordingly. (Again, I did not
> properly test the device in the guest ...)
>
> If anybody wants to play with it and needs some guidance, please feel
> free to ask. I might add some vfio-related documentation to
> https://virtio-mem.gitlab.io/ (but it really isn't that special - only
> the block size limitations have to be considered).
>
> David Hildenbrand (6):
>   memory: Introduce sparse RAM handler for memory regions
>   virtio-mem: Impelement SparseRAMHandler interface
>   vfio: Implement support for sparse RAM memory regions
>   memory: Extend ram_block_discard_(require|disable) by two discard
>     types
>   virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards
>   vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards
>
>  exec.c                         | 109 +++++++++++++++++----
>  hw/vfio/common.c               | 169 ++++++++++++++++++++++++++++++++-
>  hw/virtio/virtio-mem.c         | 164 +++++++++++++++++++++++++++++++-
>  include/exec/memory.h          | 151 ++++++++++++++++++++++++++++-
>  include/hw/vfio/vfio-common.h  |  12 +++
>  include/hw/virtio/virtio-mem.h |   3 +
>  softmmu/memory.c               |   7 ++
>  7 files changed, 583 insertions(+), 32 deletions(-)
>
> --
> 2.26.2
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Thread overview: 27+ messages
2020-09-24 16:04 [PATCH PROTOTYPE 0/6] virtio-mem: vfio support David Hildenbrand
2020-09-24 16:04 ` [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions David Hildenbrand
2020-10-20 19:24 ` Peter Xu
2020-10-20 20:13 ` David Hildenbrand
2020-09-24 16:04 ` [PATCH PROTOTYPE 2/6] virtio-mem: Impelement SparseRAMHandler interface David Hildenbrand
2020-09-24 16:04 ` [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions David Hildenbrand
2020-10-20 19:44 ` Peter Xu
2020-10-20 20:01 ` David Hildenbrand
2020-10-20 20:44 ` Peter Xu
2020-11-12 10:11 ` David Hildenbrand
2020-11-18 13:04 ` David Hildenbrand
2020-11-18 15:23 ` Peter Xu
2020-11-18 16:14 ` David Hildenbrand
2020-11-18 17:01 ` Peter Xu
2020-11-18 17:37 ` David Hildenbrand
2020-11-18 19:05 ` Peter Xu
2020-11-18 19:20 ` David Hildenbrand
2020-09-24 16:04 ` [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types David Hildenbrand
2020-10-20 19:17 ` Peter Xu
2020-10-20 19:58 ` David Hildenbrand
2020-10-20 20:49 ` Peter Xu
2020-10-20 21:30 ` Peter Xu
2020-09-24 16:04 ` [PATCH PROTOTYPE 5/6] virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards David Hildenbrand
2020-09-24 16:04 ` [PATCH PROTOTYPE 6/6] vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards David Hildenbrand
2020-09-24 19:30 ` [PATCH PROTOTYPE 0/6] virtio-mem: vfio support no-reply
2020-09-29 17:02 ` Dr. David Alan Gilbert [this message]
2020-09-29 17:05 ` David Hildenbrand