From: Peter Xu <peterx@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: "Laurent Vivier" <lvivier@redhat.com>,
"Thomas Huth" <thuth@redhat.com>,
"Eduardo Habkost" <ehabkost@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Claudio Fontana" <cfontana@suse.de>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Alex Bennée" <alex.bennee@linaro.org>,
"Igor Mammedov" <imammedo@redhat.com>,
"Stefan Berger" <stefanb@linux.ibm.com>
Subject: Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
Date: Fri, 23 Jul 2021 11:28:17 -0400
Message-ID: <YPrgEXkl2wsXYs03@t490s>
In-Reply-To: <20210720130304.26323-6-david@redhat.com>
On Tue, Jul 20, 2021 at 03:03:04PM +0200, David Hildenbrand wrote:
> virtio-mem logically plugs/unplugs memory within a sparse memory region
> and notifies via the RamDiscardManager interface when parts become
> plugged (populated) or unplugged (discarded).
>
> Currently, we end up (via the two users)
> 1) zeroing all logically unplugged/discarded memory during TPM resets.
> 2) reading all logically unplugged/discarded memory when dumping, to
> figure out the content is zero.
>
> 1) is always bad, because we assume unplugged memory stays discarded
> (and is already implicitly zero).
> 2) isn't that bad with anonymous memory, we end up reading the zero
> page (slow and unnecessary, though). However, once we use some
> file-backed memory (future use case), even reading will populate memory.
>
> Let's cut out all parts marked as not-populated (discarded) via the
> RamDiscardManager. As virtio-mem is the single user, this now means that
> logically unplugged memory ranges will no longer be included in the
> dump, which results in smaller dump files and faster dumping.
>
> virtio-mem has a minimum granularity of 1 MiB (and the default is usually
> 2 MiB). Theoretically, we can see quite some fragmentation, in practice
> we won't have it completely fragmented in 1 MiB pieces. Still, we might
> end up with many physical ranges.
>
> Both, the ELF format and kdump seem to be ready to support many
> individual ranges (e.g., for ELF it seems to be UINT32_MAX, kdump has a
> linear bitmap).
>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Claudio Fontana <cfontana@suse.de>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: "Alex Bennée" <alex.bennee@linaro.org>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Laurent Vivier <lvivier@redhat.com>
> Cc: Stefan Berger <stefanb@linux.ibm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> softmmu/memory_mapping.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/softmmu/memory_mapping.c b/softmmu/memory_mapping.c
> index b7e4f3f788..856778a109 100644
> --- a/softmmu/memory_mapping.c
> +++ b/softmmu/memory_mapping.c
> @@ -246,6 +246,15 @@ static void guest_phys_block_add_section(GuestPhysListener *g,
>  #endif
>  }
>
> +static int guest_phys_ram_populate_cb(MemoryRegionSection *section,
> +                                      void *opaque)
> +{
> +    GuestPhysListener *g = opaque;
> +
> +    guest_phys_block_add_section(g, section);
> +    return 0;
> +}
> +
>  static void guest_phys_blocks_region_add(MemoryListener *listener,
>                                           MemoryRegionSection *section)
>  {
> @@ -257,6 +266,17 @@ static void guest_phys_blocks_region_add(MemoryListener *listener,
>          memory_region_is_nonvolatile(section->mr)) {
>          return;
>      }
> +
> +    /* for special sparse regions, only add populated parts */
> +    if (memory_region_has_ram_discard_manager(section->mr)) {
> +        RamDiscardManager *rdm;
> +
> +        rdm = memory_region_get_ram_discard_manager(section->mr);
> +        ram_discard_manager_replay_populated(rdm, section,
> +                                             guest_phys_ram_populate_cb, g);
> +        return;
> +    }
> +
>      guest_phys_block_add_section(g, section);
>  }
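
To make the mechanism in the quoted hunk concrete, here is a minimal standalone sketch of the "replay only the populated parts" pattern. It is not QEMU code: SparseRegion, RangeStats, replay_populated() and count_range_cb() are hypothetical stand-ins, and the plugged state is assumed to be a simple per-block boolean array. The callback is invoked once per maximal run of consecutive plugged blocks, which is roughly what a listener would then turn into one physical range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: called once per populated range. */
typedef int (*replay_fn)(uint64_t start, uint64_t size, void *opaque);

/* Hypothetical model of a sparse region (not a QEMU type). */
typedef struct {
    uint64_t block_size;  /* plug granularity in bytes, e.g. 2 MiB */
    size_t nb_blocks;     /* number of blocks in the region */
    const bool *plugged;  /* plugged[i] is true if block i is populated */
} SparseRegion;

/* Accumulates what the callback has seen so far. */
typedef struct {
    unsigned count;       /* number of populated ranges */
    uint64_t bytes;       /* total populated bytes */
} RangeStats;

/*
 * Walk the region and call fn for every maximal run of consecutive
 * plugged blocks. Stops early and returns fn's value if it is non-zero.
 */
static int replay_populated(const SparseRegion *r, replay_fn fn, void *opaque)
{
    size_t i = 0;

    assert(r->block_size > 0);
    while (i < r->nb_blocks) {
        size_t first;
        int ret;

        if (!r->plugged[i]) {
            i++;          /* skip discarded blocks entirely */
            continue;
        }
        first = i;
        while (i < r->nb_blocks && r->plugged[i]) {
            i++;
        }
        ret = fn(first * r->block_size, (i - first) * r->block_size, opaque);
        if (ret) {
            return ret;
        }
    }
    return 0;
}

/* Example callback: count the ranges and sum their sizes. */
static int count_range_cb(uint64_t start, uint64_t size, void *opaque)
{
    RangeStats *s = opaque;

    (void)start;
    s->count++;
    s->bytes += size;
    return 0;
}
```

Calling replay_populated() with count_range_cb and a zero-initialized RangeStats yields the number of ranges a dump would have to describe for a given plug pattern; discarded blocks are never passed to the callback.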

As I asked previously elsewhere, this is more or less related to the design
decision of letting virtio-mem plug memory sparsely at such a small
granularity, rather than keeping plugged memory contiguous within the GPA range
(so that we would move pages on unplug).

There are definitely reasons for that, and I believe you're the expert here (as
you mentioned once: some GUPed guest pages cannot be migrated, so those ranges
could not be offlined otherwise). Still, so far I'm not sure whether that's a
kernel issue to solve on the GUP side, although I agree it's a complicated one
anyway!

Maybe it's a trade-off you made in the end; I don't have enough knowledge to
tell.

The patch itself looks okay to me. My only slight worry is how long the list
could end up being if it's chopped into small 1M/2M chunks.

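
As a rough upper bound on that list length, here is a small sketch of the arithmetic. It assumes the worst case is every second block being plugged, so that no two plugged blocks are adjacent and none can be merged; worst_case_ranges() is a hypothetical helper, and the 64 GiB region size used in the note below is only an illustrative assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on the number of disjoint populated ranges in a sparse
 * region: with alternating plugged/unplugged blocks, at most every
 * second block starts a new range.
 */
static uint64_t worst_case_ranges(uint64_t region_size, uint64_t block_size)
{
    uint64_t nb_blocks;

    assert(block_size > 0);
    nb_blocks = region_size / block_size;
    return (nb_blocks + 1) / 2;  /* ceil(nb_blocks / 2) */
}
```

Under these assumptions, a 64 GiB region gives at most 32768 ranges at 1 MiB granularity and 16384 at 2 MiB, both far below the UINT32_MAX limit the commit message mentions for ELF.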
Thanks,
--
Peter Xu