From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: "ehabkost@redhat.com" <ehabkost@redhat.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"quintela@redhat.com" <quintela@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Roman Kagan <rkagan@virtuozzo.com>,
"amit.shah@redhat.com" <amit.shah@redhat.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"virtualization@lists.linux-foundation.org"
<virtualization@lists.linux-foundation.org>,
"rth@twiddle.net" <rth@twiddle.net>
Subject: Re: [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization
Date: Fri, 4 Mar 2016 16:45:29 +0200
Message-ID: <20160304163246-mutt-send-email-mst@redhat.com>
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E0414516C@shsmsx102.ccr.corp.intel.com>
On Fri, Mar 04, 2016 at 02:26:49PM +0000, Li, Liang Z wrote:
> > Subject: Re: [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration
> > optimization
> >
> > On Fri, Mar 04, 2016 at 09:08:44AM +0000, Li, Liang Z wrote:
> > > > On Fri, Mar 04, 2016 at 01:52:53AM +0000, Li, Liang Z wrote:
> > > > > > I wonder if it would be possible to avoid the kernel changes
> > > > > > by parsing /proc/self/pagemap - if that can be used to detect
> > > > > > unmapped/zero mapped pages in the guest ram, would it achieve
> > > > > > the same result?
> > > > >
> > > > > Detecting only the unmapped/zero mapped pages is not enough.
> > > > > Consider a situation like case 2: it can't achieve the same result.
> > > >
> > > > Your case 2 doesn't exist in the real world. If people could stop
> > > > their main memory consumer in the guest prior to migration they
> > > > wouldn't need live migration at all.
> > >
> > > Case 2 is just a simplified scenario, not a real case.
> > > As long as the guest's memory usage does not keep increasing, or the
> > > guest does not always run out of memory, it is covered by case 2.
> >
> > The memory usage will keep increasing due to ever growing caches, etc, so
> > you'll be left with very little free memory fairly soon.
> >
>
> I don't think so.

Here's my laptop:
KiB Mem : 16048560 total, 8574956 free, 3360532 used, 4113072 buff/cache

But here's a server:
KiB Mem: 32892768 total, 20092812 used, 12799956 free, 368704 buffers

What is the difference? A ton of tiny daemons not doing anything,
staying resident in memory.

> > > > I tend to think you can safely assume there's no free memory in the
> > > > guest, so there's little point optimizing for it.
> > >
> > > If this is true, we should not inflate the balloon either.
> >
> > We certainly should if there's "available" memory, i.e. not free but cheap to
> > reclaim.
> >
>
> What do you mean by "available" memory? If the pages are not free, I don't think reclaiming them is cheap.
Clean pages are cheap to drop as they don't have to be written back.
Whether they will ever be used is another matter.
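
The gap between free memory and cheaply reclaimable memory can be observed from inside a guest. A minimal sketch, assuming a Linux guest whose /proc/meminfo exposes MemFree and MemAvailable (the latter is present in kernels 3.14 and later); meminfo_kib and read_meminfo are hypothetical helper names for illustration, not part of QEMU or the kernel:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up a "Key:   12345 kB" line in a /proc/meminfo-style buffer.
 * Returns the value in KiB, or -1 if the key is not present.
 */
long meminfo_kib(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* Require the ':' so that "MemFree" does not match a longer key. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return strtol(line + klen + 1, NULL, 10);
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Read /proc/meminfo into buf as a NUL-terminated string; 0 on success. */
int read_meminfo(char *buf, size_t size)
{
    FILE *f;
    size_t n;

    assert(size > 0);
    f = fopen("/proc/meminfo", "r");
    if (!f)
        return -1;
    n = fread(buf, 1, size - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;
}
```

After read_meminfo() fills a buffer, the difference between meminfo_kib(buf, "MemAvailable") and meminfo_kib(buf, "MemFree") gives a rough estimate of the memory that is not free but cheap to reclaim, mostly clean page cache.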
> > > > OTOH it makes perfect sense optimizing for the unmapped memory
> > > > that's made up, in particular, by the balloon, and consider inflating
> > > > the balloon right before migration unless you already maintain it at
> > > > the optimal size for other reasons (like e.g. a global resource manager
> > > > optimizing the VM density).
> > > >
> > >
> > > Yes, I believe the current balloon works and it's simple. Do you take the
> > > performance impact into consideration?
> > > For an 8G guest, it takes about 5s to inflate the balloon. But it
> > > only takes 20ms to traverse the free_list and construct the free pages
> > > bitmap.
> >
> > I don't have any feeling of how important the difference is. And if the
> > limiting factor for balloon inflation speed is the granularity of communication
> > it may be worth optimizing that, because quick balloon reaction may be
> > important in certain resource management scenarios.
> >
> > > By inflating the balloon, all the guest's pages are still processed (zero
> > > page checking).
> >
> > Not sure what you mean. If you describe the current state of affairs that's
> > exactly the suggested optimization point: skip unmapped pages.
> >
>
> You'd better check the live migration code.
What's there to check in migration code?
Here's the extent of what balloon does on output:

    while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
        ram_addr_t pa;
        ram_addr_t addr;
        int p = virtio_ldl_p(vdev, &pfn);

        pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
        offset += 4;

        /* FIXME: remove get_system_memory(), but how? */
        section = memory_region_find(get_system_memory(), pa, 1);
        if (!int128_nz(section.size) || !memory_region_is_ram(section.mr))
            continue;

        trace_virtio_balloon_handle_output(memory_region_name(section.mr),
                                           pa);
        /* Using memory_region_get_ram_ptr is bending the rules a bit, but
           should be OK because we only want a single page. */
        addr = section.offset_within_region;
        balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
                     !!(vq == s->dvq));
        memory_region_unref(section.mr);
    }

So all that happens when we get a page is balloon_page(),
and:

    static void balloon_page(void *addr, int deflate)
    {
    #if defined(__linux__)
        if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
                                             kvm_has_sync_mmu())) {
            qemu_madvise(addr, TARGET_PAGE_SIZE,
                         deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
        }
    #endif
    }

Do you see anything that tracks pages to help migration skip
the ballooned memory? I don't.
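
What the inflate path asks of the host can be reproduced outside QEMU. A minimal sketch, assuming a Linux host and that qemu_madvise() ends up in plain madvise(2); map_guest_page and balloon_page_sketch are made-up names for illustration and do not exist in QEMU. For a private anonymous mapping, MADV_DONTNEED discards the contents and the next access sees a zero-filled page, yet nothing in this path records which pages were dropped:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one private anonymous page as a stand-in for a page of guest RAM. */
unsigned char *map_guest_page(size_t *len)
{
    long psz = sysconf(_SC_PAGESIZE);
    void *p;

    assert(len != NULL);
    if (psz <= 0)
        return NULL;
    p = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)psz;
    return p;
}

/*
 * Same shape as balloon_page(): inflate (deflate == 0) tells the host the
 * page is no longer needed, deflate hints that it will be used again.
 * Returns the madvise() result; no bookkeeping of dropped pages is done.
 */
int balloon_page_sketch(void *addr, size_t len, int deflate)
{
    return madvise(addr, len, deflate ? MADV_WILLNEED : MADV_DONTNEED);
}
```

Writing a pattern into the page, calling balloon_page_sketch(p, len, 0) and reading the page back returns zeros: the host has released the backing memory, but a migration loop that walks all of guest RAM would still visit and zero-check this page unless something else tells it to skip the range.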
> > > The only advantage of 'inflating the balloon before live migration' is simplicity,
> > > nothing more.
> >
> > That's a big advantage. Another one is that it does something useful in
> > real-world scenarios.
> >
>
> I don't think the heavy performance impact is something useful in real-world scenarios.
>
> Liang
> > Roman.
So fix the performance then. You will have to try harder if you want to
convince people that the performance is due to bad host/guest interface,
and so we have to change *that*.
--
MST