From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Jintack Lim <incredible.tack@gmail.com>, pbonzini@redhat.com
Cc: QEMU Devel Mailing List <qemu-devel@nongnu.org>
Subject: Re: Migration failure when running nested VMs
Date: Mon, 23 Sep 2019 11:42:45 +0100
Message-ID: <20190923104245.GA2866@work-vm>
In-Reply-To: <CAHyh4xhYrUbK0aEJmKp3_kOJG2E+AQLMUjyf7_pXVJgbqgv5JA@mail.gmail.com>

* Jintack Lim (incredible.tack@gmail.com) wrote:
> Hi,

Copying in Paolo, since he recently did work to fix nested migration.
It was expected to be broken until pretty recently, but QEMU 4.1.0 on a
5.3 kernel is pretty new, so I would have expected it to work.

> I'm seeing VM live migration failure when a VM is running a nested VM.
> I'm using the latest Linux kernel (v5.3) and QEMU (v4.1.0). I also tried
> v5.2, but the result was the same. The kernel versions in the L1 and L2
> VMs are v4.18, but I don't think that matters.
> 
> The symptom is that L2 VM kernel crashes in different places after
> migration but the call stack is mostly related to memory management
> like [1] and [2]. The kernel crash happens almost all the time. While
> L2 VM gets kernel panic, L1 VM runs fine after the migration. Both L1
> and L2 VM were doing nothing during migration.
> 
> I found a few clues about this issue.
> 1) It happens with a relatively large memory for L1 (24G), but it does
> not with a smaller size (3G).
> 
> 2) Dead migration worked: when I first ran the "stop" command in the
> qemu monitor for L1 and then migrated, migration always worked. It
> also worked when I only stopped the L2 VM and kept L1 live during the
> migration.
> 
> With those two clues, I guess maybe some dirty pages made by L2 are
> not transferred to the destination correctly, but I'm not really sure.
> 
> 3) It happens on Intel(R) Xeon(R) Silver 4114 CPU, but it does not on
> Intel(R) Xeon(R) CPU E5-2630 v3 CPU.
> 
> This confuses me because I thought migrating nested state doesn't
> depend on the underlying hardware. Anyway, L1-only migration
> with the large memory size (24G) works on both CPUs without any
> problem.
> 
> I would appreciate any comments/suggestions to fix this problem.
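
The dirty-page guess in the quoted report can be illustrated with a toy
model of pre-copy migration. This is only a sketch, not QEMU or KVM code:
precopy_migrate, rounds_of_writes and is_logged are hypothetical names,
and memory is modelled as a plain list of page values. It shows that a
write which is never recorded in the dirty log leaves a stale page on the
destination, while a stopped guest (no writes at all) migrates cleanly,
which would be consistent with clue 2.

```python
def precopy_migrate(src, rounds_of_writes, is_logged):
    """Toy pre-copy migration (illustrative only, not QEMU's algorithm).

    src              -- list of page values; mutated by simulated guest writes
    rounds_of_writes -- one list of (page, value) writes per live copy round
    is_logged        -- is_logged(page) -> bool: does a write to this page
                        reach the dirty log?
    """
    dst = [None] * len(src)
    dirty = set(range(len(src)))  # the first pass must send every page

    for writes in rounds_of_writes:
        # Live round: send everything currently marked dirty.
        for page in dirty:
            dst[page] = src[page]
        dirty = set()
        # The guest keeps running and writes while the round is in flight.
        for page, value in writes:
            src[page] = value
            if is_logged(page):
                dirty.add(page)

    # Final stop-and-copy phase: guest is paused, send what is still dirty.
    for page in dirty:
        dst[page] = src[page]
    return dst


if __name__ == "__main__":
    writes = [[(1, 7), (3, 9)]]

    src = [0, 0, 0, 0]
    print(precopy_migrate(src, writes, lambda page: True) == src)

    # Assume (hypothetically) that writes to page 3 bypass the dirty log.
    src = [0, 0, 0, 0]
    print(precopy_migrate(src, writes, lambda page: page != 3) == src)
```

With every write logged, source and destination match; with page 3
unlogged, the destination keeps the old value of that page.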

Can you share the qemu command lines you're using for both L1 and L2,
please?
Are there any dmesg entries around the time of the migration on either
the hosts or the L1 VMs?
What guest OS are you running in L1 and L2?

Dave

> Thanks,
> Jintack
> 
> 
> [1] https://paste.ubuntu.com/p/XGDKH45yt4/
> [2] https://paste.ubuntu.com/p/CpbVTXJCyc/
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Thread overview: 6+ messages
2019-09-20 19:01 Migration failure when running nested VMs Jintack Lim
2019-09-23 10:42 ` Dr. David Alan Gilbert [this message]
2019-09-23 11:48   ` Paolo Bonzini
2019-09-23 18:32     ` Jintack Lim
2019-09-24  0:19       ` Paolo Bonzini
2019-09-23 18:32   ` Jintack Lim
