From: Kashyap Chamarthy <kchamart@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, vkuznets@redhat.com
Subject: Re: [PATCH] docs/virt/kvm: Document running nested guests
Date: Fri, 7 Feb 2020 17:40:54 +0100
Message-ID: <20200207164054.GB30317@paraplu>
In-Reply-To: <20200207160157.GI3302@work-vm>
On Fri, Feb 07, 2020 at 04:01:57PM +0000, Dr. David Alan Gilbert wrote:
> * Kashyap Chamarthy (kchamart@redhat.com) wrote:
[...]
> > +Running nested guests with KVM
> > +==============================
> > +
> > +A nested guest is a KVM guest that in turn runs on a KVM guest::
>
> Note nesting may be a little more general; e.g. L1 might be another
> OS/hypervisor that wants to run its own L2; and similarly
> KVM might be the L1 under someone else's hypervisor.
True, I narrowly focused on KVM-on-KVM.
Will take this approach: I'll mention the generic nature of nesting, but
focus on KVM-on-KVM in this document.
> I think this doc is mostly about the case of KVM being the L0
> and wanting to run an L1 that's capable of running an L2.
>
> > + .----------------. .----------------.
> > + | | | |
> > + | L2 | | L2 |
> > + | (Nested Guest) | | (Nested Guest) |
> > + | | | |
> > + |----------------'--'----------------|
> > + | |
> > + | L1 (Guest Hypervisor) |
> > + | KVM (/dev/kvm) |
> > + | |
> > + .------------------------------------------------------.
> > + | L0 (Host Hypervisor) |
> > + | KVM (/dev/kvm) |
> > + |------------------------------------------------------|
> > + | x86 Hardware (VMX) |
> > + '------------------------------------------------------'
>
> This is now x86 specific but the doc is in a general directory;
> I'm not sure what other architecture nesting rules are.
Yeah, x86 is the beast I know best, so I stuck to it. But since this is
an upstream doc, I should make sure to clearly mention s390x and other
architectures.
> Worth having VMX/SVM at least.
Will add.
[...]
> > +
> > +Use Cases
> > +---------
> > +
> > +An additional layer of virtualization can sometimes be useful. You
> > +might have access to a large virtual machine in a cloud environment that
> > +you want to compartmentalize into multiple workloads. You might be
> > +running a lab environment in a training session.
>
> Lose this paragraph, and just use the list below?
That was precisely my intention, but I didn't commit the local version
before sending. Will fix in v2.
> > +There are several scenarios where nested KVM can be useful:
> > +
> > + - As a developer, you want to test your software on different OSes.
> > + Instead of renting multiple VMs from a cloud provider, using nested
> > + KVM lets you rent a large enough "guest hypervisor" (level-1 guest).
> > + This in turn allows you to create multiple nested guests (level-2
> > + guests), running different OSes, on which you can develop and test
> > + your software.
> > +
> > + - Live migration of "guest hypervisors" and their nested guests, for
> > + load balancing, disaster recovery, etc.
> > +
> > + - Using VMs for isolation (as in Kata Containers, and before it Clear
> > + Containers https://lwn.net/Articles/644675/) if you're running on a
> > + cloud provider that is already using virtual machines
The last use-case was pointed out by Paolo elsewhere. (I should make
this more generic.)
> Some others that might be worth listing;
> - VM image creation tools (e.g. virt-install etc) often run their own
> VM, and users expect these to work inside a VM.
> - Some other OS's use virtualization internally for other
> features/protection.
Yeah. Will add; thanks!
> > +Procedure to enable nesting on the bare metal host
> > +--------------------------------------------------
> > +
> > +The KVM kernel modules do not enable nesting by default (though your
> > +distribution may override this default).
>
> It's the other way around; see commit 1e58e5e, where Intel made it the
> default; AMD has had it as the default for longer.
Ah, this was another bit I realized later, but forgot to fix before
sending to the list. (I recall seeing this when it came out about a
year ago:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e58e5e)
Will fix. Thanks for the eagle eyes :-)
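For the v2 text, I'm thinking of a short sketch along these lines for
checking and enabling nesting (Intel paths shown; AMD hosts use
`kvm_amd`; the modprobe.d file name is just an example):

```shell
# Check whether the kvm_intel module currently has nesting enabled.
# (For AMD hosts, substitute kvm_amd; values are Y/1 or N/0.)
nested_param=/sys/module/kvm_intel/parameters/nested
if [ -f "$nested_param" ]; then
    cat "$nested_param"
else
    echo "kvm_intel not loaded (or not an Intel host)"
fi

# To enable nesting persistently, add a modprobe option and reload the
# module (no VMs must be running while the module is reloaded):
#   echo "options kvm_intel nested=1" | sudo tee /etc/modprobe.d/kvm.conf
#   sudo modprobe -r kvm_intel && sudo modprobe kvm_intel
```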
> > +Additional nested-related kernel parameters
> > +-------------------------------------------
> > +
> > +If your hardware is sufficiently advanced (Intel Haswell processor or
> > +above which has newer hardware virt extensions), you might want to
> > +enable additional features: "Shadow VMCS (Virtual Machine Control
> > +Structure)", APIC Virtualization on your bare metal host (L0).
> > +Parameters for Intel hosts::
> > +
> > + $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
> > + Y
> > +
> > + $ cat /sys/module/kvm_intel/parameters/enable_apicv
> > + N
> > +
> > + $ cat /sys/module/kvm_intel/parameters/ept
> > + Y
>
> Don't those happen automatically (mostly?)
EPT, yes. I forget if `enable_shadow_vmcs` and `enable_apicv` are.
I'll investigate and update.
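Relatedly, it might be worth noting in the doc how to confirm from
inside the L1 guest that L0 actually exposed the virt extension; this is
just the generic /proc/cpuinfo flag check:

```shell
# Inside the L1 guest: check whether the hardware-virt CPU flag is
# visible ('vmx' on Intel, 'svm' on AMD).  If neither shows up, L2
# guests will fall back to emulation (TCG) or fail to start.
flag=$(grep -o -w -E 'vmx|svm' /proc/cpuinfo | head -n 1)
echo "virt extension: ${flag:-none}"
```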
[...]
> > +Limitations on Linux kernel versions older than 5.3
> > +---------------------------------------------------
> > +
> > +On Linux kernel versions older than 5.3, once an L1 guest has started an
> > +L2 guest, the L1 guest would no longer be capable of being migrated, saved,
> > +or loaded (refer to QEMU documentation on "save"/"load") until the L2
> > +guest shuts down. [FIXME: Is this limitation fixed for *all*
> > +architectures, including s390x?]
> > +
> > +Attempting to migrate or save & load an L1 guest while an L2 guest is
> > +running will result in undefined behavior. You might see a ``kernel
> > +BUG!`` entry in ``dmesg``, a kernel 'oops', or an outright kernel panic.
> > +Such a migrated or loaded L1 guest can no longer be considered stable or
> > +secure, and must be restarted.
> > +
> > +Migrating an L1 guest merely configured to support nesting, while not
> > +actually running L2 guests, is expected to function normally.
> > +Live-migrating an L2 guest from one L1 guest to another is also expected
> > +to succeed.
>
> Can you add an entry along the lines of 'reporting bugs with nesting'
> that explains you should clearly state what the host CPU is,
> and the exact OS and hypervisor config in L0,L1 and L2 ?
Yes, good point. I'll add a short version based on my notes from here
(which you've reviewed in the past):
https://kashyapc.fedorapeople.org/Notes/_build/html/docs/Info-to-collect-when-debugging-nested-KVM.html#what-information-to-collect
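Something along these lines, i.e. a small collection script run at each
level (L0, L1, and L2 if it boots far enough); the output file name is
arbitrary:

```shell
# Gather the basics for a nested-KVM bug report: kernel version, CPU
# model, QEMU version, and the KVM module's 'nested' parameter.
{
  echo "== kernel =="
  uname -r
  echo "== cpu =="
  grep -m 1 'model name' /proc/cpuinfo || true
  echo "== qemu =="
  qemu-system-x86_64 --version 2>/dev/null || echo "QEMU not found"
  echo "== nested parameter =="
  for m in kvm_intel kvm_amd; do
    p=/sys/module/$m/parameters/nested
    [ -f "$p" ] && echo "$m: $(cat "$p")"
  done
  echo "== end =="
} > nested-kvm-report.txt
```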
Thanks for the review.
--
/kashyap