From: Avi Kivity <avi@qumranet.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: linux-kernel@vger.kernel.org, kvm-devel@lists.sourceforge.net
Subject: Re: [PATCH 6/13] KVM: memory slot management
Date: Fri, 27 Oct 2006 15:26:03 +0200
Message-ID: <454208EB.7080007@qumranet.com>
In-Reply-To: <200610270937.11646.arnd@arndb.de>
Arnd Bergmann wrote:
> On Friday 27 October 2006 07:47, Avi Kivity wrote:
>
>> Arnd Bergmann wrote:
>>
>>> - no need to preallocate memory that the guest doesn't actually use.
>>>
>>>
>> Well, a fully virtualized guest will likely use all the memory it gets.
>> Linux certainly will.
>>
>
> Only if it does lots of disk accesses that load stuff into
> page/inode/dentry cache. Single-application guests don't necessarily
> do that.
>
>
Okay. FWIW, you can demand allocate with other schemes as well.
>>> - guest memory can be paged to disk.
>>> - you can mmap files into multiple guest for fast communication
>>> - you can use mmap host files as backing store for guest blockdevices,
>>> including ext2 with the -o xip mount option to avoid double paging
>>>
>>>
>> What do you mean exactly? To respond to a block device read by mmap()ing
>> the backing file into the pages the host requested?
>>
>> (e.g. turn a host bio read into a guest mmap)
>>
>
> The idea would be to mmap the file into the guest real address space.
> With -o xip, the page cache for the virtual device would basically
> reside in that high address range.
>
Ah, I see what you mean now. Like the "memory technology device" thing.
> Guest users reading/writing files on it cause a memcopy between guest
> user space and the host file mapping, done by the guest file system
> implementation.
>
> The interesting point here is how to handle a host page fault on the
> file mapping. The solution on z/VM for this is to generate a special
> exception for this that will be caught by the guest kernel, telling
> it to wait until the page is there. The guest kernel can then put the
> current thread to sleep and do something else, until a second exception
> tells it that the page has been loaded by the host. The guest then
> wakes up the sleeping thread.
>
> This can work the same way for host file backed (guest block device)
> and host anonymous (guest RAM) memory.
>
>
Certainly something like that can be done, for paravirtualized guests.
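
The two-exception scheme described above can be sketched as a toy Python
model. This is only an illustration under stated assumptions: the class
ToyGuest and the handler names page_not_present and page_ready are
hypothetical and do not come from z/VM or from the KVM patches.

```python
from collections import deque


class ToyGuest:
    """Toy model of a paravirtualized guest reacting to host page faults."""

    def __init__(self):
        self.runnable = deque()  # threads that may be scheduled
        self.waiting = {}        # page number -> threads sleeping on that page

    def page_not_present(self, thread, page):
        # First exception: the host is still bringing `page` in.
        # Park the faulting thread so the vcpu can run something else.
        self.waiting.setdefault(page, []).append(thread)

    def page_ready(self, page):
        # Second exception: the host has loaded `page`.
        # Wake every thread that was sleeping on it.
        for thread in self.waiting.pop(page, []):
            self.runnable.append(thread)


guest = ToyGuest()
guest.runnable.append("B")               # some other runnable thread
guest.page_not_present("A", page=0x42)   # thread A faulted; it sleeps
guest.page_ready(0x42)                   # host finished; A is runnable again
```

The same bookkeeping would apply whether the page is backed by a host file
or by host anonymous memory, since the guest only sees the two notifications.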
>> If we allow the pages to be writable, the guest could write into the
>> virtual block device just by modifying a read page (which might have been
>> discarded and no longer related to the block device)
>>
>
> In your virtual mmu (or nested page table), you need to make sure that
> the page is mapped with the intersection of the guest vm_prot and host
> vm_prot into guest users.
>
>
Yes. My comment was based on an incorrect understanding of your suggestion.
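
The intersection rule can be illustrated with a small sketch. The Prot
flags and the shadow_prot helper are hypothetical names chosen for this
example; they are not the kernel's vm_prot encoding.

```python
from enum import IntFlag


class Prot(IntFlag):
    """Hypothetical protection bits, for illustration only."""
    NONE = 0
    READ = 1
    WRITE = 2
    EXEC = 4


def shadow_prot(guest_prot, host_prot):
    # A right is granted in the shadow (or nested) mapping only if both
    # the guest mapping and the host mapping grant it.
    return guest_prot & host_prot


# The guest maps the page read/write, but the host mapping is read-only,
# so the effective mapping must be read-only.
effective = shadow_prot(Prot.READ | Prot.WRITE, Prot.READ)
```

With this rule a guest write to a page that the host has mapped read-only
faults to the host instead of silently modifying the backing file.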
>> 2. The next mmu implementation, which caches guest translations.
>>
>> The potential problem above now becomes acute. The guest will have
>> kernel mappings for every page, and after a short while they'll all be
>> faulted in and locked. This defeats the swap integration which is IMO a
>> very strong point.
>>
>> We can work around that by periodically forcing out translations (some
>> kind of clock algorithm) at some rate so the host vm can have a go at
>> them. That can turn out to be expensive as we'll need to interrupt all
>> running vcpus to flush (real) tlb entries.
>>
>
> Don't understand. Can't one CPU cause a TLB entry to be flushed on all
> CPUs?
>
>
It's not about tlb entries. The shadow page tables collapse a GV -> HV
-> HP double translation into a GV -> HP page table. When the Linux vm
goes around evicting pages, it invalidates those mappings.
There are two solutions possible: lock pages which participate in these
translations (and their number can be large) or modify the Linux vm to
consult a reverse mapping and remove the translations (in which case TLB
entries need to be removed).
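
The second option (a reverse mapping) can be sketched as a toy model.
Everything here is an assumption made for illustration: page tables are
flat dictionaries, guest physical addresses are treated as host virtual
addresses, and the names ShadowMMU, fault and evict are hypothetical
rather than taken from the KVM mmu code.

```python
class ShadowMMU:
    """Toy shadow page table with a reverse map from host pages to GVs."""

    def __init__(self, guest_pt, host_pt):
        self.guest_pt = guest_pt  # GV -> HV, maintained by the guest
        self.host_pt = host_pt    # HV -> HP, maintained by the host
        self.shadow = {}          # GV -> HP, the collapsed translation
        self.rmap = {}            # HP -> set of GVs whose shadow entry uses it

    def fault(self, gv):
        # Collapse the double translation into a single shadow entry and
        # record it in the reverse map.
        hp = self.host_pt[self.guest_pt[gv]]
        self.shadow[gv] = hp
        self.rmap.setdefault(hp, set()).add(gv)
        return hp

    def evict(self, hv):
        # The host vm reclaims the page behind `hv`: drop every shadow
        # entry that points at it. The returned GVs are the ones whose
        # TLB entries would also have to be removed.
        hp = self.host_pt.pop(hv)
        stale = self.rmap.pop(hp, set())
        for gv in stale:
            del self.shadow[gv]
        return stale


mmu = ShadowMMU(
    guest_pt={0x1000: 0xA000, 0x2000: 0xA000, 0x3000: 0xB000},
    host_pt={0xA000: 7, 0xB000: 9},
)
for gv in (0x1000, 0x2000, 0x3000):
    mmu.fault(gv)
stale = mmu.evict(0xA000)  # both GVs aliasing host page 7 are dropped
```

Without the reverse map, the only safe alternative in this model is to
keep every page reachable from the shadow table pinned, which is the
locking option.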
>> b. we need to hide the userspace portion of the monitor from the
>> guest physical address space
>>
>
> That depends on your trust model. You could simply say that you expect
> the guest real mode to have the same privileges as the host application
> (your monitor), and not care if a guest can shoot itself in the foot
> by overwriting the monitor.
>
It can shoot not only its foot, but anything the monitor's uid has
access to. Host files, the host network, other guests belonging to the
user, etc.
>> c. we need to extend host tlb invalidations to invalidate tlbs on guests
>>
>
> I don't understand much about the x86 specific memory management,
> but shouldn't a TLB invalidate of a given page do the right thing
> on all CPUs, even if they are currently running a guest?
>
It's worse than I thought: tlb entries generated by guest accesses are
tagged with the guest virtual address, so if you remove a guest
physical/host virtual page you need to invalidate the entire guest tlb.
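
A toy model of why the whole guest tlb has to go: the entries can be
looked up only by their guest virtual tag, so there is no way to find
the ones that refer to a given host page. ToyGuestTLB and
host_page_removed are hypothetical names, and real hardware is of course
not a dictionary.

```python
class ToyGuestTLB:
    """Toy TLB whose entries are tagged only by guest virtual address."""

    def __init__(self):
        self.entries = {}  # GV -> HP

    def fill(self, gv, hp):
        self.entries[gv] = hp

    def invalidate_gv(self, gv):
        # The only targeted operation available: invalidate by GV tag.
        self.entries.pop(gv, None)

    def flush_all(self):
        flushed = len(self.entries)
        self.entries.clear()
        return flushed


def host_page_removed(tlb, hp):
    # The host knows which HP it is removing but not which GVs map it,
    # and the TLB offers no lookup by HP, so `hp` cannot be used to pick
    # entries. The conservative answer is to flush everything.
    return tlb.flush_all()


tlb = ToyGuestTLB()
tlb.fill(0x1000, 7)
tlb.fill(0x2000, 9)
host_page_removed(tlb, 7)  # the unrelated 0x2000 entry is lost as well
```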
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.