From: Avi Kivity <avi@qumranet.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: kvm-devel <kvm-devel@lists.sourceforge.net>,
linux-kernel@vger.kernel.org
Subject: Re: [announce] [patch] KVM paravirtualization for Linux
Date: Mon, 08 Jan 2007 10:22:38 +0200
Message-ID: <45A1FF4E.1020106@qumranet.com>
In-Reply-To: <20070107174416.GA14607@elte.hu>
Ingo Molnar wrote:
>> This is a little too good to be true. Were both runs with the same
>> KVM_NUM_MMU_PAGES?
>>
>
> yes, both had the same elevated KVM_NUM_MMU_PAGES of 2048. The 'trunk'
> run should have been labeled as: 'cr3 tree with paravirt turned off'.
> That's not completely 'trunk' but close to it, and all other changes
> (like elimination of unnecessary TLB flushes) are fairly applied to
> both.
>
Ok. I guess there's a switch/switch back pattern in there.
> i also did a run with far fewer MMU cache pages (256), and hackbench 1
> stayed the same, while hackbench 5 numbers started fluctuating badly (i
> think that workload is thrashing the MMU cache badly).
>
Yes, 256 is too low.
>
>>> - u64 *pae_root;
>>> + u64 *pae_root[KVM_CR3_CACHE_SIZE];
>>>
>> hmm. wouldn't it be simpler to have pae_root always point at the
>> current root?
>>
>
> does that guarantee that it's available? I wanted to 'pin' the root
> itself this way, to make sure that if a guest switches to it via the
> cache, that it's truly available and a valid root. cr3 addresses are
> non-virtual so this is the only mechanism available to guarantee that
> the host-side memory truly contains a root pagetable.
>
>
I meant

	u64 *pae_root_cache;
	u64 *pae_root; /* == pae_root_cache + 4*cache_index */

so that the rest of the code need not worry about the cache.
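
A rough userspace sketch of that layout, to make the suggestion concrete. This is not code from the patch: the struct, the helper names, the cache size of 4 and the use of malloc() are all assumptions made for illustration. The only idea taken from the discussion is that pae_root always points at the current slot inside one contiguous pae_root_cache.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CR3_CACHE_SIZE   4  /* hypothetical; stands in for KVM_CR3_CACHE_SIZE */
#define PAE_ROOT_ENTRIES 4  /* a PAE root holds four page-directory pointers */
#define INVALID_PAGE     (~(uint64_t)0)

/* Hypothetical stand-in for the relevant part of the vcpu mmu state. */
struct mmu_sketch {
	uint64_t *pae_root_cache; /* CR3_CACHE_SIZE roots, stored back to back */
	uint64_t *pae_root;       /* == pae_root_cache + 4 * cache_index */
};

/* Allocate the cache, mark every entry invalid, and select slot 0. */
static int mmu_sketch_init(struct mmu_sketch *mmu)
{
	size_t n = (size_t)CR3_CACHE_SIZE * PAE_ROOT_ENTRIES;
	size_t i;

	mmu->pae_root_cache = malloc(n * sizeof(*mmu->pae_root_cache));
	if (!mmu->pae_root_cache)
		return -1;
	for (i = 0; i < n; i++)
		mmu->pae_root_cache[i] = INVALID_PAGE;
	mmu->pae_root = mmu->pae_root_cache;
	return 0;
}

/* Only this helper knows about the cache; other code just uses pae_root. */
static void mmu_sketch_select_root(struct mmu_sketch *mmu,
				   unsigned int cache_index)
{
	assert(cache_index < CR3_CACHE_SIZE);
	mmu->pae_root = mmu->pae_root_cache + PAE_ROOT_ENTRIES * cache_index;
}

static void mmu_sketch_destroy(struct mmu_sketch *mmu)
{
	free(mmu->pae_root_cache);
	mmu->pae_root_cache = NULL;
	mmu->pae_root = NULL;
}
```

With this arrangement, existing code that indexes pae_root[i] stays unchanged; only the cr3 switch path would call the select helper.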
>>> + vcpu->mmu.pae_root[j][i] = INVALID_PAGE;
>>> + }
>>> }
>>> vcpu->mmu.root_hpa = INVALID_PAGE;
>>> }
>>>
>> You keep the page directories pinned here. [...]
>>
>
> yes.
>
>
>> [...] This can be a problem if a guest frees a page directory, and
>> then starts using it as a regular page. kvm sometimes chooses not to
>> emulate a write to a guest page table, but instead to zap it, which is
>> impossible when the page is freed. You need to either unpin the page
>> when that happens, or add a hypercall to let kvm know when a page
>> directory is freed.
>>
>
> the cache is zapped upon pagefaults anyway, so unpinning ought to be
> possible. Which one would you prefer?
>
It's zapped by the equivalent of mmu_free_roots(), right? That's
effectively unpinning it (by zeroing ->root_count).
However, kvm takes pagefaults even for silly things like setting (in
hardware) or clearing (in software) the dirty bit.
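
To illustrate what "pinning" via ->root_count means here, a toy model follows. It is only a sketch under assumptions: the struct and all three helpers are hypothetical names, and the real shadow page bookkeeping is more involved. It shows the one property under discussion, namely that a page cannot be zapped while some root slot still references it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shadow page used as a root. */
struct shadow_page_sketch {
	int root_count; /* > 0 while some root slot references this page */
	bool zapped;
};

/* Installing the page as a root (current or cached) pins it. */
static void pin_root(struct shadow_page_sketch *sp)
{
	sp->root_count++;
}

/* Dropping a root reference, as freeing the roots would do. */
static void unpin_root(struct shadow_page_sketch *sp)
{
	assert(sp->root_count > 0);
	sp->root_count--;
}

/* Zapping succeeds only once nothing pins the page. */
static bool try_zap(struct shadow_page_sketch *sp)
{
	if (sp->root_count > 0)
		return false;
	sp->zapped = true;
	return true;
}
```

In this model, a cached root that is never unpinned can never be zapped, which is the situation described above for a guest that frees a page directory and reuses it as a regular page.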
>
>
>>> +#define KVM_API_MAGIC 0x87654321
>>> +
>>>
>> <linux/kvm.h> is the vmm userspace interface. The guest/host
>> interface should probably go somewhere else.
>>
>
> yeah. kvm_para.h?
>
>
Sounds good.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.