From: Paolo Bonzini <pbonzini@redhat.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Xiao Guangrong <guangrong.xiao@linux.intel.com>,
bdas@redhat.com
Subject: Re: [PATCH 08/11] KVM: implement multiple address spaces
Date: Wed, 20 May 2015 09:07:25 +0200 [thread overview]
Message-ID: <555C32AD.4040704@redhat.com> (raw)
In-Reply-To: <20150519182810.GA29273@potion.brq.redhat.com>
On 19/05/2015 20:28, Radim Krčmář wrote:
> If I understand your aim correctly, we could go with just one dirty map
> in this way:
> - have a dirty bitmap that covers userspace memory of all slots that
> have enabled dirty log
> - store an offset to this bitmap in those slots
>
> It would be a framework that allows userspace to query dirty pages based
> on userspace address space. (Changing slots would be costly, but I
> don't think it's done very often.)
Yes, that would work. However, it would be a significant change to the
KVM dirty bitmap model. Right now it is GPA-indexed, and it would
become HVA-indexed.
The difference arises if the same memory is aliased at different parts
of the address space. Aliasing can happen if the 0xA0000 VRAM is a
window inside a larger framebuffer, for example; or it can happen across
separate address spaces. You cannot rule out that *some* userspace is
relying on the properties of a GPA-indexed dirty bitmap. Certainly not
QEMU, but we try to be userspace-agnostic.
>> On the
>> contrary, it's very easy if the VCPU can simply query its current memslots.
>
> Yeah, userspace drew the short stick.
Userspace is the slow path, so it makes sense that it did. :)
> I didn't want to solve the "multiple bitmaps for same userspace" because
> KVM_GET_DIRTY_LOG doesn't care about userspace memory space dirtiness
> (though it's probably what we use it for), but only about the guest
> address space; KVM_GET_DIRTY_LOG just says which page was
> modified in a slot :/
> (If we wanted dirty bitmap for userspace memory, it would be better to
> use the solution at the top.)
Userspace implements dirty bitmap for userspace memory as the OR of the
KVM dirty bitmap with its own dirty bitmap. There's no particular
advantage to a GPA-indexed dirty bitmap; as you noticed, GPA and HVA
indexing can be equally fast because GPA->HVA mapping is just an add.
It's just that the dirty bitmap has always been GPA-indexed, and it's
too late to change it.
>> A possible optimization is to set a flag when no bits are set in the
>> dirty bitmap, and skip the iteration. This is the common case for the
>> SMM memory space for example. But it can be done later, and is an
>> independent work.
>
> True. (I'm not a huge fan of optimizations through exceptions.)
It depends on how often the exception kicks in... :)
>> Keeping slots separate for different address spaces also makes the most
>> sense in QEMU, because each address space has a separate MemoryListener
>> that only knows about one address space. There is one log_sync callback
>> per address space and no code has to know about all address spaces.
>
> Ah, I would have thought this is done per slot, when there was no
> concept of address space in KVM before this patch :)
Right; the callback is _called_ per slot, but _registered_ per address
space. Remember that QEMU does have address spaces.
>> The regular and SMM address spaces are not hierarchical. As soon as you
>> put a PCI resource underneath SMRAM---which is exactly what happens for
>> legacy VRAM at 0xa0000---they can be completely different. Note that
>> QEMU can map legacy VRAM as a KVM memslot when using the VGA 320x200x256
>> color mode (this mapping is not correct from the VGA point of view, but
>> it cannot be changed in QEMU without breaking migration).
>
> How is a PCI resource under SMRAM accessed?
> I thought that outside SMM, PCI resource under SMRAM is working
> normally, but it will be overshadowed, and made inaccessible, in SMM.
Yes, it is. (There is some chipset magic to make instruction fetches
retrieve SMRAM and data fetches retrieve PCI resources. I guess you
could use execute-only EPT permissions, but needless to say, we don't care.)
> I'm not sure if we mean the same hierarchy. I meant hierarchy in the
> sense than one address space is considered before the other.
> (Maybe layers would be a better word.)
> SMM address space could have just one slot and be above regular, we'd
> then decide how to handle overlapping.
Ah, now I understand. That would be doable.
But as they say, "All programming is an exercise in caching." In this
case, the caching is done by userspace.
QEMU implements the SMM address space exactly by overlaying SMRAM over
normal memory:
    /* Outer container */
    memory_region_init(&smm_as_root, OBJECT(kvm_state),
                       "mem-container-smm", ~0ull);
    memory_region_set_enabled(&smm_as_root, true);

    /* Normal memory with priority 0 */
    memory_region_init_alias(&smm_as_mem, OBJECT(kvm_state), "mem-smm",
                             get_system_memory(), 0, ~0ull);
    memory_region_add_subregion_overlap(&smm_as_root, 0, &smm_as_mem, 0);
    memory_region_set_enabled(&smm_as_mem, true);

    /* SMRAM with priority 10 */
    memory_region_add_subregion_overlap(&smm_as_root, 0, smram, 10);
    memory_region_set_enabled(smram, true);
The caching consists simply in resolving the overlaps beforehand, thus
giving KVM the complete address space.
Since slots do not change often, the simpler code is not worth the
potentially more expensive KVM_SET_USER_MEMORY_REGION (it _is_ more
expensive, if only because it has to be called twice per slot change).
>> What I do dislike in the API is the 16-bit split of the slots field; but
>> the alternative of defining KVM_SET_USER_MEMORY_REGION2 and
>> KVM_GET_DIRTY_LOG2 ioctls is just as sad. If you prefer it that way, I
>> can certainly do that.
>
> No, what we have now is much better. I'd like to know why a taken
> route is better than one I'd prefer, which probably makes me look
> like I have a problem with everything, but your solution is neat.
No problem, it's good because otherwise I take too much advantage of,
well, being able to commit to kvm.git. :)
(Also, it's not entirely my solution. My solution was v1 and the
userspace patches showed that it was a mess and a dead end... Compared
to v1, this one requires much more care in the MMU. The same gfn can
have two completely different mappings and you have to use the right one
when e.g. accessing the rmap. But that's pretty much the only downside.)
Paolo