From mboxrd@z Thu Jan 1 00:00:00 1970 From: Paolo Bonzini Subject: Re: [PATCH 12/13] KVM: x86: add KVM_MEM_X86_SMRAM memory slot flag Date: Wed, 06 May 2015 11:47:35 +0200 Message-ID: <5549E337.1090606@redhat.com> References: <1430393772-27208-1-git-send-email-pbonzini@redhat.com> <1430393772-27208-13-git-send-email-pbonzini@redhat.com> <20150505171747.GB17198@potion.brq.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, bsd@redhat.com, guangrong.xiao@linux.intel.com, Yang Zhang , wanpeng.li@linux.intel.com To: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= Return-path: In-Reply-To: <20150505171747.GB17198@potion.brq.redhat.com> Sender: linux-kernel-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On 05/05/2015 19:17, Radim Kr=C4=8Dm=C3=A1=C5=99 wrote: > 2015-04-30 13:36+0200, Paolo Bonzini: >> This adds an arch-specific memslot flag that hides slots unless the >> VCPU is in system management mode. >> >> Some care is needed in order to limit the overhead of x86_gfn_to_mem= slot >> when compared with gfn_to_memslot. Thankfully, we have __gfn_to_mem= slot >> and search_memslots which are the same, so we can add some extra out= put >> to search_memslots. The compiler will optimize it as dead code in >> __gfn_to_memslot, and will use it to thread jumps in x86_gfn_to_mems= lot. >> >> Signed-off-by: Paolo Bonzini >> --- >> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtu= al/kvm/api.txt >> @@ -19,10 +19,23 @@ >> =20 >> #include >> #include >> +#include "kvm_cache_regs.h" >> =20 >> struct kvm_memory_slot *x86_gfn_to_memslot(struct kvm_vcpu *vcpu, g= fn_t gfn) >> { >> - struct kvm_memory_slot *slot =3D gfn_to_memslot(vcpu->kvm, gfn); >> + /* By using search_memslots directly the compiler can optimize awa= y >> + * the "if (found)" check below. >> + * >> + * It cannot do the same for gfn_to_memslot because it is not inli= ned, >> + * and it also cannot do the same for __gfn_to_memslot because the >> + * kernel is compiled with -fno-delete-null-pointer-checks. >> + */ >> + bool found; >> + struct kvm_memslots *memslots =3D kvm_memslots(vcpu->kvm); >> + struct kvm_memory_slot *slot =3D search_memslots(memslots, gfn, &f= ound); >> + >> + if (found && unlikely(slot->flags & KVM_MEM_X86_SMRAM) && !is_smm(= vcpu)) >> + return NULL; >=20 > Patch [10/13] made me sad and IIUIC, the line above is the only reaso= n > for it ... Yes, all the differences trickle down to using x86_gfn_to_memslot. On the other hand, there are already cut-and-pasted loops for guest=20 memory access, see kvm_write_guest_virt_system or=20 kvm_read_guest_virt_helper. We could add __-prefixed macros like #define __kvm_write_guest(fn_page, gpa, data, len, args...) \ ({ \ gpa_t _gpa =3D (gpa); \ void *_data =3D (data); \ int _len =3D (len); \ gfn_t _gfn =3D _gpa >> PAGE_SHIFT; \ int _offset =3D offset_in_page(_gpa); \ int _seg, _ret; \ while ((_seg =3D next_segment(_len, _offset)) !=3D 0) { \ _ret =3D (fn_page)(args##, _gfn, _data, _offset, _seg)= ; \ if (_ret < 0) \ break; \ _offset =3D 0; \ _len -=3D _seg; \ _data +=3D _seg; \ ++_gfn; \ } \ _ret; \ }) =2E.. int x86_write_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, const void *= data, int offset, unsigned long len) { return __kvm_write_guest_page(x86_gfn_to_memslot, gfn, data, offset, l= en, vcpu); } int x86_write_guest_cached(struct kvm_vcpu *vcpu, struct gfn_to_hva_cac= he *ghc, const void *data, unsigned long len) { return __kvm_write_guest_cached(x86_gfn_to_hva_cache_init, x86_write_guest, ghc, data, len, vcpu); } int x86_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, const void *data, unsigned long len) { return __kvm_write_guest(x86_write_guest_page, gpa, data, len, vcpu); } but frankly it seems worse than the disease. what about renaming and changing kvm_* memory function to > vcpu_* and create=20 > bool kvm_arch_vcpu_can_access_slot(vcpu, slot) > which could also be inline in arch/*/include/asm/kvm_host.h thanks to > the way we build. > We could be passing both kvm and vcpu in internal memslot operations = and > not checking if vcpu is NULL. This should allow all possible operati= ons > with little code duplication and the compiler could also optimize the > case where vcpu is NULL. That would be a huge patch, and most architectures do not (yet) need it= =2E I can change the functions to kvm_vcpu_read_* and when a second archite= cture needs it, we move it from arch/x86/kvm/ to virt/kvm. I named it x86_ j= ust because it was the same length as kvm_ and thus hardly needed reindenta= tion. > Another option is adding something like "vcpu kvm_arch_fake_vcpu(kvm)= " > for cases where the access doesn't have an associated vcpu, so it wou= ld > always succeed. (Might not be generic enough.) That's ugly... The question is also how often the copied code is changed, and the answ= er is that most of it was never changed since it was introduced in 2007 (commit 195aefde9cc2, "KVM: Add general accessors to read and write gue= st memory"). Before then, KVM used kmap_atomic directly! Only the cache code is more recent, but that also has only been changed= a couple of times after introducing it in 2010 (commit 49c7754ce570, "KVM= : Add memory slot versioning and use it to provide fast guest write inter= face"). It is very stable code. Paolo