From mboxrd@z Thu Jan 1 00:00:00 1970
From: Valentine Sinitsyn
Subject: Re: SVM: vmload/vmsave-free VM exits?
Date: Tue, 07 Apr 2015 11:10:43 +0500
Message-ID: <552374E3.1060707@gmail.com>
References: <5520F2C8.7090102@web.de> <55216CE5.9000504@gmail.com> <55236E6F.7090705@web.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
To: Jan Kiszka, kvm, Jailhouse
In-Reply-To: <55236E6F.7090705@web.de>
List-Id: kvm.vger.kernel.org

Hi Jan,

On 07.04.2015 10:43, Jan Kiszka wrote:
> On 2015-04-05 19:12, Valentine Sinitsyn wrote:
>> Hi Jan,
>>
>> On 05.04.2015 13:31, Jan Kiszka wrote:
>>> studying the VM exit logic of Jailhouse, I was wondering when AMD's
>>> vmload/vmsave can be avoided. Jailhouse as well as KVM currently use
>>> these instructions unconditionally. However, I think both only need
>>> GS.base, i.e. the per-cpu base address, to be saved and restored if no
>>> user space exit or no CPU migration is involved (both is always true for
>>> Jailhouse). Xen avoids vmload/vmsave on lightweight exits but it also
>>> still uses rsp-based per-cpu variables.
>>>
>>> So the question boils down to what is generally faster:
>>>
>>> A) vmload
>>>    vmrun
>>>    vmsave
>>>
>>> B) wrmsrl(MSR_GS_BASE, guest_gs_base)
>>>    vmrun
>>>    rdmsrl(MSR_GS_BASE, guest_gs_base)
>>>
>>> Of course, KVM also has to take into account that heavyweight exits
>>> still require vmload/vmsave, thus become more expensive with B) due to
>>> the additional MSR accesses.
>>>
>>> Any thoughts or results of previous experiments?
>> That's a good question, I also thought about it when I was finalizing
>> Jailhouse AMD port. I tried "lightweight exits" with apic-demo but it
>> didn't seem to affect the latency in any noticeable way. That's why I
>> decided not to push the patch (in fact, I was even unable to find it now).
>>
>> Note however that how AMD chips store host state during VM switches are
>> implementation-specific. I did my quick experiments on one CPU only, so
>> your mileage may vary.
>>
>> Regarding your question, I feel B will be faster anyways but again I'm
>> afraid that the gain could be within statistical error of the experiment.
>
> It is, at least 160 cycles with hot caches on an AMD A6-5200 APU, more
> towards 600 if they are colder (added some usleep to each loop in the test).

Great, thanks. Could you post absolute numbers, i.e., how long do A and B
take on your CPU?

Valentine