From: Ben Catterall
Subject: Re: [PATCH RFC v2 0/4] HVM x86 deprivileged mode operations
Date: Tue, 8 Sep 2015 11:58:05 +0100
Message-ID: <55EEBF3D.9000909@citrix.com>
In-Reply-To: <55E9769D.30206@m2r.biz>
References: <1441296086-18209-1-git-send-email-Ben.Catterall@citrix.com> <55E9769D.30206@m2r.biz>
To: Fabio Fantoni, xen-devel@lists.xenproject.org
Cc: keir@xen.org, jbeulich@suse.com, george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com, tim@xen.org, Aravind.Gopalakrishnan@amd.com, suravee.suthikulpanit@amd.com, boris.ostrovsky@oracle.com, ian.campbell@citrix.com

Hi Fabio,

On 04/09/15 11:46, Fabio Fantoni wrote:

[snip]

>
> Sorry for my stupid questions:
> Is there a test with benchmark using qemu instead for know how is
> different? Qemu seems that emulate also some istructions cases that xen
> hypervisor doesn't for now, or I'm wrong?
>

QEMU emulates devices for HVM guests. Letting the portio operation
through to QEMU to emulate takes about 20e-6 seconds, but that includes
the time QEMU takes to actually emulate the port operation, so it is not
the 'pure' overhead. I need to do more detailed analysis to get that
figure.

> Is there any possible hardware technology or set of instructions for
> improve the operations also deprivileged or transition from Xen is
> obliged to control even mappings memory access?

We're already using syscall and sysret, the fast system call
instructions, to do the transition.
I don't have actual benchmark values for their execution time though.
We map the depriv code, stack and data sections into the monitor table
when initialising the HVM guest (user mode mapping) so Xen doesn't need
to worry about those mappings whilst executing a depriv operation.

> Is there any possible future hardware technology or set of instructions
> for take needed informations from hypervisor for executing directly all
> needed checks, them if ok and any possible exceptions/protections or
> delegate this to xen for each instruction with a tremendous impact on
> the efficiency can not be improved?

I'm not quite sure what you're asking here, sorry! Are you asking if we
can take an HVM guest instruction, analyse it to determine if it's safe
to execute and then execute it rather than emulating it?

If so: QEMU handles device emulation, and this is deliberately not done
in Xen to reduce the attack surface of the hypervisor and keep it
minimal. We do need to analyse instructions at some points (x86
emulate) but this is error prone (there's a paper or two on exploits of
this feature). This is one of the reasons for considering a depriv mode
in the first place: by moving such code into a deprivileged area, we
can prevent a bug in this code from leading to hypervisor compromise.

I'm not aware of any future hardware or set of instructions but that
doesn't mean there aren't/won't be!

> If I said only absurd things because of my knowledge too low about sorry
> for having wasted your time.
>
> Thanks for any reply and sorry for my bad english.

np, I hope I've understood correctly!

>
>>
>> Performance testing
>> -------------------
>> Performance testing indicates that the overhead for this deprivileged
>> mode depends heavily upon the processor. This overhead is the cost of
>> moving into deprivileged mode and then fully back out of deprivileged
>> mode.
>> The conclusions are that the overheads are not negligible and that
>> operations using this mechanism would benefit from being long running
>> or be high risk components. It will need to be evaluated on a
>> case-by-case basis.
>>
>> I performed 100000 writes to a single I/O port on an Intel 2.2GHz Xeon
>> E5-2407 0 processor and an AMD Opteron 2376. This was done from a
>> python script within the HVM guest using time.time() and running
>> Debian Jessie. Each write was trapped to cause a vmexit and the time
>> for each write was calculated. The port operation is bypassed so that
>> no portio is actually performed. Thus, the differences in the
>> measurements below can be taken as the pure overhead. These
>> experiments were repeated. Note that only the host and this HVM guest
>> were running (both Debian Jessie) during the experiments.
>>
>> Intel Intel 2.2GHz Xeon E5-2407 0 processor:
>> --------------------------------------------
>> 1.55e-06 seconds was the average time for performing the write without
>> the deprivileged code running.
>>
>> 5.75e-06 seconds was the average time for performing the write with
>> the deprivileged code running.
>>
>> So approximately 351% overhead
>>
>> AMD Opteron 2376:
>> -----------------
>> 1.74e-06 seconds was the average time for performing the write without
>> the deprivileged code running.
>> 3.10e-06 seconds was the average time for performing the write with an
>> entry and exit from deprivileged mode.
>>
>> So approximately 178% overhead.
>>
>> Signed-off-by: Ben Catterall
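
As a cross-check on the percentages quoted above, here is a minimal
Python sketch that re-derives them from the average timings. The
function names are hypothetical, made up for illustration, and are not
part of the patch series. The AMD figure of 178% matches the
with/without ratio (3.10e-06 / 1.74e-06), so the quoted percentages
look like ratios rather than added cost. The same calculation on the
Intel timings gives roughly 371%, so the quoted 351% may be a typo or
may have been computed from unrounded measurements.

```python
# Average per-write times in seconds, copied from the figures quoted above:
# (without deprivileged code, with deprivileged code).
TIMINGS = {
    "Intel Xeon E5-2407": (1.55e-06, 5.75e-06),
    "AMD Opteron 2376": (1.74e-06, 3.10e-06),
}


def ratio_percent(baseline_s, depriv_s):
    """Time with depriv mode expressed as a percentage of the baseline."""
    return 100.0 * depriv_s / baseline_s


def added_cost_percent(baseline_s, depriv_s):
    """Extra time due to depriv mode as a percentage of the baseline."""
    return 100.0 * (depriv_s - baseline_s) / baseline_s


if __name__ == "__main__":
    for cpu, (baseline_s, depriv_s) in TIMINGS.items():
        print("%s: ratio %.0f%%, added cost %.0f%%, delta %.2e s" % (
            cpu,
            ratio_percent(baseline_s, depriv_s),
            added_cost_percent(baseline_s, depriv_s),
            depriv_s - baseline_s,
        ))
```

The absolute difference per write (about 4.2e-06 s on the Intel machine
and about 1.4e-06 s on the AMD machine) is probably the more useful
number when judging whether a given operation is long running enough to
absorb the transition cost.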