From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ben Catterall <Ben.Catterall@citrix.com>
Subject: Re: [PATCH RFC v2 0/4] HVM x86 deprivileged mode
	operations
Date: Tue, 8 Sep 2015 11:58:05 +0100
Message-ID: <55EEBF3D.9000909@citrix.com>
References: <1441296086-18209-1-git-send-email-Ben.Catterall@citrix.com>
	<55E9769D.30206@m2r.biz>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <prvs=686e157d6=ben.catterall@citrix.com>)
	id 1ZZGbk-0000in-SL
	for xen-devel@lists.xenproject.org; Tue, 08 Sep 2015 10:58:37 +0000
In-Reply-To: <55E9769D.30206@m2r.biz>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Fabio Fantoni <fabio.fantoni@m2r.biz>, xen-devel@lists.xenproject.org
Cc: keir@xen.org, jbeulich@suse.com, george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com, tim@xen.org, Aravind.Gopalakrishnan@amd.com, suravee.suthikulpanit@amd.com, boris.ostrovsky@oracle.com, ian.campbell@citrix.com
List-Id: xen-devel@lists.xenproject.org


Hi Fabio,

On 04/09/15 11:46, Fabio Fantoni wrote:
[snip]
>
> Sorry for my stupid questions:
> Is there a test with benchmark using qemu instead for know how is
> different? Qemu seems that emulate also some istructions cases that xen
> hypervisor doesn't for now, or I'm wrong?
>
QEMU emulates devices for HVM guests. Letting the portio operation
through to QEMU to emulate takes about 20e-6 seconds, but that includes
the time QEMU takes to actually emulate the port operation, so it is not
the 'pure' overhead. I need to do more detailed analysis to get that
figure.
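
For comparison, here is a rough sketch of how the relative cost can be
worked out from the rounded averages quoted further down in this mail.
The helper name overhead() and the dictionary layout are just for
illustration, and because the inputs are the rounded averages, the
results need not match percentages derived from the raw data.

```python
def overhead(baseline_s, measured_s):
    """Return (ratio, extra_percent) of measured time relative to baseline."""
    ratio = measured_s / baseline_s
    return ratio, (ratio - 1.0) * 100.0


# (average without depriv code, average with depriv code), in seconds,
# copied from the rounded figures in the quoted performance section.
measurements = {
    "Intel Xeon E5-2407": (1.55e-06, 5.75e-06),
    "AMD Opteron 2376": (1.74e-06, 3.10e-06),
}

for cpu, (base, depriv) in measurements.items():
    ratio, extra = overhead(base, depriv)
    added_us = (depriv - base) * 1e6  # absolute added cost per write
    print(f"{cpu}: {ratio:.2f}x baseline, "
          f"{extra:.0f}% extra, {added_us:.2f} us added per write")
```

With those rounded inputs the depriv path comes out at roughly 3.7x the
baseline on the Intel box and roughly 1.8x on the AMD one. The same
subtraction would apply to the QEMU figure once its emulation time has
been separated out.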

> Is there any possible hardware technology or set of instructions for
> improve the operations also deprivileged or transition from Xen is
> obliged to control even mappings memory access?

We're already using syscall and sysret, the fast system call
instructions, to do the transition. I don't have actual benchmark values
for their execution time though. We map the depriv code, stack and data
sections into the monitor table (as user mode mappings) when
initialising the HVM guest, so Xen doesn't need to worry about those
mappings whilst executing a depriv operation.

> Is there any possible future hardware technology or set of instructions
> for take needed informations from hypervisor for executing directly all
> needed checks, them if ok and any possible exceptions/protections or
> delegate this to xen for each instruction with a tremendous impact on
> the efficiency can not be improved?
I'm not quite sure what you're asking here, sorry! Are you asking if we 
can take an HVM guest instruction, analyse it to determine if it's safe 
to execute and then execute it rather than emulating it? If so:

QEMU handles device emulation, and this is deliberately not done in Xen
to reduce the attack surface of the hypervisor and keep it minimal. We
do need to analyse instructions at some points (x86 emulate) but this is
error-prone (there's a paper or two on exploits of this feature). This
is one of the reasons for considering a depriv mode in the first place:
by moving such code into a deprivileged area, we can prevent a bug in
this code from leading to hypervisor compromise. I'm not aware of any
future hardware or set of instructions for this, but that doesn't mean
there aren't/won't be!

> If I said only absurd things because of my knowledge too low about sorry
> for having wasted your time.
>
> Thanks for any reply and sorry for my bad english.
np, I hope I've understood correctly!
>
>>
>> Performance testing
>> -------------------
>> Performance testing indicates that the overhead for this deprivileged
>> mode depends heavily upon the processor. This overhead is the cost of
>> moving into deprivileged mode and then fully back out of deprivileged
>> mode. The conclusions are that the overheads are not negligible and that
>> operations using this mechanism would benefit from being long running or
>> being high-risk components. It will need to be evaluated on a
>> case-by-case basis.
>>
>> I performed 100000 writes to a single I/O port on an Intel 2.2GHz Xeon
>> E5-2407 0 processor and an AMD Opteron 2376. This was done from a python
>> script within the HVM guest using time.time() and running Debian Jessie.
>> Each write was trapped to cause a vmexit and the time for each write was
>> calculated. The port operation is bypassed so that no portio is actually
>> performed. Thus, the differences in the measurements below can be taken
>> as the pure overhead. These experiments were repeated. Note that only
>> the host and this HVM guest were running (both Debian Jessie) during the
>> experiments.
>>
>> Intel 2.2GHz Xeon E5-2407 0 processor:
>> --------------------------------------
>> 1.55e-06 seconds was the average time for performing the write without
>>           the deprivileged code running.
>>
>> 5.75e-06 seconds was the average time for performing the write with the
>>           deprivileged code running.
>>
>> So approximately 351% overhead
>>
>> AMD Opteron 2376:
>> -----------------
>> 1.74e-06 seconds was the average time for performing the write without
>>           the deprivileged code running.
>> 3.10e-06 seconds was the average time for performing the write with an
>>           entry and exit from deprivileged mode.
>>
>> So approximately 178% overhead.
>>
>> Signed-off-by: Ben Catterall <Ben.Catterall@citrix.com>
>>
>