From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH 1/4] VMX: streamline entry.S code Date: Mon, 26 Aug 2013 14:22:39 +0100 Message-ID: <521B569F.1040408@citrix.com> References: <521786A002000078000EE064@nat28.tlf.novell.com> <5217876402000078000EE077@nat28.tlf.novell.com> <521B3197.8060408@citrix.com> <521B51B702000078000EE53E@nat28.tlf.novell.com> <521B40A4.2090704@citrix.com> <521B704C02000078000EE661@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1VDwkk-0004tv-MZ for xen-devel@lists.xenproject.org; Mon, 26 Aug 2013 13:22:42 +0000 In-Reply-To: <521B704C02000078000EE661@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: xen-devel , Keir Fraser , Eddie Dong , Jun Nakajima List-Id: xen-devel@lists.xenproject.org On 26/08/2013 14:12, Jan Beulich wrote: >>>> On 26.08.13 at 13:48, Andrew Cooper wrote: >> On 26/08/2013 12:01, Jan Beulich wrote: >>>>> -.globl vmx_asm_do_vmentry >>>>> -vmx_asm_do_vmentry: >>>> If you move the ENTRY(vmx_asm_do_vmentry) up from below, you should be >>>> able to completely drop the jmp in it. >>> That would be possible, at the expense of added padding. I prefer >>> it the way it is now, as vmx_asm_do_vmentry is not performance >>> critical (as being used exactly once per HVM vCPU). >> There are a number of places where we have ENTRY()-like constructs but >> don't want the padding with it. >> >> Would an __ENTRY() macro go down well? I can spin a patch for it. > x86 Linux has GLOBAL() for that purpose - I'd like this better than > __ENTRY() both from a name space perspective and from > describing its purpose. Ok - I will spin a patch. > >> My point about re-executing it does still apply. Looking at the code, I >> do not believe it is correct to be executing vmx_intr_assist or >> nvmx_switch_guest multiple times on a context switch to an HVM VCPU. >> vmx_intr_assist at the very least has a huge amount of work to do before >> it considers exiting. >> >> It does appear that there is possible interaction between do_softirq() >> and vmx_intr_assist(), at which point vmx_intr_assist() should be run >> after do_softirq(), which removes the apparently redundant run with >> interrupts enabled. > None of this seems related to the patch anymore - if you think > there's more stuff that needs changing, let's discuss this in a > separate thread. Certainly. > >>> The %cr2 write's move is indeed debatable - I tried to get it farther >>> away from the producer of the data in %rax, but it's not clear >>> whether that's very useful. The second purpose was to get >>> something interleaved with the many "pop"s, so that the CPU can >>> get busy other than just its memory load ports. If controversial >>> I'm fine with undoing that change. >> From my understanding of a serialising instruction, it forces the >> completion of all previous instructions before starting, and prevents >> the issue of any subsequent instructions until it itself has completed. >> >> Therefore, I doubt it has the intended effect. > Wait - this is again also a separation from the producer of the > data. Whether modern CPUs can deal with that I'm not sure, > but it surely doesn't hurt to hide eventual latency. > > Jan > For non-serialising instructions, it is a good idea (and likely some a compiler would anyway). Moving the GET_CURRENT() will probably be quite effective as most subsequent instructions depend on it. Serialising instructions on the other hand will not be affected by these issues (given their nature), but I would prefer to defer judgement to someone who has a better idea of the microarchitectural implications. Either way, as the concerns are now just down to playing with the optimal static instruction scheduling, Reviewed-by: Andrew Cooper