From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Han, Huaitong"
Subject: Re: [V3 PATCH 7/9] x86/hvm: pkeys, add pkeys support for guest_walk_tables
Date: Thu, 17 Dec 2015 09:18:56 +0000
Message-ID: <1450343942.3809.8.camel@intel.com>
References: <1449479780-19146-1-git-send-email-huaitong.han@intel.com>
 <1449479780-19146-8-git-send-email-huaitong.han@intel.com>
 <5669C23F.6080203@citrix.com>
 <566AA4D702000078000BE866@prv-mh.provo.novell.com>
 <1450167259.10563.9.camel@intel.com>
 <566FE51902000078000BF67A@prv-mh.provo.novell.com>
 <1450253821.4539.3.camel@intel.com>
 <56712FAB02000078000C004B@prv-mh.provo.novell.com>
 <1450256625.4539.14.camel@intel.com>
 <5671392502000078000C0093@prv-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <5671392502000078000C0093@prv-mh.provo.novell.com>
Content-Language: en-US
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "JBeulich@suse.com"
Cc: "Tian, Kevin", "wei.liu2@citrix.com", "ian.campbell@citrix.com",
 "stefano.stabellini@eu.citrix.com", "george.dunlap@eu.citrix.com",
 "andrew.cooper3@citrix.com", "ian.jackson@eu.citrix.com",
 "george.dunlap@citrix.com", "xen-devel@lists.xen.org",
 "Nakajima, Jun", "keir@xen.org"
List-Id: xen-devel@lists.xenproject.org

On Wed, 2015-12-16 at 02:12 -0700, Jan Beulich wrote:
> > > > On 16.12.15 at 10:03, wrote:
> > On Wed, 2015-12-16 at 01:32 -0700, Jan Beulich wrote:
> > > > > > On 16.12.15 at 09:16, wrote:
> > > > On Tue, 2015-12-15 at 02:02 -0700, Jan Beulich wrote:
> > > > > Well, I wouldn't want you to introduce a brand new function, but
> > > > > instead just factor out the necessary piece from xsave() (making
> > > > > the new one take a struct xsave_struct * instead of a struct vcpu *,
> > > > > and calling it from what is now xsave()).
> > > > So the function looks like this:
> > > > unsigned int get_xsave_pkru(struct vcpu *v)
> > > > {
> > > >     void *offset;
> > > >     struct xsave_struct *xsave_area;
> > > >     uint64_t mask = XSTATE_PKRU;
> > > >     unsigned int index = fls64(mask) - 1;
> > > >     unsigned int pkru = 0;
> > > >
> > > >     if ( !cpu_has_xsave )
> > > >         return 0;
> > > >
> > > >     BUG_ON(xsave_cntxt_size < XSTATE_AREA_MIN_SIZE);
> > > >     xsave_area = _xzalloc(xsave_cntxt_size, 64);
> > > >     if ( xsave_area == NULL )
> > > >         return 0;
> > > >
> > > >     xsave(xsave_area, mask);
> > > >     offset = (void *)xsave_area + (xsave_area_compressed(xsave) ?
> > > >             XSTATE_AREA_MIN_SIZE : xstate_offsets[index] );
> > > >     memcpy(&pkru, offset, sizeof(pkru));
> > > >
> > > >     xfree(xsave_area);
> > > >
> > > >     return pkru;
> > > > }
> > >
> > > Depending on how frequently this might get called, the allocation
> > > overhead may not be tolerable. I.e. you may want to set up e.g.
> > > a per-CPU buffer up front. Or you check whether using RDPKRU
> > > (with temporarily setting CR4.PKE) is cheaper than what you
> > > do right now.
> > RDPKRU does cost less than the function, and if temporarily setting
> > CR4.PKE is accepted, I will use RDPKRU instead of the function.
>
> The question isn't just the RDPKRU cost, but that of the two CR4
> writes.

Test results, measured with NOW() over 10 executions of each function:

(XEN)xsave time is 1376 ns
(XEN)read_pkru time is 28 ns

So read_pkru(), which includes the two CR4 writes, does cost less than
get_xsave_pkru(), and I will use read_pkru().
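
For reference, the offset selection in the quoted get_xsave_pkru() can be
sketched in user space as follows. This is only an illustration under
stated assumptions, not Xen code: area_is_compacted() and pkru_from_area()
are hypothetical helper names, std_offset stands in for
xstate_offsets[index] (which real code obtains from CPUID leaf 0xD,
sub-leaf 9), and PKRU is assumed to be the only extended component saved,
so that in the compacted format it directly follows the 64-byte header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Legacy region (512 bytes) plus XSAVE header (64 bytes); the same value
 * as XSTATE_AREA_MIN_SIZE in the quoted code. */
#define XSAVE_MIN_SIZE      576u
/* XCOMP_BV is the second 64-bit word of the XSAVE header. */
#define XSAVE_XCOMP_BV_OFF  (512u + 8u)
/* Bit 63 of XCOMP_BV marks the compacted format. */
#define XCOMP_BV_COMPACTED  (UINT64_C(1) << 63)

/* Hypothetical helper: true if the area uses the compacted format. */
static bool area_is_compacted(const uint8_t *area)
{
    uint64_t xcomp_bv;

    assert(area != NULL);
    memcpy(&xcomp_bv, area + XSAVE_XCOMP_BV_OFF, sizeof(xcomp_bv));
    return (xcomp_bv & XCOMP_BV_COMPACTED) != 0;
}

/*
 * Hypothetical helper: read the 32-bit PKRU value from an XSAVE area.
 * std_offset is the standard-format offset of the PKRU component.
 * Assumption: PKRU is the only extended component in the area, so the
 * compacted-format offset is simply XSAVE_MIN_SIZE.
 */
static uint32_t pkru_from_area(const uint8_t *area, size_t std_offset)
{
    size_t off = area_is_compacted(area) ? XSAVE_MIN_SIZE : std_offset;
    uint32_t pkru;

    memcpy(&pkru, area + off, sizeof(pkru));
    return pkru;
}
```

The helper takes the buffer rather than a vcpu, which is roughly the
shape Jan suggested for the factored-out piece of xsave().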
Test code:

 static void reboot_machine(unsigned char key, struct cpu_user_regs *regs)
 {
-    printk("'%c' pressed -> rebooting machine\n", key);
-    machine_restart(0);
+    // printk("read pkru test\n", key);
+    s_time_t pre, last;
+    unsigned int pkru = 0;
+    unsigned int i;
+    void *offset;
+    struct xsave_struct *save_area = _xzalloc(1024, 64);
+
+    /* Time 10 xsave-based reads of PKRU. */
+    pre = NOW();
+    for ( i = 0; i < 10; i++ )
+    {
+        xsave(current, XSTATE_PKRU, save_area);
+        offset = (void *)save_area + XSTATE_AREA_MIN_SIZE;
+        memcpy(&pkru, offset, sizeof(pkru));
+    }
+    last = NOW();
+    printk("xsave time is %lu\n", (last - pre));
+
+    /* Time 10 RDPKRU reads; i must restart from 0 or this loop never runs. */
+    pre = NOW();
+    for ( i = 0; i < 10; i++ )
+        pkru = read_pkru();
+    last = NOW();
+    printk("read_pkru time is %lu\n", (last - pre));
+
+    xfree(save_area);
+    // machine_restart(0);
 }

static inline unsigned int read_pkru(void)
{
    unsigned int pkru;

    /* RDPKRU raises #UD unless CR4.PKE is set, so enable it temporarily. */
    set_in_cr4(X86_CR4_PKE);
    /* RDPKRU: ECX must be 0; PKRU is returned in EAX and EDX is zeroed. */
    asm volatile (".byte 0x0f,0x01,0xee"
                  : "=a" (pkru) : "c" (0) : "dx");
    clear_in_cr4(X86_CR4_PKE);

    return pkru;
}

>
> Jan
>
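
The two timing loops in the test code can be mirrored in user space with
a small harness like the one below. It is a sketch under assumptions, not
the code that produced the numbers in this mail: now_ns() stands in for
NOW() and uses POSIX clock_gettime(), and dummy_op() is a hypothetical
placeholder for the operation being measured. The loop counter is local
to time_op_ns(), so every measurement starts from zero and runs the full
number of iterations.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the operation under test; counts its calls. */
static unsigned int calls;

static unsigned int dummy_op(void)
{
    return ++calls;
}

/* Monotonic timestamp in nanoseconds (user-space analogue of NOW()). */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/*
 * Time `iterations` calls of op() and return the elapsed nanoseconds.
 * The counter i is initialised inside the loop header, so back-to-back
 * measurements cannot share (and exhaust) one counter.
 */
static int64_t time_op_ns(unsigned int (*op)(void), unsigned int iterations)
{
    volatile unsigned int sink = 0;  /* keeps the calls from being elided */
    int64_t start = now_ns();

    for (unsigned int i = 0; i < iterations; i++)
        sink = op();
    (void)sink;

    return now_ns() - start;
}
```

Dividing the returned value by the iteration count gives a per-call
figure, which is easier to compare across runs than the 10-call total.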