From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH v1 1/6] x86: Add support for STAC/CLAC instructions Date: Tue, 22 Apr 2014 10:43:03 +0100 Message-ID: <535639A7.7070006@citrix.com> References: <1397566876-19631-1-git-send-email-feng.wu@intel.com> <534D0C830200007800008D72@nat28.tlf.novell.com> <53563F4D020000780000A927@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <53563F4D020000780000A927@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: Eddie Dong , Feng Wu , "Ian.Campbell@citrix.com" , Jun Nakajima , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org On 22/04/14 09:07, Jan Beulich wrote: >>>> On 22.04.14 at 09:41, wrote: >>> From: Jan Beulich [mailto:JBeulich@suse.com] >>> That said, the macro contents itself is horrible too: A control register >>> access and two conditional branches in code intended to be used in >>> fast paths? Definitely not an option. Even the simplest possible >>> solution - adding a global flag to be checked here - would already be >>> questionable. Hence I think you should at least consider porting over >>> proper instruction patching abstraction from Linux. >>> >> Jan, I did some investigation about how to handle this two instructions >> in Linux, basically, it uses the alternatives mechanism to handle these >> kind of cases. Let's take the following definition of ASM_STAC in Linux for >> example: >> >> #define ASM_CLAC \ >> 661: ASM_NOP3 ; \ >> .pushsection .altinstr_replacement, "ax" ; \ >> 662: __ASM_CLAC ; \ >> .popsection ; \ >> .pushsection .altinstructions, "a" ; \ >> altinstruction_entry 661b, 662b, X86_FEATURE_SMAP, 3, 3 ; \ >> .popsection >> >> ASM_CLAC is defined as NOP by default, it puts the real CLAC instruction in >> section "altinstr_replacement" and >> the needed information to " altinstructions " section, which is useful to >> replace the default >> definition by the alternative one. Here is the routine call path: >> start_kernel () --> check_bugs() --> alternative_instructions(). >> >> In function alternative_instructions(), it will check the related features >> in CPU, if it exists, the alternative definition will >> overwrite the default one. So there is no conditional branches after this >> replacement when the Macro is being used. >> >> Do you think we need to port this whole mechanism to Xen to support >> CLAC/STAC? I am not sure if it is a little overkilled. > Obviously we could use this machinery for other things. But whether it's > needed here depends on the alternatives. > >> BTW, from the Linux implementation, I think we don't need to check the 'cr4' >> for the macros, we just need >> to check whether the feature exists in the CPU. So is it acceptable to use >> the original code by eliminating the cr4 check? > That _might_ be acceptable if you bring it down to just the three > really necessary instructions: BT, JNC, CLAC/STAC. But the "might" > has to stand - this, after all, remains an addition of a conditional > branch (and for the performance of STAC/CLAC I haven't seen any > documentation so far either) to several fast paths, and hence the > patching alternative can't be discarded as the potentially better one. > > Jan copy_{to,from}_guest() are already long paths (particularly for HVM) so a single extra conditional is not going to be too bad (and as after boot it will remain constant, the branch predictor will have a reliable time with it). It would certainly be fine for a v1 to get SMAP support working. ~Andrew