From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Message-ID: <542C2916.7010500@zytor.com>
Date: Wed, 01 Oct 2014 09:17:26 -0700
From: "H. Peter Anvin"
MIME-Version: 1.0
To: Andy Lutomirski
CC: Sebastian Lackner, X86 ML, Thomas Gleixner, Anish Bhatt, Ingo Molnar,
 "linux-kernel@vger.kernel.org", Chuck Ebbert, stable
Subject: Re: [PATCH v2 1/2] x86_64,entry: Filter RFLAGS.NT on entry from userspace
References: <0e906bdeba3660c9766248d3d7229e78a423ca5b.1412138935.git.luto@amacapital.net>
 <542C1C28.9050408@zytor.com> <542C1D2E.9050005@zytor.com>
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:

On 10/01/2014 09:04 AM, Andy Lutomirski wrote:
>
> Agner Fog's tables for Sandy Bridge have 9 uops for popf and
> reciprocal throughput 18. sti isn't listed for Sandy Bridge or
> anything similar, but cld is 3 uops with reciprocal throughput 4.
> Also, popf accesses rsp, and the sysenter code is very heavy on stack
> manipulation.
>

It does a stack operation. Newer CPUs optimize stack accesses pretty
heavily. That doesn't mean back-to-back push/pop are all that
optimized; I wonder if it would help to separate them.

popf is unlikely to ever be all that fast.

	-hpa