Date: Tue, 10 Mar 2015 08:57:43 +0100
From: Ingo Molnar
To: Denys Vlasenko
Cc: Andy Lutomirski, Linus Torvalds, Steven Rostedt, Borislav Petkov,
    "H. Peter Anvin", Oleg Nesterov, Frederic Weisbecker,
    Alexei Starovoitov, Will Drewry, Kees Cook, x86@kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] x86: entry_32.S: change ESPFIX test to not touch PT_OLDSS(%esp)
Message-ID: <20150310075743.GA20041@gmail.com>
References: <1425919450-19116-1-git-send-email-dvlasenk@redhat.com>
In-Reply-To: <1425919450-19116-1-git-send-email-dvlasenk@redhat.com>

* Denys Vlasenko wrote:

> Old code was trying to avoid having three branch insns,
> but instead it has a chain of six insns where each insn
> depends on the previous one.
>
> And it was touching PT_OLDSS(%esp) unconditionally, even when it may
> contain bogus data. Elsewhere we have to jump through hoops
> just to make sure that PT_OLDSS(%esp) here is at least in a valid page.
>
> All this just to have one branch instead of three?
>
> The new code simply checks each condition.
> All three checks can run in parallel on an out-of-order CPU.
> Most of the time, none of the branches will be taken.
>
> Comparison of object code:
>
> Old:
>  1e6:   8b 44 24 38        mov    0x38(%esp),%eax
>  1ea:   8a 64 24 40        mov    0x40(%esp),%ah
>  1ee:   8a 44 24 34        mov    0x34(%esp),%al
>  1f2:   25 03 04 02 00     and    $0x20403,%eax
>  1f7:   3d 03 04 00 00     cmp    $0x403,%eax
>  1fc:   74 0f              je     20d
> New:
>  1e6:   f6 44 24 3a 02     testb  $0x2,0x3a(%esp)
>  1eb:   75 0e              jne    1fb
>  1ed:   f6 44 24 34 03     testb  $0x3,0x34(%esp)
>  1f2:   74 07              je     1fb
>  1f4:   f6 44 24 40 04     testb  $0x4,0x40(%esp)
>  1f9:   75 0f              jne    20a

Please do some benchmarking of this: a tight loop of getpid or
getppid syscalls ought to be enough to be able to time this
accurately.

Thanks,

	Ingo
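
For illustration, a tight-loop benchmark of the kind requested above could
look roughly like the sketch below. It is not taken from the thread: the
helper names now_ns() and bench_getppid_ns() are made up for this example,
and it assumes a Linux userspace with clock_gettime() and the raw syscall()
wrapper. It calls getppid through syscall(SYS_getppid) so that no libc-level
caching of the result can hide the kernel entry/exit cost.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Read the monotonic clock and return it in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: issue `iters` back-to-back getppid syscalls and
 * return the average wall-clock cost of one call in nanoseconds.
 * The raw syscall() wrapper is used so every iteration really enters
 * the kernel and goes through the syscall return path.
 */
double bench_getppid_ns(unsigned long iters)
{
	uint64_t start, end;
	unsigned long i;

	assert(iters > 0);

	start = now_ns();
	for (i = 0; i < iters; i++)
		syscall(SYS_getppid);
	end = now_ns();

	return (double)(end - start) / (double)iters;
}
```

A main() that prints bench_getppid_ns(10000000) is enough to drive it. Since
the patch touches entry_32.S, the binary would presumably need to be built
as 32-bit (for example with gcc -m32) and run several times on both the old
and the patched kernel, comparing the per-call averages.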