From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S965029AbbD0Sjm (ORCPT ); Mon, 27 Apr 2015 14:39:42 -0400
Received: from mail.skyhub.de ([78.46.96.112]:36286 "EHLO mail.skyhub.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S964882AbbD0SjM (ORCPT ); Mon, 27 Apr 2015 14:39:12 -0400
Date: Mon, 27 Apr 2015 20:38:54 +0200
From: Borislav Petkov
To: Linus Torvalds
Cc: Andy Lutomirski , Andy Lutomirski , X86 ML , "H. Peter Anvin" ,
        Denys Vlasenko , Brian Gerst , Denys Vlasenko , Ingo Molnar ,
        Steven Rostedt , Oleg Nesterov , Frederic Weisbecker ,
        Alexei Starovoitov , Will Drewry , Kees Cook ,
        Linux Kernel Mailing List
Subject: Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor
        attribute issue
Message-ID: <20150427183854.GG28871@pd.tnic>
References: <5d120f358612d73fc909f5bfa47e7bd082db0af0.1429841474.git.luto@kernel.org>
        <20150425211206.GE32099@pd.tnic>
        <20150427085305.GB6774@pd.tnic>
        <20150427113506.GG6774@pd.tnic>
        <20150427154631.GB28871@pd.tnic>
        <20150427164024.GD28871@pd.tnic>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Apr 27, 2015 at 11:14:15AM -0700, Linus Torvalds wrote:
> Btw, please don't use the "more than three 66h overrides" version.

Oh yeah, a notorious "frontend choker".

> Sure, that's what the optimization manual suggests if you want
> single-instruction decode for all sizes up to 15 bytes, but I think
> we're better off with the two-nop case for sizes 12-15) (4-byte nop
> followed by 8-11 byte nop).

Yeah, so says the manual. Although I wouldn't trust those manuals
blindly but that's another story.

> Because the "more than three 66b prefixes" really performs abysmally
> on some cores, iirc.

Right.
So our current NOP infrastructure emits NOPs of at most ASM_NOP_MAX (8)
bytes; without more invasive changes, our longest NOPs are 8 bytes long
and then we have to repeat. This is consistent with what the code looks
like here after alternatives application:

ffffffff815b9084 :
...
ffffffff815b90ac:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
ffffffff815b90b3:       00
ffffffff815b90b4:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
ffffffff815b90bb:       00
ffffffff815b90bc:       90                      nop

You can recognize the p6_nops as being the same as the F16h ones
suggested in the manual. :-)

I'm running them now and will report numbers relative to the last run
once it is done. And those numbers should in practice get even better if
we revert to the simpler canonical-ness check, but let's see...

Thanks.

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
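For illustration, the "8 bytes and then repeat" behaviour described in this mail can be sketched in userspace C. This is a minimal sketch under stated assumptions, not the kernel's actual code: fill_nops(), NOP_MAX and nop_table are hypothetical names chosen here, and the table holds the commonly documented P6-style NOP encodings for lengths 1 to 8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Longest single NOP this sketch emits; mirrors the 8-byte cap in the mail. */
#define NOP_MAX 8

/* P6-style NOP encodings indexed by length (index 0 is unused). */
static const unsigned char nop_table[NOP_MAX + 1][NOP_MAX] = {
    {0},
    {0x90},
    {0x66, 0x90},
    {0x0f, 0x1f, 0x00},
    {0x0f, 0x1f, 0x40, 0x00},
    {0x0f, 0x1f, 0x44, 0x00, 0x00},
    {0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00},
    {0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00},
    {0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
};

/*
 * Fill buf[0..len) with NOPs, always taking the longest NOP that still
 * fits (capped at NOP_MAX), and return the number of NOP instructions
 * written. Hypothetical helper for illustration only.
 */
static size_t fill_nops(unsigned char *buf, size_t len)
{
    size_t count = 0;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        size_t chunk = len > NOP_MAX ? NOP_MAX : len;

        memcpy(buf, nop_table[chunk], chunk);
        buf += chunk;
        len -= chunk;
        count++;
    }
    return count;
}
```

Called on a 17-byte region, the sketch writes two 8-byte nopl encodings followed by a single-byte nop, which is the same 8 + 8 + 1 split visible in the disassembly above.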