From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752856AbcAUW5R (ORCPT ); Thu, 21 Jan 2016 17:57:17 -0500 Received: from mx2.suse.de ([195.135.220.15]:57257 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752670AbcAUW5M (ORCPT ); Thu, 21 Jan 2016 17:57:12 -0500 Date: Thu, 21 Jan 2016 23:56:41 +0100 From: Borislav Petkov To: "H. Peter Anvin" Cc: Andy Lutomirski , Brian Gerst , the arch/x86 maintainers , Linux Kernel Mailing List , Ingo Molnar , Denys Vlasenko , Linus Torvalds Subject: Re: [PATCH] x86: static_cpu_has_safe: discard dynamic check after init Message-ID: <20160121225641.GC300@pd.tnic> References: <569D40CE.5090506@zytor.com> <20160118230554.GJ12651@pd.tnic> <3D4E057B-AB03-4C12-B59D-774E8954C742@zytor.com> <20160118232547.GK12651@pd.tnic> <20160119135714.GD15071@pd.tnic> <569F072B.1020504@zytor.com> <20160120103345.GA23350@pd.tnic> <108BC768-CF19-4F71-BF6D-70FF2252ADB8@zytor.com> <20160121221442.GB300@pd.tnic> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jan 21, 2016 at 02:22:28PM -0800, H. Peter Anvin wrote: > Yes, having t_no as the fallthrough case ought to move the yes code > out of line. Dunno, maybe I'm doing something wrong: I have this change: --- diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 7f09de998c93..f9833fcb8fcb 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -175,10 +175,10 @@ static __always_inline __pure bool _static_cpu_has(u16 bit) [bitnum] "i" (1 << (bit & 7)), [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3]) : : t_yes, t_no); - t_yes: - return true; t_no: return false; + t_yes: + return true; } #define static_cpu_has(bit) \ --- and the resulting code looks even wrong (or my brain is fried for today - one of the two). vmlinux: ffffffff810046ae: e9 cc 0e de 00 jmpq ffffffff81de557f <__alt_instructions_end+0x7aa> ffffffff810046b3: 48 83 c4 18 add $0x18,%rsp ffffffff810046b7: 4c 89 e0 mov %r12,%rax ffffffff810046ba: 5b pop %rbx ffffffff810046bb: 41 5c pop %r12 ffffffff810046bd: 41 5d pop %r13 ffffffff810046bf: 41 5e pop %r14 ffffffff810046c1: 41 5f pop %r15 ffffffff810046c3: 5d pop %rbp ffffffff810046c4: c3 retq dynamic branch: ffffffff81de557f: f6 05 8f de d1 ff 01 testb $0x1,-0x2e2171(%rip) # ffffffff81b03415 ffffffff81de5586: 0f 85 f6 f1 21 ff jne ffffffff81004782 <__switch_to+0x332> ffffffff81de558c: e9 22 f1 21 ff jmpq ffffffff810046b3 <__switch_to+0x263> after X86_FEATURE_ALWAYS patching: [ 0.288007] apply_alternatives: feat: 3*32+21, old: (ffffffff810046ae, len: 5), repl: (ffffffff81de4dff, len: 5), pad: 0 [ 0.292004] ffffffff810046ae: old_insn: e9 cc 0e de 00 [ 0.300013] ffffffff81de4dff: rpl_insn: e9 af f8 21 ff [ 0.308006] recompute_jump: target RIP: ffffffff810046b3, new_displ: 0x5 [ 0.312006] recompute_jump: final displ: 0x00000003, JMP 0xffffffff810046b3 [ 0.316006] ffffffff810046ae: final_insn: eb 03 0f 1f 00 ffffffff810046ae: eb 03 0f 1f 00 jmp ffffffff810046b3 --- ffffffff810046b3: 48 83 c4 18 add $0x18,%rsp <-- ffffffff810046b7: 4c 89 e0 mov %r12,%rax ffffffff810046ba: 5b pop %rbx ffffffff810046bb: 41 5c pop %r12 ffffffff810046bd: 41 5d pop %r13 ffffffff810046bf: 41 5e pop %r14 ffffffff810046c1: 41 5f pop %r15 ffffffff810046c3: 5d pop %rbp ffffffff810046c4: c3 retq so this is silly: we're basically jumping after the JMP instruction itself. So that will be the case on !X86_BUG_SYSRET_SS_ATTRS CPUs. Still a two-byte and now even a useless JMP. The right thing to do would be to generate a NOP simply. On X86_BUG_SYSRET_SS_ATTRS CPUs: [ 0.322014] apply_alternatives: feat: 16*32+8, old: (ffffffff810046ae, len: 5), repl: (ffffffff81de3962, len: 0), pad: 0 [ 0.324005] ffffffff810046ae: old_insn: eb 03 0f 1f 00 [ 0.332006] ffffffff810046ae: final_insn: 0f 1f 44 00 00 ffffffff810046ae: 0f 1f 44 00 00 nop ffffffff810046b3: 48 83 c4 18 add $0x18,%rsp ffffffff810046b7: 4c 89 e0 mov %r12,%rax ffffffff810046ba: 5b pop %rbx ffffffff810046bb: 41 5c pop %r12 ffffffff810046bd: 41 5d pop %r13 ffffffff810046bf: 41 5e pop %r14 ffffffff810046c1: 41 5f pop %r15 ffffffff810046c3: 5d pop %rbp ffffffff810046c4: c3 retq which is actually even wrong! What it should've done is jne ffffffff81004782 as the dynamic code did. At that address we have the ss fixup: ffffffff81004782: 66 8c d0 mov %ss,%ax ffffffff81004785: 66 83 f8 18 cmp $0x18,%ax ffffffff81004789: 0f 84 24 ff ff ff je ffffffff810046b3 <__switch_to+0x263> ffffffff8100478f: b8 18 00 00 00 mov $0x18,%eax ffffffff81004794: 8e d0 mov %eax,%ss ffffffff81004796: e9 18 ff ff ff jmpq ffffffff810046b3 <__switch_to+0x263> with the jump back to the ret code. Which means, !X86_BUG_SYSRET_SS_ATTRS CPUs get to do a forward and a backward JMP. So even if it did the right thing, it would be two JMPs. Meh. I need to think about something better. -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --