From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965083AbeAJKFH (ORCPT + 1 other); Wed, 10 Jan 2018 05:05:07 -0500 Received: from bombadil.infradead.org ([65.50.211.133]:42575 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754201AbeAJKFE (ORCPT ); Wed, 10 Jan 2018 05:05:04 -0500 Date: Wed, 10 Jan 2018 11:04:57 +0100 From: Peter Zijlstra To: Tim Chen Cc: Thomas Gleixner , Andy Lutomirski , Linus Torvalds , Greg KH , Dave Hansen , Andrea Arcangeli , Andi Kleen , Arjan Van De Ven , David Woodhouse , Dan Williams , Paolo Bonzini , Ashok Raj , linux-kernel@vger.kernel.org Subject: Re: [PATCH v3 3/5] x86/enter: Use IBRS on syscall and interrupts Message-ID: <20180110100457.GA29822@worktop.programming.kicks-ass.net> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.22.1 (2013-10-16) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Return-Path: On Tue, Jan 09, 2018 at 06:26:47PM -0800, Tim Chen wrote: > Set IBRS upon kernel entrance via syscall and interrupts. Clear it > upon exit. IBRS protects against unsafe indirect branching predictions > in the kernel. > > The NMI interrupt save/restore of IBRS state was based on Andrea > Arcangeli's implementation. > Here's an explanation by Dave Hansen on why we save IBRS state for NMI. > > The normal interrupt code uses the 'error_entry' path which uses the > Code Segment (CS) of the instruction that was interrupted to tell > whether it interrupted the kernel or userspace and thus has to switch > IBRS, or leave it alone. > > The NMI code is different. It uses 'paranoid_entry' because it can > interrupt the kernel while it is running with a userspace IBRS (and %GS > and CR3) value, but has a kernel CS. If we used the same approach as > the normal interrupt code, we might do the following; > > SYSENTER_entry > <-------------- NMI HERE > IBRS=1 > do_something() > IBRS=0 > SYSRET > > The NMI code might notice that we are running in the kernel and decide > that it is OK to skip the IBRS=1. This would leave it running > unprotected with IBRS=0, which is bad. > > However, if we unconditionally set IBRS=1, in the NMI, we might get the > following case: > > SYSENTER_entry > IBRS=1 > do_something() > IBRS=0 > <-------------- NMI HERE (set IBRS=1) > SYSRET > > and we would return to userspace with IBRS=1. Userspace would run > slowly until we entered and exited the kernel again. > > Instead of those two approaches, we chose a third one where we simply > save the IBRS value in a scratch register (%r13) and then restore that > value, verbatim. > What this Changelog fails to address is _WHY_ we need this. What does this provide that retpoline does not.