From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
Date: Tue, 23 Jan 2018 11:23:18 +0100
Message-ID: <20180123102318.airsvcl5uckguo2z@gmail.com>
References: <1516476182-5153-10-git-send-email-karahmed@amazon.de>
 <1516566497.9814.78.camel@infradead.org>
 <1516572013.9814.109.camel@infradead.org>
 <1516638426.9521.20.camel@infradead.org>
 <20180123072930.soz25cyky3u4hpgv@gmail.com>
 <20180123075358.nztpyxympwfkyi2a@gmail.com>
 <1516699832.9521.123.camel@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Cc: Linus Torvalds , KarimAllah Ahmed , Linux Kernel Mailing List ,
 Andi Kleen , Andrea Arcangeli , Andy Lutomirski , Arjan van de Ven ,
 Ashok Raj , Asit Mallick , Borislav Petkov , Dan Williams , Dave Hansen ,
 Greg Kroah-Hartman , "H . Peter Anvin" , Ingo Molnar ,
 Janakarajan Natarajan , Joerg Roedel , Jun Nakajima , Laura Abbott ,
To: David Woodhouse
Return-path:
Content-Disposition: inline
In-Reply-To: <1516699832.9521.123.camel@infradead.org>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

* David Woodhouse wrote:

> > On SkyLake this would add an overhead of maybe 2-3 cycles per function call and
> > obviously all this code and data would be very cache hot. Given that the average
> > number of function calls per system call is around a dozen, this would be _much_
> > faster than any microcode/MSR based approach.
>
> That's kind of neat, except you don't want it at the top of the
> function; you want it at the bottom.
>
> If you could hijack the *return* site, then you could check for
> underflow and stuff the RSB right there. But in __fentry__ there's not
> a lot you can do other than complain that something bad is going to
> happen in the future. You know that a string of 16+ rets is going to
> happen, but you've got no gadget in *there* to deal with it when it
> does.

No, it can be done with the existing CALL instrumentation callback that
CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack from
the CALL trampoline - see my previous email.

> HJ did have patches to turn 'ret' into a form of retpoline, which I
> don't think ever even got performance-tested.

Return instrumentation is possible as well, but there are two major drawbacks:

 - GCC support for it is not as widely available and return instrumentation is
   less tested in Linux kernel contexts

 - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already
   enabled in distros here and today, so the runtime overhead on non-SkyLake CPUs
   would be literally zero, while still allowing the RSB vulnerability to be
   fixed on SkyLake.

Thanks,

	Ingo
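
A minimal userspace sketch of the depth-tracking idea discussed above, assuming
a 16-entry RSB (the "16+ rets" figure from the quoted text). The names
rsb_model, model_call and model_ret are hypothetical and are not kernel or
ftrace APIs; this only models the bookkeeping that a CALL trampoline and a
pushed RET trampoline might do, not the actual trampolines or any real
RSB-stuffing sequence.

```c
#include <assert.h>

/* Assumed RSB size, taken from the "16+ rets" figure in the quoted text. */
#define RSB_ENTRIES 16

/* Hypothetical software model of the RSB fill level; not a kernel structure. */
struct rsb_model {
	int valid;            /* entries still holding a real return target */
	unsigned long stuffs; /* how many times the RET hook refilled the RSB */
};

/* Models the CALL trampoline: each call pushes one entry, capped at RSB size. */
static void model_call(struct rsb_model *m)
{
	if (m->valid < RSB_ENTRIES)
		m->valid++;
}

/* Models the pushed RET trampoline: refill before a return would underflow. */
static void model_ret(struct rsb_model *m)
{
	if (m->valid == 0) {
		m->valid = RSB_ENTRIES; /* stands in for an RSB-stuffing sequence */
		m->stuffs++;
	}
	m->valid--; /* the return itself consumes one entry */
	assert(m->valid >= 0 && m->valid < RSB_ENTRIES);
}
```

Starting from an empty model, 20 nested model_call() invocations followed by 20
model_ret() invocations trigger exactly one refill, on the 17th return, which is
the point where the quoted text says a string of rets would otherwise underflow.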