From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [RFC PATCH 0/3] kernel: add support for 256-bit IO access
Date: Thu, 22 Mar 2018 10:36:39 +0100
Message-ID: <20180322093639.ierhvktujyfozb33@gmail.com>
References: <7f0ddb3678814c7bab180714437795e0@AcuMS.aculab.com>
 <7f8d811e79284a78a763f4852984eb3f@AcuMS.aculab.com>
 <20180320082651.jmxvvii2xvmpyr2s@gmail.com>
 <20180321063256.bdqcpvgb3auxzwzk@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Linus Torvalds, Thomas Gleixner, David Laight, Rahul Lakkireddy,
 "x86@kernel.org", "linux-kernel@vger.kernel.org",
 "netdev@vger.kernel.org", "mingo@redhat.com", "hpa@zytor.com",
 "davem@davemloft.net", "akpm@linux-foundation.org",
 "ganeshgr@chelsio.com", "nirranjan@chelsio.com",
 "indranil@chelsio.com", Peter Zijlstra, Fenghua Yu, Eric Biggers
To: Andy Lutomirski
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

* Andy Lutomirski wrote:

> On Wed, Mar 21, 2018 at 6:32 AM, Ingo Molnar wrote:
> >
> > * Linus Torvalds wrote:
> >
> >> And even if you ignore that "maintenance problems down the line" issue
> >> ("we can fix them when they happen") I don't want to see games like
> >> this, because I'm pretty sure it breaks the optimized xsave by tagging
> >> the state as being dirty.
> >
> > That's true - and it would penalize the context switch cost of the affected task
> > for the rest of its lifetime, as I don't think there's much that clears XINUSE
> > other than a FINIT, which is rarely done by user-space.
> >
> >> So no. Don't use vector stuff in the kernel. It's not worth the pain.
> >
> > I agree, but:
> >
> >> The *only* valid use is pretty much crypto, and even there it has had issues.
> >> Benchmarks use big arrays and/or dense working sets etc to "prove" how good the
> >> vector version is, and then you end up in situations where it's used once per
> >> fairly small packet for an interrupt, and it's actually much worse than doing it
> >> by hand.
> >
> > That's mainly because the XSAVE/XRSTOR done by kernel_fpu_begin()/end() is so
> > expensive, so this argument is somewhat circular.
>
> If we do the deferred restore, then the XSAVE/XRSTOR happens at most
> once per kernel entry, which isn't so bad IMO.  Also, with PTI, kernel
> entries are already so slow that this will be mostly in the noise :(

For performance/scalability work we should just ignore the PTI overhead: it
doesn't exist on AMD CPUs and Intel has announced Meltdown-fixed CPUs, to be
released later this year:

   https://www.anandtech.com/show/12533/intel-spectre-meltdown

By the time any kernel changes we are talking about today get to distros and
users, the newest hardware won't have the Meltdown bug.

Thanks,

	Ingo