From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754993AbcEBScb (ORCPT ); Mon, 2 May 2016 14:32:31 -0400
Received: from mail-wm0-f66.google.com ([74.125.82.66]:35442 "EHLO
	mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754740AbcEBScV (ORCPT );
	Mon, 2 May 2016 14:32:21 -0400
Date: Mon, 2 May 2016 20:32:16 +0200
From: Ingo Molnar
To: Dave Hansen
Cc: Andy Lutomirski, Yu-cheng Yu, X86 ML, "H. Peter Anvin",
	Thomas Gleixner, Ingo Molnar, "linux-kernel@vger.kernel.org",
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	"Ravi V. Shankar", Fenghua Yu
Subject: Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
Message-ID: <20160502183216.GA16100@gmail.com>
References: <5723A353.7060209@linux.intel.com>
	<20160429195741.GA15402@test-lenovo>
	<5723BE1F.7040300@linux.intel.com>
	<20160429200709.GA15412@test-lenovo>
	<5723C6A7.4020704@linux.intel.com>
	<20160430075343.GA23063@gmail.com>
	<57278014.3050808@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <57278014.3050808@linux.intel.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org


* Dave Hansen wrote:

> On 04/30/2016 12:53 AM, Ingo Molnar wrote:
> > We can still use the compacted area handling instructions, because presumably
> > those are the fastest and are also the most optimized ones? But I wouldn't use
> > them to do dynamic allocation: just allocate the maximum possible FPU save area at
> > task creation time and never again worry about that detail.
> >
> > Ok?
>
> Sounds sane to me.
>
> BTW, I hacked up your "fpu performance" to compare XSAVE vs. XSAVES:
>
> > [    0.048347] x86/fpu: Cost of: XSAVE   insn : 127 cycles
> > [    0.049134] x86/fpu: Cost of: XSAVES  insn : 113 cycles
> > [    0.048492] x86/fpu: Cost of: XRSTOR  insn : 120 cycles
> > [    0.049267] x86/fpu: Cost of: XRSTORS insn : 102 cycles
>
> So I guess we can add that to the list of things that XSAVES is good for.

Absolutely!

> [...] Granted, the real-world benefit is probably hard to measure because the
> cache residency of the XSAVE buffer isn't as good when _actually_ context
> switching, but this at least shows a small theoretical advantage for XSAVES.

Yeah, and anything that was measured for real is far from being theoretical.
It's simply a best-case microbenchmark figure, but it's still a nice 10+ cycles
improvement overall - which might become bigger in future CPU generations.

Thanks,

	Ingo
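
For reference, the "10+ cycles" figure can be re-derived from the four numbers quoted above. What follows is only a small arithmetic sketch, not part of the kernel patch set: the cycle counts are copied from the quoted boot log, while the `costs` table and the `saving()` helper are hypothetical names introduced here for illustration.

```python
# Cycle counts copied from the boot-log lines quoted in the mail above.
# The dict and helper names are illustrative assumptions, not kernel code.
costs = {"XSAVE": 127, "XSAVES": 113, "XRSTOR": 120, "XRSTORS": 102}


def saving(baseline, variant):
    """Return (cycles saved, percent saved) of variant relative to baseline."""
    saved = costs[baseline] - costs[variant]
    return saved, 100.0 * saved / costs[baseline]


save_delta, save_pct = saving("XSAVE", "XSAVES")
restore_delta, restore_pct = saving("XRSTOR", "XRSTORS")

print(f"save:    {save_delta} cycles ({save_pct:.1f}%)")
print(f"restore: {restore_delta} cycles ({restore_pct:.1f}%)")
print(f"save+restore pair: {save_delta + restore_delta} cycles")
```

On these best-case numbers the save path is 14 cycles cheaper and the restore path 18 cycles cheaper, so one save/restore pair comes out 32 cycles ahead. As noted in the thread, cache residency during a real context switch may change the picture.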