From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754498AbcEBQ2T (ORCPT ); Mon, 2 May 2016 12:28:19 -0400
Received: from mga02.intel.com ([134.134.136.20]:36286 "EHLO mga02.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1754039AbcEBQ2G (ORCPT ); Mon, 2 May 2016 12:28:06 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.24,568,1455004800"; d="scan'208";a="797016190"
Subject: Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
To: Ingo Molnar
References: <5723A353.7060209@linux.intel.com>
 <20160429195741.GA15402@test-lenovo>
 <5723BE1F.7040300@linux.intel.com>
 <20160429200709.GA15412@test-lenovo>
 <5723C6A7.4020704@linux.intel.com>
 <20160430075343.GA23063@gmail.com>
Cc: Andy Lutomirski, Yu-cheng Yu, X86 ML, "H. Peter Anvin",
 Thomas Gleixner, Ingo Molnar, "linux-kernel@vger.kernel.org",
 Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
 "Ravi V. Shankar", Fenghua Yu
From: Dave Hansen
Message-ID: <57278014.3050808@linux.intel.com>
Date: Mon, 2 May 2016 09:28:04 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <20160430075343.GA23063@gmail.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/30/2016 12:53 AM, Ingo Molnar wrote:
> We can still use the compacted area handling instructions, because presumably
> those are the fastest and are also the most optimized ones? But I wouldn't use
> them to do dynamic allocation: just allocate the maximum possible FPU save area at
> task creation time and never again worry about that detail.
>
> Ok?

Sounds sane to me.

BTW, I hacked up your "fpu performance" to compare XSAVE vs. XSAVES:

> [    0.048347] x86/fpu: Cost of: XSAVE   insn : 127 cycles
> [    0.049134] x86/fpu: Cost of: XSAVES  insn : 113 cycles
> [    0.048492] x86/fpu: Cost of: XRSTOR  insn : 120 cycles
> [    0.049267] x86/fpu: Cost of: XRSTORS insn : 102 cycles

So I guess we can add that to the list of things that XSAVES is good for.
Granted, the real-world benefit is probably hard to measure because the
cache residency of the XSAVE buffer isn't as good when _actually_ context
switching, but this at least shows a small theoretical advantage for XSAVES.