From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751912AbbHEOej (ORCPT <rfc822;w@1wt.eu>);
	Wed, 5 Aug 2015 10:34:39 -0400
Received: from www.sr71.net ([198.145.64.142]:40554 "EHLO blackbird.sr71.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751139AbbHEOei (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Wed, 5 Aug 2015 10:34:38 -0400
Message-ID: <55C21EFC.3060802@sr71.net>
Date: Wed, 05 Aug 2015 07:34:36 -0700
From: Dave Hansen <dave@sr71.net>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0
MIME-Version: 1.0
To: Ingo Molnar <mingo@kernel.org>
CC: dave.hansen@linux.intel.com, linux-kernel@vger.kernel.org, bp@alien8.de,
        fenghua.yu@intel.com, hpa@zytor.com, x86@kernel.org,
        Thomas Gleixner <tglx@linutronix.de>,
        Peter Zijlstra <a.p.zijlstra@chello.nl>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Andy Lutomirski <luto@kernel.org>,
        Denys Vlasenko <dvlasenk@redhat.com>
Subject: Re: [PATCH] x86, fpu: correct XSAVE xstate size calculation
References: <20150728172143.6DDFECA7@viggo.jf.intel.com> <20150805103227.GA3233@gmail.com>
In-Reply-To: <20150805103227.GA3233@gmail.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/05/2015 03:32 AM, Ingo Molnar wrote:
> * Dave Hansen <dave@sr71.net> wrote:
>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>
>> Note: our xsaves support is currently broken and disabled.  This
>> patch does not fix it, but it is an incremental improvement.  It
>> might be useful to someone backporting the entire set of XSAVES
>> patches at some point, but it should not be backported alone.
>>
>> There are currently two xsave buffer formats: standard and
>> compacted.  The standard format is what 'XSAVE' and 'XSAVEOPT'
>> produce while 'XSAVES' and 'XSAVEC' produce a compacted-format
>> buffer.  (The kernel never uses XSAVEC.)
>>
>> But, the XSAVES buffer *ALSO* contains "system state components"
>> which are never saved by a plain XSAVE.  So, XSAVES has two
>> things that might make its buffer differently-sized from an
>> XSAVE-produced one.
>>
>> The current code assumes that an XSAVES buffer's size is simply
>> the sum of the sizes of the (user) states which are supported.
>> This seems to work in most cases, but it is not consistent with
>> what the SDM says, and it breaks if we 'align' a component in the
>> buffer.  The calculation is also unnecessary work since the CPU
>> *tells* us the size of the buffer directly.
>>
>> This patch just reads the size of the buffer right out of the
>> CPUID leaf instead of trying to derive it.
> 
> So how will we know where to find which field, if we cannot even do a size 
> calculation?

setup_xstate_features() still populates xstate_offsets[] which tells us
where to find each field.  This patch does not change that.
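
As a rough illustration of that split (a userspace sketch, not the
kernel code): the per-component offsets and the total buffer size are
independent pieces of information.  The struct and function names below
are hypothetical; only the idea of an offsets table comes from
xstate_offsets[].  The total size is assumed to have been read from
CPUID rather than derived.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_XFEATURES 8	/* assumption: table size chosen for illustration */

/* Hypothetical stand-in for the kernel's per-component tables. */
struct xstate_layout {
	uint32_t offsets[NR_XFEATURES];	/* byte offset of each component */
	uint32_t sizes[NR_XFEATURES];	/* 0 means "component not present" */
	uint32_t total_size;		/* assumed to come straight from CPUID */
};

/*
 * Return a pointer to component 'nr' inside 'buf', or NULL if the
 * component is absent or would not fit inside the reported buffer size.
 */
static void *xstate_component(const struct xstate_layout *l, void *buf,
			      unsigned int nr)
{
	if (nr >= NR_XFEATURES || l->sizes[nr] == 0)
		return NULL;
	if ((uint64_t)l->offsets[nr] + l->sizes[nr] > l->total_size)
		return NULL;
	return (uint8_t *)buf + l->offsets[nr];
}
```

Finding a field only needs offsets[] and sizes[]; total_size is used
purely as a bound, so it does not have to equal the sum of the sizes.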

> I realize that the calculation and what CPUID gives us should match, but it's not 
> really good for the kernel to not know the precise layout of a critical task 
> context data structure ...

There is no architectural guarantee that the sum of xstate sizes will be
the same as what comes out of that CPUID leaf.  It would be nice, but
it's not architectural and I've run into platforms where that
assumption does not hold.

> So can we turn this into 'double check the CPUID size and print a warning on 
> mismatch' kind of boot time sanity check? Preferably for all XSAVE* data formats 
> we can run into. I'd be fine with applying such a patch ahead of enabling 
> compaction again.

I don't think that is sufficient.

There are 4 reasons to apply this patch that I can think of:
1. There is no architectural guarantee that the calculation (sum of
   xstate sizes) will match what CPUID gives us as the size of the
   buffer.  I've seen this in practice.
2. The alignment bit indicates that there is space used in the buffer
   which is not part of a state component.  The current code does not
   take that into account.
3. The code is currently asking for the size of an XSAVE-produced
   buffer.  The code will be wrong the moment we switch to XSAVES
   because XSAVES saves more things than XSAVE and uses more space.
4. It makes the code smaller and simpler, especially if you consider
   what would happen if we added "real" alignment support.
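
To make reason 2 concrete, here is a small sketch (plain C, not kernel
code) of how a plain sum of component sizes drifts from a compacted
layout once one component asks for 64-byte alignment.  The struct and
function names are hypothetical and the component sizes in the usage
note are made up.  The 512-byte legacy region, the 64-byte XSAVE header
and the alignment flag in ECX bit 1 of each CPUID.0xD sub-leaf are as
described in the SDM.

```c
#include <stdint.h>

#define XSAVE_LEGACY_SIZE	512	/* legacy FXSAVE region */
#define XSAVE_HDR_SIZE		64	/* XSAVE header */

/* Hypothetical description of one extended state component. */
struct xcomp {
	uint32_t size;		/* CPUID.(EAX=0xD,ECX=i):EAX */
	int aligned;		/* CPUID.(EAX=0xD,ECX=i):ECX bit 1 */
};

/* Plain sum of the sizes: ignores any alignment padding. */
static uint32_t naive_sum(const struct xcomp *c, unsigned int n)
{
	uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;
	unsigned int i;

	for (i = 0; i < n; i++)
		size += c[i].size;
	return size;
}

/* Compacted layout: round up to 64 bytes before an aligned component. */
static uint32_t compacted_size(const struct xcomp *c, unsigned int n)
{
	uint32_t off = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (c[i].aligned)
			off = (off + 63) & ~63u;
		off += c[i].size;
	}
	return off;
}
```

With the made-up components { 256, unaligned }, { 72, unaligned } and
{ 64, aligned }, naive_sum() gives 968 while compacted_size() gives
1024.  Even the second function is still a derivation, so it does not
replace reading the size from CPUID, which is what the patch does.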