From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756826AbbGQIYI (ORCPT ); Fri, 17 Jul 2015 04:24:08 -0400
Received: from mail-wg0-f43.google.com ([74.125.82.43]:34682 "EHLO mail-wg0-f43.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1755463AbbGQIYF (ORCPT );
	Fri, 17 Jul 2015 04:24:05 -0400
Date: Fri, 17 Jul 2015 10:23:59 +0200
From: Ingo Molnar
To: Dave Hansen
Cc: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
	peterz@infradead.org, bp@alien8.de, luto@amacapital.net,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH] x86, fpu: dynamically allocate 'struct fpu'
Message-ID: <20150717082359.GA13442@gmail.com>
References: <20150716191437.A334FF2E@viggo.jf.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150716191437.A334FF2E@viggo.jf.intel.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Dave Hansen wrote:

> The FPU rewrite removed the dynamic allocations of 'struct fpu'.
> But, this potentially wastes massive amounts of memory (2k per
> task on systems that do not have AVX-512 for instance).
>
> Instead of having a separate slab, this patch just appends the
> space that we need to the 'task_struct' which we dynamically
> allocate already. This saves from doing an extra slab allocation
> at fork(). The only real downside here is that we have to stick
> everything and the end of the task_struct. But, I think the
> BUILD_BUG_ON()s I stuck in there should keep that from being too
> fragile.
>
> This survives a quick build and boot in a VM. Does anyone see any
> real downsides to this?

So considering the complexity of the other patch that makes the static
allocation, I'd massively prefer this patch as it solves the real bug.
It should also work on future hardware a lot better.

This was the dynamic approach I suggested in our discussion of the big FPU code
rework.

> --- a/arch/x86/kernel/fpu/init.c~dynamically-allocate-struct-fpu	2015-07-16 10:50:42.355571648 -0700
> +++ b/arch/x86/kernel/fpu/init.c	2015-07-16 12:02:15.284280976 -0700
> @@ -136,6 +136,45 @@ static void __init fpu__init_system_gene
>  unsigned int xstate_size;
>  EXPORT_SYMBOL_GPL(xstate_size);
>
> +#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) \
> +	BUILD_BUG_ON((sizeof(TYPE) - \
> +			offsetof(TYPE, MEMBER) - \
> +			sizeof(((TYPE *)0)->MEMBER)) > \
> +			0) \
> +
> +/*
> + * We append the 'struct fpu' to the task_struct.
> + */
> +int __weak arch_task_struct_size(void)

This should not be __weak, otherwise we risk getting the generic version:

> --- a/kernel/fork.c~dynamically-allocate-struct-fpu	2015-07-16 10:50:42.357571739 -0700
> +++ b/kernel/fork.c	2015-07-16 11:25:53.873498188 -0700
> @@ -287,15 +287,21 @@ static void set_max_threads(unsigned int
>  	max_threads = clamp_t(u64, threads, MIN_THREADS, MAX_THREADS);
>  }
>
> +int __weak arch_task_struct_size(void)
> +{
> +	return sizeof(struct task_struct);
> +}
> +

Your system probably worked due to link order preferring the x86 version but I'm
not sure.

Other than this bug it looks good to me in principle. Lemme check it on various
hardware.

Thanks,

	Ingo
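
To make the "member must be last" check and the appended allocation concrete,
here is a minimal user-space sketch in C. Everything in it (toy_fpu, toy_task,
toy_xstate_size, toy_task_struct_size(), toy_task_alloc()) is a hypothetical
stand-in and not the kernel's code; it assumes C11 and uses _Static_assert
where the patch uses BUILD_BUG_ON().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for 'struct fpu' and 'task_struct'. */
struct toy_fpu {
	long last_cpu;
	unsigned char state[64];	/* compile-time maximum state size */
};

struct toy_task {
	long pid;
	long flags;
	struct toy_fpu fpu;		/* must remain the last member */
};

/*
 * Build-time check that MEMBER ends exactly where TYPE ends, i.e.
 * nothing (not even padding) follows it.
 */
#define CHECK_MEMBER_AT_END_OF(TYPE, MEMBER)				\
	_Static_assert(offsetof(TYPE, MEMBER) +				\
		       sizeof(((TYPE *)0)->MEMBER) == sizeof(TYPE),	\
		       #MEMBER " must be the last member of " #TYPE)

CHECK_MEMBER_AT_END_OF(struct toy_fpu, state);
CHECK_MEMBER_AT_END_OF(struct toy_task, fpu);

/* Assumed: bytes of state this machine really needs, found at startup. */
static size_t toy_xstate_size = 16;

/* Allocation size: the full struct minus the unused tail of 'state'. */
size_t toy_task_struct_size(void)
{
	size_t max_state = sizeof(((struct toy_task *)0)->fpu.state);

	assert(toy_xstate_size <= max_state);
	return sizeof(struct toy_task) - max_state + toy_xstate_size;
}

/* Allocate a task with only as much trailing state as is needed. */
struct toy_task *toy_task_alloc(long pid)
{
	struct toy_task *t = calloc(1, toy_task_struct_size());

	if (t)
		t->pid = pid;
	return t;
}
```

Because the allocation is shorter than sizeof(struct toy_task), only the first
toy_xstate_size bytes of fpu.state may be touched. That is why both checks
matter: any member placed after the state buffer would land outside the
allocation.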