From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751945AbbIOGLK (ORCPT );
	Tue, 15 Sep 2015 02:11:10 -0400
Received: from mail-wi0-f169.google.com ([209.85.212.169]:33772 "EHLO
	mail-wi0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750698AbbIOGLI (ORCPT );
	Tue, 15 Sep 2015 02:11:08 -0400
Date: Tue, 15 Sep 2015 08:11:02 +0200
From: Ingo Molnar
To: Christoph Lameter
Cc: Austin S Hemmelgarn, sedat.dilek@gmail.com, Peter Zijlstra,
	Baoquan He, Denys Vlasenko, Tejun Heo, LKML, Andrew Morton,
	David Rientjes, Linus Torvalds, Thomas Gleixner, Thomas Graf,
	the arch/x86 maintainers
Subject: Re: [llvmlinux] percpu | bitmap issue? (Cannot boot on bare metal
	due to a kernel NULL pointer dereference)
Message-ID: <20150915061102.GA20229@gmail.com>
References: <20150909071410.GD1998@dhcp-17-102.nay.redhat.com>
	<20150909125424.GP3644@twins.programming.kicks-ass.net>
	<20150914071231.GM18489@twins.programming.kicks-ass.net>
	<55F708D3.9090007@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org


* Christoph Lameter wrote:

> On Mon, 14 Sep 2015, Austin S Hemmelgarn wrote:
>
> >I can comment at least a little about the -Os aspect (although I'm no
> >expert on this in particular). In general, for _most_ use cases, a kernel
> >compiled with CONFIG_CC_OPTIMIZE_FOR_SIZE will run slower than one compiled
> >without it.
> >On rare occasion though, it may actually run faster, the only
> >cases I've seen where this happens are specialized uses that are very memory
> >pressure dependent and run almost entirely in userspace with almost no
> >syscalls (for example math related stuff operating on _very, very big_ (as in,
> >>1 trillion elements) multidimensional matrices, with complex memory
> >constraints), and even then it's usually a minuscule improvement in
> >performance (generally less than 1%, which can of course be significant
> >depending on how long it takes before the improvement).
>
> Cache footprint depends on size which has a significant impact on
> performance. In our experience the kernel (and any other code) is
> generally faster if optimized for size.

Unfortunately, GCC overdoes -Os, generating outright silly code, which makes the
result generally slower - despite the reduced instruction count and reduced cache
footprint.

We've recently applied patches to the x86 tree that give us a good chunk of the
size savings that -Os brings:

  52648e83c9a6 x86: Pack loops tightly as well
  be6cb02779ca x86: Align jump targets to 1-byte boundaries

These two shave about 5% off the typical distro kernel's size.

That's still way off the 15%-20% that -Os can muster, but another ~10% is
possible by packing functions on 1-byte boundaries (instead of aligning them to
the default 16 bytes).

So about 70% of the -Os size win is from simple and pure alignment relaxation,
not from any deeper compiler optimizations.

So LLVM could emulate most of the good effects of -Os by only compressing the
various alignment parameters - and this would be a pretty safe optimization as
well.

Thanks,

	Ingo
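
To make the alignment point concrete, here is a minimal sketch of how start-address padding inflates code size. It is a toy model, not a measurement: the function sizes are hypothetical, padded_size() is an illustrative helper that does not come from the kernel tree, and real compilers apply further rules (GCC controls alignment through options such as -falign-functions, -falign-jumps and -falign-loops).

```python
def padded_size(sizes, alignment):
    """Total bytes used when every function starts on an `alignment`-byte boundary."""
    offset = 0
    for size in sizes:
        # Round the start address up to the next multiple of `alignment`;
        # the gap is padding that carries no useful instructions.
        offset = (offset + alignment - 1) // alignment * alignment
        offset += size
    return offset


# Hypothetical function sizes in bytes (not taken from a real kernel build).
sizes = [23, 57, 8, 130, 41, 19, 77, 5]

aligned16 = padded_size(sizes, 16)  # default-style 16-byte function alignment
packed = padded_size(sizes, 1)      # 1-byte alignment, i.e. no padding at all

print(f"16-byte aligned: {aligned16} bytes")
print(f"1-byte aligned:  {packed} bytes")
print(f"padding share:   {100 * (aligned16 - packed) / aligned16:.1f}%")
```

The smaller the average function (or loop, or jump target block), the larger the share of the text section that is pure padding, which is why relaxing alignment alone can recover a sizable part of the -Os saving.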