From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751945AbbIOGLK (ORCPT <rfc822;w@1wt.eu>);
	Tue, 15 Sep 2015 02:11:10 -0400
Received: from mail-wi0-f169.google.com ([209.85.212.169]:33772 "EHLO
	mail-wi0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750698AbbIOGLI (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 15 Sep 2015 02:11:08 -0400
Date: Tue, 15 Sep 2015 08:11:02 +0200
From: Ingo Molnar <mingo@kernel.org>
To: Christoph Lameter <cl@linux.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>, sedat.dilek@gmail.com,
        Peter Zijlstra <peterz@infradead.org>, Baoquan He <bhe@redhat.com>,
        Denys Vlasenko <dvlasenk@redhat.com>, Tejun Heo <tj@kernel.org>,
        LKML <linux-kernel@vger.kernel.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        David Rientjes <rientjes@google.com>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Thomas Gleixner <tglx@linutronix.de>, Thomas Graf <tgraf@suug.ch>,
        the arch/x86 maintainers <x86@kernel.org>
Subject: Re: [llvmlinux] percpu | bitmap issue? (Cannot boot on bare metal
 due to a kernel NULL pointer dereference)
Message-ID: <20150915061102.GA20229@gmail.com>
References: <20150909071410.GD1998@dhcp-17-102.nay.redhat.com>
 <CA+icZUVTHJGkg9mTs3TSy3U4VX_cTyNpwNeX9R+73Y9=ZD_E0A@mail.gmail.com>
 <CA+icZUWo1-7Kqo36Dp-pjjq1hUdiqvwhZhvaXM1RwA8mqter_A@mail.gmail.com>
 <20150909125424.GP3644@twins.programming.kicks-ass.net>
 <CA+icZUUJOsoon2Vk09yu6Z=p2N8mpHkVnp=TZyfSxpKKw_x2iQ@mail.gmail.com>
 <CA+icZUU=qAJ7xqYs_k=NNG-Sj3Xv770MUNToA4JqiO3rAzUxNA@mail.gmail.com>
 <20150914071231.GM18489@twins.programming.kicks-ass.net>
 <CA+icZUX=0qtftp+y1RdeyAZrZuiNMHKk=jhpWiB2sGqsJtcWuQ@mail.gmail.com>
 <55F708D3.9090007@gmail.com>
 <alpine.DEB.2.11.1509141325460.4192@east.gentwo.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <alpine.DEB.2.11.1509141325460.4192@east.gentwo.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


* Christoph Lameter <cl@linux.com> wrote:

> On Mon, 14 Sep 2015, Austin S Hemmelgarn wrote:
> 
> >I can comment at least a little about the -Os aspect (although I'm no
> >expert on this in particular).  In general, for _most_ use cases, a kernel
> >compiled with CONFIG_CC_OPTIMIZE_FOR_SIZE will run slower than one compiled
> >without it.  On rare occasion though, it may actually run faster, the only
> >cases I've seen where this happens are specialized uses that are very memory
> >pressure dependent and run almost entirely in userspace with almost no
> >syscalls (for example math related stuff operating on _very, very big_ (as in,
> >>1 trillion elements) multidimensional matrices, with complex memory
> >constraints), and even then it's usually a minuscule improvement in
> >performance (generally less than 1%, which can of course be significant
> >depending on how long it takes before the improvement).
> 
> Cache footprint depends on size which has a significant impact on
> performance. In our experience the kernel (and any other code) is
> generally faster if optimized for size.

Unfortunately, GCC overdoes -Os and generates outright silly code, which makes the 
result generally slower, despite the reduced instruction count and reduced cache 
footprint.

We've recently applied patches to the x86 tree that give us a good chunk of the 
size savings that -Os brings:

  52648e83c9a6 x86: Pack loops tightly as well
  be6cb02779ca x86: Align jump targets to 1-byte boundaries

These two shave about 5% off the typical distro kernel's size. That's still 
way off the 15%-20% that -Os can muster, but another ~10% is possible by 
aligning functions to 1-byte boundaries (instead of the default 16 bytes).

So about 70% of the -Os size win is from simple and pure alignment relaxation, not 
from any deeper compiler optimizations.
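
To illustrate why alignment alone costs that much, here is a minimal sketch of 
the padding arithmetic. It is a toy model, not how the linker lays out kernel 
text: the helper names align_up() and padded_text_size() are made up for this 
example, and it simply assumes every function gets padded up to a multiple of 
the alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (align: power of two). */
static size_t align_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * Hypothetical model: bytes of text used when each function starts on an
 * align-byte boundary, i.e. each function is padded to a multiple of align.
 */
static size_t padded_text_size(const size_t *func_sizes, size_t n, size_t align)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += align_up(func_sizes[i], align);

	return total;
}
```

With three functions of 5, 17 and 32 bytes this model gives 16 + 32 + 32 = 80 
bytes at 16-byte alignment versus 54 bytes at 1-byte alignment. Small functions 
pay the most, which is why the effect adds up across a whole kernel image.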

So LLVM could emulate most of the good effects of -Os just by tightening the 
various alignment parameters - and this would be a pretty safe optimization as 
well.

Thanks,

	Ingo