From: Linus Torvalds
To: Minchan Kim
Cc: Dave Chinner, Linux Kernel Mailing List, Andrew Morton, linux-mm,
    "H. Peter Anvin", Ingo Molnar, Peter Zijlstra, Mel Gorman,
    Rik van Riel, Johannes Weiner, Hugh Dickins, Rusty Russell,
    "Michael S. Tsirkin", Dave Hansen, Steven Rostedt
Date: Thu, 29 May 2014 21:37:02 -0700
Subject: Re: [RFC 2/2] x86_64: expand kernel stack to 16K
In-Reply-To: <20140530021247.GR10092@bbox>
References: <1401260039-18189-1-git-send-email-minchan@kernel.org>
    <1401260039-18189-2-git-send-email-minchan@kernel.org>
    <20140528223142.GO8554@dastard> <20140529013007.GF6677@dastard>
    <20140529015830.GG6677@dastard> <20140529233638.GJ10092@bbox>
    <20140530001558.GB14410@dastard> <20140530021247.GR10092@bbox>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 29, 2014 at 7:12 PM, Minchan Kim wrote:
>
> Interim report,
>
> And result is as follows, It reduce about 800-byte compared to
> my first report but still stack usage seems to be high.
> Really needs diet of VM functions.

Yes. And in this case uninlining things might actually help, because it's not actually performing reclaim in the second case, so inlining the reclaim code into that huge __alloc_pages_nodemask() function means that it has the stack frame for all those cases even if they don't actually get used.
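To illustrate the uninlining point above, here is a minimal userspace sketch, not kernel code. The names alloc_sketch() and slow_path() and the 512-byte buffer are made up for the example; the only real feature used is GCC's noinline function attribute. The idea is that the large local buffer only takes up stack while the rarely used path is actually running, instead of being folded into the caller's frame.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for a rarely taken slow path (think reclaim)
 * that needs a lot of stack-local scratch space. Marking it noinline
 * keeps the 512-byte buffer out of the caller's stack frame.
 */
__attribute__((noinline))
static int slow_path(int order)
{
    char scratch[512];

    assert(order >= 0 && order < 127);
    memset(scratch, 0, sizeof(scratch));
    scratch[0] = (char)order;
    return scratch[0] + 1;
}

/*
 * Hypothetical fast path. When need_slow is zero, the big buffer is
 * never placed on the stack at all.
 */
int alloc_sketch(int order, int need_slow)
{
    if (!need_slow)
        return order;
    return slow_path(order);
}
```

Building this with gcc -fstack-usage, with and without the attribute, is one way to compare the per-function frame sizes. The actual numbers depend on compiler version and optimization level.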
That said, the way those functions are set up (with lots of arguments passed from one to the other), not inlining will cause huge costs too for the argument setup.

It really might be very good to create a "struct alloc_info" that contains those shared arguments, and just pass a (const) pointer to that around. Gcc would likely tend to be *much* better at generating code for that, because it avoids tons of temporaries being created by function calls. Even when it's inlined, the argument itself ends up being a new temporary internally, and I suspect one reason gcc (especially your 4.6.3 version, apparently) generates those big spill frames is because there's tons of these duplicate temporaries that apparently don't get merged properly.

Ugh. I think I'll try looking at that tomorrow.

               Linus
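A rough sketch of the "struct alloc_info" idea, again in plain userspace C. Only the struct name comes from the mail above. The fields, the helper functions, and their made-up success rules are assumptions for illustration and are not the real page allocator interfaces.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical bundle of the arguments that every stage of the
 * allocation path shares. Field names are illustrative only.
 */
struct alloc_info {
    unsigned int gfp_mask;
    unsigned int order;
    int preferred_zone;
    int migratetype;
};

/* Each stage takes one const pointer instead of four separate values. */
static int try_fast(const struct alloc_info *ai)
{
    assert(ai != NULL);
    /* Made-up rule: small orders succeed in the preferred zone. */
    return ai->order <= 1 ? ai->preferred_zone : -1;
}

static int try_slow(const struct alloc_info *ai)
{
    assert(ai != NULL);
    /* Made-up rule: fall back to zone 0 if bit 0 of gfp_mask is set. */
    return (ai->gfp_mask & 0x1u) ? 0 : -1;
}

int alloc_pages_sketch(unsigned int gfp_mask, unsigned int order,
                       int preferred_zone, int migratetype)
{
    /* Filled in once at the entry point, then only read. */
    const struct alloc_info ai = {
        .gfp_mask = gfp_mask,
        .order = order,
        .preferred_zone = preferred_zone,
        .migratetype = migratetype,
    };
    int zone = try_fast(&ai);

    if (zone < 0)
        zone = try_slow(&ai);
    return zone;
}
```

With this shape, adding another shared argument means adding one struct field rather than touching every function signature along the path. Whether it actually shrinks the spill frames would still have to be measured on the compiler in question.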