From: Michal Hocko
Subject: Re: [PATCH v2 06/13] fork: Add generic vmalloced stack support
Date: Tue, 21 Jun 2016 10:46:49 +0200
Message-ID: <20160621084649.GC30848@dhcp22.suse.cz>
References: <44f658aacbabd9d1689b3e0aae60ee8746881eff.1466192946.git.luto@kernel.org> <20160620133614.GE9892@dhcp22.suse.cz>
To: Andy Lutomirski
Cc: Andy Lutomirski, X86 ML, "linux-kernel@vger.kernel.org", linux-arch, Borislav Petkov, Nadav Amit, Kees Cook, Brian Gerst, "kernel-hardening@lists.openwall.com", Linus Torvalds, Josh Poimboeuf, Jann Horn, Heiko Carstens

On Mon 20-06-16 09:13:55, Andy Lutomirski wrote:
> On Mon, Jun 20, 2016 at 6:36 AM, Michal Hocko wrote:
> > On Fri 17-06-16 13:00:42, Andy Lutomirski wrote:
> >> If CONFIG_VMAP_STACK is selected, kernel stacks are allocated with
> >> vmalloc_node.
> >
> > I like this! It also reduces demand for higher order (order-2) pages
> > considerably which is a great plus on its own. I would be little bit
> > worried about the performance because vmalloc wasn't the fastest one
> > AFAIR. Have you tried to measure that?
>
> It seems to add about 1.5µs to pthread_create+join on my laptop.  (On
> an unmodified, stripped-down kernel, it took about 7µs before.  On a
> Fedora system, the baseline is much worse.)  I think that most of the
> overhead is because vmalloc allocates one page at a time, which means
> that it won't use a higher order page even if one is sitting on a
> freelist.
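
For reference, a pthread_create+join microbenchmark of the kind quoted
above could be sketched roughly as follows. This is not the actual test
program behind the quoted numbers, which is not shown in the thread; the
names noop() and bench_create_join_ns() are hypothetical, chosen only
for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Thread body: return immediately so that only create+join is timed. */
static void *noop(void *arg)
{
	return arg;
}

/*
 * Hypothetical helper (not from the thread): return the mean wall-clock
 * cost, in nanoseconds, of one pthread_create() + pthread_join() pair,
 * or -1.0 on error or when iterations is not positive.
 */
double bench_create_join_ns(int iterations)
{
	struct timespec start, end;
	pthread_t thread;
	int i;

	if (iterations <= 0)
		return -1.0;

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1.0;

	for (i = 0; i < iterations; i++) {
		/* Each new thread needs a fresh kernel stack. */
		if (pthread_create(&thread, NULL, noop, NULL) != 0)
			return -1.0;
		if (pthread_join(thread, NULL) != 0)
			return -1.0;
	}

	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1.0;

	return ((end.tv_sec - start.tv_sec) * 1e9 +
		(end.tv_nsec - start.tv_nsec)) / iterations;
}
```

Calling bench_create_join_ns() with a large iteration count on kernels
built with and without CONFIG_VMAP_STACK, and comparing the results,
should give figures of the same kind as those quoted. Being a tight
create/join loop on an otherwise idle system, it says nothing about
behaviour under fragmentation or memory pressure.
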

I guess a less artificial test case which would generate a lot of tasks
and some memory pressure would be more representative (e.g. kernbench).
The thing is that even order-2 pages might get quite expensive when
memory is fragmented.

> I can imagine better integration with the page allocator in which
> higher order pages are used if readily available.  Similarly, vfree
> could free pages that happen to be aligned and consecutive as a unit
> to avoid the overhead of merging them back together one at a time.
>
> But I'm not planning on doing any of this myself any time soon.  I
> just want to get the code working and merged.

I agree, there is room for improvement, but not necessarily as part of
this series.
-- 
Michal Hocko
SUSE Labs