From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andy Lutomirski
Subject: Re: [PATCH v2 06/13] fork: Add generic vmalloced stack support
Date: Mon, 20 Jun 2016 09:13:55 -0700
References: <44f658aacbabd9d1689b3e0aae60ee8746881eff.1466192946.git.luto@kernel.org>
 <20160620133614.GE9892@dhcp22.suse.cz>
In-Reply-To: <20160620133614.GE9892@dhcp22.suse.cz>
Reply-To: kernel-hardening@lists.openwall.com
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
To: Michal Hocko
Cc: Andy Lutomirski, X86 ML, "linux-kernel@vger.kernel.org", linux-arch,
 Borislav Petkov, Nadav Amit, Kees Cook, Brian Gerst,
 "kernel-hardening@lists.openwall.com", Linus Torvalds, Josh Poimboeuf,
 Jann Horn, Heiko Carstens
List-Id: linux-arch.vger.kernel.org

On Mon, Jun 20, 2016 at 6:36 AM, Michal Hocko wrote:
> On Fri 17-06-16 13:00:42, Andy Lutomirski wrote:
>> If CONFIG_VMAP_STACK is selected, kernel stacks are allocated with
>> vmalloc_node.
>
> I like this! It also reduces demand for higher order (order-2) pages
> considerably which is a great plus on its own. I would be little bit
> worried about the performance because vmalloc wasn't the fastest one
> AFAIR. Have you tried to measure that?

It seems to add about 1.5µs to pthread_create+join on my laptop. (On
an unmodified, stripped-down kernel, it took about 7µs before. On a
Fedora system, the baseline is much worse.)

I think that most of the overhead is because vmalloc allocates one
page at a time, which means that it won't use a higher order page even
if one is sitting on a freelist. I can imagine better integration
with the page allocator in which higher order pages are used if
readily available. Similarly, vfree could free pages that happen to
be aligned and consecutive as a unit to avoid the overhead of merging
them back together one at a time.

But I'm not planning on doing any of this myself any time soon. I
just want to get the code working and merged.

--Andy