From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753529AbdF2PTK (ORCPT ); Thu, 29 Jun 2017 11:19:10 -0400 Received: from mx1.redhat.com ([209.132.183.28]:43706 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753291AbdF2PTE (ORCPT ); Thu, 29 Jun 2017 11:19:04 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 35536164C60 Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=oleg@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 35536164C60 Date: Thu, 29 Jun 2017 17:19:01 +0200 From: Oleg Nesterov To: Linus Torvalds Cc: Hugh Dickins , Andrew Morton , Larry Woodman , Michal Hocko , Linux Kernel Mailing List Subject: Re: [PATCH 0/1] expand_downwards: don't require the gap if !vm_prev Message-ID: <20170629151901.GA32134@redhat.com> References: <20170628175237.GA24868@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 29 Jun 2017 15:19:04 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/28, Linus Torvalds wrote: > > On Wed, Jun 28, 2017 at 10:52 AM, Oleg Nesterov wrote: > > > > Now that the stack-guard-page has gone, why do we need to allow to grow > > into the previous VM_GROWSDOWN vma? IOW, why we can not simply remove > > the VM_GROWSDOWN check in expand_downwards() ? > > Because the "prev" vma may actually be the original vma. > > I think I described it in an earlier thread, but what happened at > least once was: > > - program has some part that uses a lot of stack for part of the > execution for some temp buffer or deep recursion or whatever > > - somebody noticed this, and decided to free up the no-longer-used > pages by doing a "munmap()" after the program was done with that part > of the stack > > - but the "munmap()" wasn't complete (maybe it only accounted for the > explicitly used buffer, whatever), so the munmap actually didn't just > remove the no-longer used bottom of the stack, it actually split the > stack segment into two (with a small remaining stack turd that was the > *real* bottom of the deep stack that used to exist) Ah, OK, thanks... > As to your patch: I would prefer to actually keep the new failure > behavior of unconditionally breaking a big stack expansion), unless > there's an actual thing it breaks. Hmm. May be you misread this patch? Or I misunderstood. > In fact, I'd even be quite open to adding a kernel warning about badly > behaved binaries that grow their stack by a big amount in one go. Yes, but this is another story. Currently expand_downwards(address) does if (address < stack_guard_gap) return -ENOMEM; This has nothing to do with "by how much it needs to grow", this simply forbids the bottom of stack below stack_guard_gap. Why? I don't think this patch can make any difference in practice, it just tries to make this logic more consistent/understandable. For example. Suppose that stack_guard_gap = 1M (default). Now, addr = 512K; // any addr <= stack_guard_gap; char *stack = mmap(addr, MAP_FIXED|MAP_GROWSDOWN, PAGE_SIZE); *stack = 0; stack -= PAGE_SIZE; *stack = 0; The first store will always succeed, the 2nd one will always fail even if (likely) there is no another vma below. This looks strange to me. Oleg.