From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753529AbdF2PTK (ORCPT <rfc822;w@1wt.eu>);
        Thu, 29 Jun 2017 11:19:10 -0400
Received: from mx1.redhat.com ([209.132.183.28]:43706 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1753291AbdF2PTE (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 29 Jun 2017 11:19:04 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 35536164C60
Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=oleg@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 35536164C60
Date: Thu, 29 Jun 2017 17:19:01 +0200
From: Oleg Nesterov <oleg@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Larry Woodman <lwoodman@redhat.com>, Michal Hocko <mhocko@suse.com>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/1] expand_downwards: don't require the gap if !vm_prev
Message-ID: <20170629151901.GA32134@redhat.com>
References: <20170628175237.GA24868@redhat.com>
 <CA+55aFz0bvToQBMn+5_1+_OFDd-yNm3Tkt1tFBojwDHXsrHSHQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CA+55aFz0bvToQBMn+5_1+_OFDd-yNm3Tkt1tFBojwDHXsrHSHQ@mail.gmail.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 29 Jun 2017 15:19:04 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/28, Linus Torvalds wrote:
>
> On Wed, Jun 28, 2017 at 10:52 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Now that the stack-guard-page has gone, why do we need to allow to grow
> > into the previous VM_GROWSDOWN vma? IOW, why we can not simply remove
> > the VM_GROWSDOWN check in expand_downwards() ?
>
> Because the "prev" vma may actually be the original vma.
>
> I think I described it in an earlier thread, but what happened at
> least once was:
>
>  - program has some part that uses a lot of stack for part of the
> execution for some temp buffer or deep recursion or whatever
>
>  - somebody noticed this, and decided to free up the no-longer-used
> pages by doing a "munmap()" after the program was done with that part
> of the stack
>
>  - but the "munmap()" wasn't complete (maybe it only accounted for the
> explicitly used buffer, whatever), so the munmap actually didn't just
> remove the no-longer used bottom of the stack, it actually split the
> stack segment into two (with a small remaining stack turd that was the
> *real* bottom of the deep stack that used to exist)

Ah, OK, thanks...


> As to your patch: I would prefer to actually keep the new failure
> behavior of unconditionally breaking a big stack expansion), unless
> there's an actual thing it breaks.

Hmm. May be you misread this patch? Or I misunderstood.

> In fact, I'd even be quite open to adding a kernel warning about badly
> behaved binaries that grow their stack by a big amount in one go.

Yes, but this is another story.

Currently expand_downwards(address) does

	if (address < stack_guard_gap)
		return -ENOMEM;

This has nothing to do with "by how much it needs to grow", this simply
forbids the bottom of stack below stack_guard_gap. Why?

I don't think this patch can make any difference in practice, it just
tries to make this logic more consistent/understandable.

For example. Suppose that stack_guard_gap = 1M (default). Now,

	addr = 512K;  // any addr <= stack_guard_gap;
	char *stack = mmap(addr, MAP_FIXED|MAP_GROWSDOWN, PAGE_SIZE);

	*stack = 0;
	stack -= PAGE_SIZE;
	*stack = 0;

The first store will always succeed, the 2nd one will always fail even
if (likely) there is no another vma below. This looks strange to me.

Oleg.