From: Oleg Nesterov <oleg@redhat.com>
To: Kees Cook <keescook@chromium.org>
Cc: Rik van Riel <riel@redhat.com>, Michal Hocko <mhocko@suse.com>,
	Stanislav Kozina <skozina@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: get_arg_page() && ptr_size accounting
Date: Tue, 11 Sep 2018 16:13:18 +0200
Message-ID: <20180911141318.GA30907@redhat.com>
In-Reply-To: <CAGXu5jLWP4iKwTV3tBcV+nyZAU566qkknhODZdY_7o4p0zWVzQ@mail.gmail.com>

On 09/10, Kees Cook wrote:
>
> I've looked more closely now. So, while I agree with you about
> resource limits, there's a corner case that is better handled here:
> once we've called flush_old_exec(), we can no longer send errors back
> to the parent. We just segfault. So, I think it's better to give a
> resource limit error early, since it is able to do the math early.
>
> If we move acct_arg_size() earlier, then the "immediate" resource
> utilization is checked, but it means it can just segfault later. If we
> leave it as-is, we account for later memory allocations "too early",
> but we'll still not be able to run: but we can tell the parent why.

I don't follow. Could you spell it out, please?

AFAICS, the trivial patch I proposed changes nothing except it fixes the
bprm->pages accounting. The problem is really minor, but this looks confusing
and wrong anyway.

> I prefer to leave it as-is.

After this discussion, I strongly disagree.

And now I think we should remove this rlim crap from get_arg_page() altogether
to make things clearer.

> > Please forget. I meant that _if_ we actually wanted to account this additional
> > memory in bprm->pages, then we would probably need something like
> > acct_arg_size(size/PAGE_SIZE + DIV_ROUND_UP(ptr_size, PAGE_SIZE)).
>
> I'd need to study that more, but that change seems reasonable. :)

Please forget. Not that it matters, but we simply can't account ptr_size
100% correctly if we do this in get_arg_page().

See the patch below. Completely untested, quite possibly wrong, but I think
this is what we should do.

Oleg.


diff --git a/fs/exec.c b/fs/exec.c
index 1ebf6e5..7804a5c 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -218,55 +218,10 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
 	if (ret <= 0)
 		return NULL;
 
-	if (write) {
-		unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
-		unsigned long ptr_size, limit;
-
-		/*
-		 * Since the stack will hold pointers to the strings, we
-		 * must account for them as well.
-		 *
-		 * The size calculation is the entire vma while each arg page is
-		 * built, so each time we get here it's calculating how far it
-		 * is currently (rather than each call being just the newly
-		 * added size from the arg page).  As a result, we need to
-		 * always add the entire size of the pointers, so that on the
-		 * last call to get_arg_page() we'll actually have the entire
-		 * correct size.
-		 */
-		ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
-		if (ptr_size > ULONG_MAX - size)
-			goto fail;
-		size += ptr_size;
-
-		acct_arg_size(bprm, size / PAGE_SIZE);
-
-		/*
-		 * We've historically supported up to 32 pages (ARG_MAX)
-		 * of argument strings even with small stacks
-		 */
-		if (size <= ARG_MAX)
-			return page;
-
-		/*
-		 * Limit to 1/4 of the max stack size or 3/4 of _STK_LIM
-		 * (whichever is smaller) for the argv+env strings.
-		 * This ensures that:
-		 *  - the remaining binfmt code will not run out of stack space,
-		 *  - the program will have a reasonable amount of stack left
-		 *    to work from.
-		 */
-		limit = _STK_LIM / 4 * 3;
-		limit = min(limit, bprm->rlim_stack.rlim_cur / 4);
-		if (size > limit)
-			goto fail;
-	}
+	if (write)
+		acct_arg_size(bprm, vma_pages(bprm->vma));
 
 	return page;
-
-fail:
-	put_page(page);
-	return NULL;
 }
 
 static void put_arg_page(struct page *page)
@@ -410,11 +365,6 @@ static int bprm_mm_init(struct linux_binprm *bprm)
 	if (!mm)
 		goto err;
 
-	/* Save current stack limit for all calculations made during exec. */
-	task_lock(current->group_leader);
-	bprm->rlim_stack = current->signal->rlim[RLIMIT_STACK];
-	task_unlock(current->group_leader);
-
 	err = __bprm_mm_init(bprm);
 	if (err)
 		goto err;
@@ -492,6 +442,27 @@ static int count(struct user_arg_ptr argv, int max)
 	return i;
 }
 
+static int prepare_rlim_stack(struct linux_binprm *bprm)
+{
+	unsigned long limit, ptr_size;
+
+	task_lock(current->group_leader);
+	bprm->rlim_stack = current->signal->rlim[RLIMIT_STACK];
+	task_unlock(current->group_leader);
+
+	limit = _STK_LIM / 4 * 3;
+	limit = min(limit, bprm->rlim_stack.rlim_cur / 4);
+	limit = max(limit, (unsigned long)ARG_MAX);
+	/* COMMENT */
+	ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
+	if (limit <= ptr_size)
+		return -E2BIG;
+	limit -= ptr_size;
+
+	bprm->p_min = bprm->p - limit;
+	return 0;
+}
+
 /*
  * 'copy_strings()' copies argument/environment strings from the old
  * processes's memory to the new process's stack.  The call to get_user_pages()
@@ -527,6 +498,8 @@ static int copy_strings(int argc, struct user_arg_ptr argv,
 		pos = bprm->p;
 		str += len;
 		bprm->p -= len;
+		if (bprm->p <= bprm->p_min)
+			goto out;
 
 		while (len > 0) {
 			int offset, bytes_to_copy;
@@ -1801,6 +1774,10 @@ static int __do_execve_file(int fd, struct filename *filename,
 	if (retval < 0)
 		goto out;
 
+	retval = prepare_rlim_stack(bprm);
+	if (retval < 0)
+		goto out;
+
 	retval = copy_strings_kernel(1, &bprm->filename, bprm);
 	if (retval < 0)
 		goto out;
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index c05f24f..423e8c1 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -24,7 +24,7 @@ struct linux_binprm {
 	struct page *page[MAX_ARG_PAGES];
 #endif
 	struct mm_struct *mm;
-	unsigned long p; /* current top of mem */
+	unsigned long p, p_min; /* current top of mem */
 	unsigned int
 		/*
 		 * True after the bprm_set_creds hook has been called once



Thread overview: 16+ messages
2018-09-10 12:29 get_arg_page() && ptr_size accounting Oleg Nesterov
2018-09-10 16:41 ` Kees Cook
2018-09-10 16:45   ` Kees Cook
2018-09-10 17:21     ` Oleg Nesterov
2018-09-10 17:43       ` Oleg Nesterov
2018-09-11  4:30         ` Kees Cook
2018-09-11 15:29           ` Oleg Nesterov
2018-09-11  4:27       ` Kees Cook
2018-09-11 15:25         ` Oleg Nesterov
2018-09-10 17:18   ` Oleg Nesterov
2018-09-11  4:23     ` Kees Cook
2018-09-11 14:13       ` Oleg Nesterov [this message]
2018-09-11 19:06         ` Kees Cook
2018-09-12 12:27           ` Oleg Nesterov
2018-09-12 14:23             ` Oleg Nesterov
2018-09-12 20:42             ` Kees Cook
