From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Kees Cook <keescook@chromium.org>
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>, Andrew Vagin <avagin@openvz.org>,
	Serge Hallyn <serge.hallyn@canonical.com>,
	Pavel Emelyanov <xemul@parallels.com>,
	Vasiliy Kulikov <segoon@openwall.com>
Subject: Re: [rfc 3/3] prctl: Add PR_SET_MM codes to tune up mm_struct entires
Date: Wed, 30 Nov 2011 21:37:39 +0400
Message-ID: <20111130173739.GI14515@moon>
In-Reply-To: <CAGXu5jKDLExXWi+_Hjc+8s7FeuDgQdZcPipMH80LQs+tRyJm2A@mail.gmail.com>

On Tue, Nov 29, 2011 at 12:40:57PM -0800, Kees Cook wrote:
> >
> > On the other hand, these fields are set up by the ELF handler code, which
> > mmaps these areas, so we have to check that a particular member
> > belongs to an existing VMA and never crosses the user-space area. Together
> > with the root-only approach, wouldn't that be enough? I'm surely missing
> > something, which is why I'm asking.
> 
> Right, if you verify that the addresses are actually inside valid
> userspace vmas, that is likely to be right, though there are probably
> other things I haven't thought of. The trouble is avoiding vdso, stack
> guard page, vsyscall, and anything else that isn't meant for the mm to
> have direct access to.
> 

Hi Kees,

what about this one? Note that these mm_struct members don't affect
the kernel much (at least as far as I can see, except maybe the brk,
start_brk and start_stack values), so I've added some sanity checks
here; I hope they fit. Still, the main protection is root-only access.
The kernel itself uses the vm_area_struct::vm_start/vm_end members for
its internal overflow tests, so I think even passing crazy data here
won't crash the kernel. What do you think?

	Cyrill
---
prctl: Add PR_SET_MM codes to tune up mm_struct entries v2

A few members of mm_struct, such as start_code, end_code,
start_data, end_data, start_stack, start_brk and brk, are exported
by the kernel via /proc/$pid/stat, and we read them at checkpoint
time.
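
As an illustration of the checkpoint side, here is a minimal userspace
sketch for pulling one numeric field out of a /proc/$pid/stat line. The
helper name stat_field() is hypothetical and not part of this patch. It
assumes the layout described in proc(5), where startcode, endcode and
startstack are fields 26, 27 and 28, and it skips the comm field by
searching for the last ')' since comm may itself contain spaces or
parentheses.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract the 1-based numeric field `field`
 * (field >= 3) from a /proc/<pid>/stat line.
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 */
int stat_field(const char *line, int field, unsigned long *out)
{
	const char *p = strrchr(line, ')');	/* ')' closes field 2 (comm) */
	char *end;
	int cur = 2;

	if (!p || field < 3)
		return -1;
	p++;

	while (*p) {
		while (*p == ' ')		/* skip separators */
			p++;
		if (!*p || *p == '\n')
			break;
		cur++;
		if (cur == field) {
			errno = 0;
			*out = strtoul(p, &end, 10);
			/* fails for non-numeric fields such as the state char */
			return (end == p || errno) ? -1 : 0;
		}
		while (*p && *p != ' ')		/* skip the current token */
			p++;
	}
	return -1;
}
```

With a line read from /proc/self/stat, stat_field(line, 26, &v) would
then yield start_code under the assumptions above.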

At restore time we need a mechanism to set those values back,
and for this purpose the PR_SET_MM prctl code is introduced.

Note that because this is a dangerous operation, the interface
is allowed for CAP_SYS_ADMIN only.
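
For illustration, the restore side could call the new interface from
userspace roughly as follows. This is only a sketch: restore_mm_field()
is a hypothetical wrapper, and the fallback constants are copied from
the prctl.h hunk of this patch in case the installed headers do not
define them. The call is expected to fail with EPERM for a caller
without the required capability.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallback values taken from the patch; headers may already define them. */
#ifndef PR_SET_MM
# define PR_SET_MM		35
#endif
#ifndef PR_SET_MM_BRK
# define PR_SET_MM_BRK		7
#endif

/*
 * Hypothetical wrapper: set one mm_struct field selected by `opt`
 * to `addr`. Returns 0 on success or -errno on failure.
 */
int restore_mm_field(int opt, unsigned long addr)
{
	/* arg4 and arg5 must be zero, the patch rejects anything else */
	if (prctl(PR_SET_MM, (unsigned long)opt, addr, 0UL, 0UL) < 0)
		return -errno;
	return 0;
}
```

A restorer would call, for example, restore_mm_field(PR_SET_MM_BRK,
saved_brk), where saved_brk stands for the value recorded at checkpoint
time.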

v2:
 - Check the vma start address as well; testing only the vma
   ending address is not enough. From Kees Cook.

 - Add some sanity tests for the assigned addresses.
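
To show why the start address matters, here is a small userspace model
of the check. It is a sketch only: struct area, find_area() and
addr_ok() are made-up stand-ins for vm_area_struct, find_vma() and the
flag tests in prctl_set_mm(), and the A_* flag bits are toy values.
find_vma() returns the first area whose end lies above the address, so
an address sitting in a hole below that area is still "found" and has
to be rejected by comparing against the start.

```c
#include <stddef.h>

/* Toy flag bits for the model (assumption, not the kernel's VM_* macros). */
#define A_READ	0x1UL
#define A_WRITE	0x2UL
#define A_EXEC	0x4UL

/* Simplified stand-in for struct vm_area_struct: covers [start, end). */
struct area {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/* Model of find_vma(): first area with end > addr in a sorted array. */
const struct area *find_area(const struct area *v, size_t n,
			     unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < v[i].end)
			return &v[i];
	return NULL;
}

/*
 * Returns 1 only if addr lies inside an area that has every `req`
 * flag set and none of the `bad` flags, 0 otherwise.
 */
int addr_ok(const struct area *v, size_t n, unsigned long addr,
	    unsigned long req, unsigned long bad)
{
	const struct area *a = find_area(v, n, addr);

	if (!a || a->start > addr)	/* no area, or addr is in a hole */
		return 0;
	return (a->flags & req) == req && !(a->flags & bad);
}
```

With areas [0x1000, 0x2000) and [0x4000, 0x5000), the address 0x3000 is
matched by find_area() to the second area but rejected by the start
comparison.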

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Kees Cook <keescook@chromium.org>
---
 include/linux/prctl.h |   12 +++++
 kernel/sys.c          |  118 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 130 insertions(+)

Index: linux-2.6.git/include/linux/prctl.h
===================================================================
--- linux-2.6.git.orig/include/linux/prctl.h
+++ linux-2.6.git/include/linux/prctl.h
@@ -102,4 +102,16 @@
 
 #define PR_MCE_KILL_GET 34
 
+/*
+ * Tune up process memory map specifics.
+ */
+#define PR_SET_MM		35
+# define PR_SET_MM_START_CODE		1
+# define PR_SET_MM_END_CODE		2
+# define PR_SET_MM_START_DATA		3
+# define PR_SET_MM_END_DATA		4
+# define PR_SET_MM_START_STACK		5
+# define PR_SET_MM_START_BRK		6
+# define PR_SET_MM_BRK			7
+
 #endif /* _LINUX_PRCTL_H */
Index: linux-2.6.git/kernel/sys.c
===================================================================
--- linux-2.6.git.orig/kernel/sys.c
+++ linux-2.6.git/kernel/sys.c
@@ -1692,6 +1692,118 @@ SYSCALL_DEFINE1(umask, int, mask)
 	return mask;
 }
 
+static int prctl_set_mm(int opt, unsigned long addr)
+{
+	unsigned long rlim = rlimit(RLIMIT_DATA);
+	unsigned long vm_req_flags;
+	unsigned long vm_bad_flags;
+	struct vm_area_struct *vma;
+	struct mm_struct *mm;
+	int error = -EINVAL;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (addr >= TASK_SIZE)
+		return -EINVAL;
+
+	mm = get_task_mm(current);
+	if (!mm)
+		return -ENOENT;
+
+	down_read(&mm->mmap_sem);
+	vma = find_vma(mm, addr);
+
+	if (opt != PR_SET_MM_START_BRK &&
+	    opt != PR_SET_MM_BRK) {
+		/* It must be existing VMA */
+		if (!vma || vma->vm_start > addr)
+			goto out;
+	}
+
+	error = -EINVAL;
+	switch (opt) {
+	case PR_SET_MM_START_CODE:
+	case PR_SET_MM_END_CODE:
+
+		vm_req_flags = VM_READ | VM_EXEC;
+		vm_bad_flags = VM_WRITE | VM_MAYSHARE;
+
+		if ((vma->vm_flags & vm_req_flags) != vm_req_flags ||
+		    (vma->vm_flags & vm_bad_flags))
+			goto out;
+
+		if (opt == PR_SET_MM_START_CODE)
+			current->mm->start_code = addr;
+		else
+			current->mm->end_code = addr;
+		break;
+
+	case PR_SET_MM_START_DATA:
+	case PR_SET_MM_END_DATA:
+
+		vm_req_flags = VM_READ | VM_WRITE;
+		vm_bad_flags = VM_EXEC | VM_MAYSHARE;
+
+		if ((vma->vm_flags & vm_req_flags) != vm_req_flags ||
+		    (vma->vm_flags & vm_bad_flags))
+			goto out;
+
+		if (opt == PR_SET_MM_START_DATA)
+			current->mm->start_data = addr;
+		else
+			current->mm->end_data = addr;
+		break;
+
+	case PR_SET_MM_START_STACK:
+
+#ifdef CONFIG_STACK_GROWSUP
+		vm_req_flags = VM_READ | VM_WRITE | VM_GROWSUP;
+#else
+		vm_req_flags = VM_READ | VM_WRITE | VM_GROWSDOWN;
+#endif
+		if ((vma->vm_flags & vm_req_flags) != vm_req_flags)
+			goto out;
+
+		current->mm->start_stack = addr;
+		break;
+
+	case PR_SET_MM_START_BRK:
+		if (addr <= mm->end_data)
+			goto out;
+
+		if (rlim < RLIM_INFINITY &&
+		    (mm->brk - addr) + (mm->end_data - mm->start_data) > rlim)
+			goto out;
+
+		current->mm->start_brk = addr;
+		break;
+
+	case PR_SET_MM_BRK:
+		if (addr <= mm->end_data)
+			goto out;
+
+		if (rlim < RLIM_INFINITY &&
+		    (addr - mm->start_brk) + (mm->end_data - mm->start_data) > rlim)
+			goto out;
+
+		current->mm->brk = addr;
+		break;
+
+	default:
+		error = -EINVAL;
+		goto out;
+	}
+
+	error = 0;
+
+out:
+	up_read(&mm->mmap_sem);
+	mmput(mm);
+
+	return error;
+}
+
 SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		unsigned long, arg4, unsigned long, arg5)
 {
@@ -1841,6 +1953,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
 			else
 				error = PR_MCE_KILL_DEFAULT;
 			break;
+		case PR_SET_MM: {
+			if (arg4 | arg5)
+				return -EINVAL;
+			error = prctl_set_mm(arg2, arg3);
+			break;
+		}
 		default:
 			error = -EINVAL;
 			break;
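
The RLIMIT_DATA test in the PR_SET_MM_BRK case above can be modelled in
userspace as follows. This is a sketch under stated assumptions:
brk_ok() is a hypothetical name, and the function only mirrors the
arithmetic of the patch (heap size plus data size compared against the
limit). Like the patch, it does not check that the new brk is at or
above start_brk, so the subtraction relies on unsigned wrap-around in
that case.

```c
#include <sys/resource.h>

/*
 * Hypothetical model of the PR_SET_MM_BRK sanity checks:
 * returns 1 if new_brk would be accepted, 0 if it would be rejected.
 */
int brk_ok(unsigned long new_brk, unsigned long start_brk,
	   unsigned long start_data, unsigned long end_data, rlim_t rlim)
{
	/* brk must lie above the end of the data segment */
	if (new_brk <= end_data)
		return 0;

	/* heap size plus data size must fit within RLIMIT_DATA */
	if (rlim < RLIM_INFINITY &&
	    (new_brk - start_brk) + (end_data - start_data) > rlim)
		return 0;

	return 1;
}
```

For example, with a data segment of 0x800 bytes and a limit of 0x2000
bytes, a heap of 0x1000 bytes passes while a heap of 0x2000 bytes is
rejected.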
