From: benjamin@sipsolutions.net
To: linux-um@lists.infradead.org
Cc: Benjamin Berg <benjamin.berg@intel.com>
Subject: [PATCH v2 3/5] um: Do a double clone to disable rseq
Date: Wed, 12 Jun 2024 15:48:02 +0200
Message-ID: <20240612134804.1626427-4-benjamin@sipsolutions.net>
In-Reply-To: <20240612134804.1626427-1-benjamin@sipsolutions.net>

From: Benjamin Berg <benjamin.berg@intel.com>

Newer glibc versions enable rseq support by default. This remains
enabled in the cloned child process, potentially causing the host kernel
to read/write memory in the child.

It appears that this has not been an issue so far purely because the
memory area used happened to be above TASK_SIZE and remained mapped.

Note that a better approach would be to exec a small static binary that
does not link with other libraries. Using a memfd and execveat, the
binary could be embedded into UML itself, resulting in an entirely clean
execution environment for userspace.
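
As a rough illustration of that alternative (not what this patch does),
the sketch below copies an executable into a memfd and runs it with
execveat. The demo_* names are hypothetical, and copying an existing
file stands in for a binary image embedded into UML. execveat is called
through syscall() in case the libc does not provide a wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copy the file at src_path into a new memfd; returns the fd or -errno. */
static int demo_copy_to_memfd(const char *src_path)
{
	char buf[4096];
	ssize_t n;
	int src, fd, err;

	assert(src_path != NULL);

	src = open(src_path, O_RDONLY | O_CLOEXEC);
	if (src < 0)
		return -errno;

	fd = memfd_create("demo-stub", MFD_CLOEXEC);
	if (fd < 0) {
		err = -errno;
		close(src);
		return err;
	}

	/* A short write is treated as an error to keep the sketch small. */
	while ((n = read(src, buf, sizeof(buf))) > 0) {
		if (write(fd, buf, n) != n) {
			n = -1;
			break;
		}
	}
	close(src);
	if (n < 0) {
		close(fd);
		return -EIO;
	}

	return fd;
}

/*
 * Fork, execute the memfd copy of src_path with an empty environment
 * and return the exit status of the child, or a negative errno.
 */
static int demo_spawn_from_memfd(const char *src_path, char *const argv[])
{
	char *const envp[] = { NULL };
	int fd, status, err;
	pid_t pid;

	fd = demo_copy_to_memfd(src_path);
	if (fd < 0)
		return fd;

	pid = fork();
	if (pid < 0) {
		err = -errno;
		close(fd);
		return err;
	}
	if (pid == 0) {
		/* An empty path plus AT_EMPTY_PATH executes the fd itself. */
		syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
		_exit(127);	/* only reached if execveat failed */
	}
	close(fd);

	if (waitpid(pid, &status, 0) < 0)
		return -errno;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -ECHILD;
}
```

For example, demo_spawn_from_memfd("/bin/sh", (char *[]){ "sh", "-c",
"exit 7", NULL }) should return 7 on a system that has /bin/sh.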

Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>

---

v2: Improved clone logic using CLONE_VFORK
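
Not part of the patch: a minimal standalone sketch of the two-stage
clone, for anyone who wants to try the process mechanics outside UML.
All demo_* names and the stack size are made up for illustration. Unlike
the patch, the sketch uses a separate stack for each stage, blocks in
waitpid() instead of using WNOHANG, and calls _exit() rather than
exit(). It does not itself check the rseq state of the children.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_STACK_SIZE (64 * 1024)

struct demo_tramp {
	int pid;		/* PID of the second-stage child, or -errno */
	void *stack_top;	/* stack for the second-stage child */
};

/* Second stage: own address space, stands in for the userspace process. */
static int demo_second_stage(void *arg)
{
	(void)arg;
	_exit(42);
}

/* First stage: shares the caller's VM, so *data is the caller's struct. */
static int demo_first_stage(void *data)
{
	struct demo_tramp *t = data;

	/* CLONE_PARENT makes the new process a child of the original caller. */
	t->pid = clone(demo_second_stage, t->stack_top,
		       CLONE_PARENT | SIGCHLD, NULL);
	if (t->pid < 0)
		t->pid = -errno;

	_exit(0);
}

/* Returns the second-stage exit status (42 on success) or a negative errno. */
static int demo_double_clone(void)
{
	struct demo_tramp t = { .pid = -EINVAL, .stack_top = NULL };
	int pid, status, ret;
	char *stacks;

	stacks = mmap(NULL, 2 * DEMO_STACK_SIZE, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (stacks == MAP_FAILED)
		return -errno;
	t.stack_top = stacks + 2 * DEMO_STACK_SIZE;

	/* CLONE_VFORK suspends the caller until the first stage has exited. */
	pid = clone(demo_first_stage, stacks + DEMO_STACK_SIZE,
		    CLONE_VM | CLONE_VFORK, &t);
	if (pid < 0) {
		ret = -errno;
		goto out;
	}

	/* The first stage has no exit signal, so reaping it needs __WCLONE. */
	if (waitpid(pid, &status, __WCLONE) < 0) {
		ret = -errno;
		goto out;
	}
	if (t.pid < 0) {
		ret = t.pid;
		goto out;
	}

	/* __WALL matches the second stage whatever its exit signal is. */
	if (waitpid(t.pid, &status, __WALL) < 0)
		ret = -errno;
	else
		ret = WIFEXITED(status) ? WEXITSTATUS(status) : -ECHILD;
out:
	munmap(stacks, 2 * DEMO_STACK_SIZE);
	return ret;
}
```

Calling demo_double_clone() should return 42, the exit status of the
second-stage child, which is a direct child of the caller because of
CLONE_PARENT.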
---
 arch/um/os-Linux/skas/process.c | 55 ++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 5 deletions(-)

diff --git a/arch/um/os-Linux/skas/process.c b/arch/um/os-Linux/skas/process.c
index 41a288dcfc34..11bc6f4ce5b3 100644
--- a/arch/um/os-Linux/skas/process.c
+++ b/arch/um/os-Linux/skas/process.c
@@ -255,6 +255,32 @@ static int userspace_tramp(void *stack)
 int userspace_pid[NR_CPUS];
 int kill_userspace_mm[NR_CPUS];
 
+struct tramp_data {
+	int pid;
+	void *clone_sp;
+	void *stack;
+};
+
+static int userspace_tramp_clone_vm(void *data)
+{
+	struct tramp_data *tramp_data = data;
+
+	/*
+	 * At this point we are still in the same VM as the parent, but rseq
+	 * has been disabled for this process.
+	 * Continue with the clone into the new userspace process, the kernel
+	 * continues as soon as this process quits (CLONE_VFORK).
+	 */
+
+	tramp_data->pid = clone(userspace_tramp, tramp_data->clone_sp,
+				CLONE_PARENT | CLONE_FILES | SIGCHLD,
+				tramp_data->stack);
+	if (tramp_data->pid < 0)
+		tramp_data->pid = -errno;
+
+	exit(0);
+}
+
 /**
  * start_userspace() - prepare a new userspace process
  * @stub_stack:	pointer to the stub stack.
@@ -268,9 +294,10 @@ int kill_userspace_mm[NR_CPUS];
  */
 int start_userspace(unsigned long stub_stack)
 {
+	struct tramp_data tramp_data;
 	void *stack;
 	unsigned long sp;
-	int pid, status, n, flags, err;
+	int pid, status, n, err;
 
 	/* setup a temporary stack page */
 	stack = mmap(NULL, UM_KERN_PAGE_SIZE,
@@ -286,10 +313,13 @@ int start_userspace(unsigned long stub_stack)
 	/* set stack pointer to the end of the stack page, so it can grow downwards */
 	sp = (unsigned long)stack + UM_KERN_PAGE_SIZE;
 
-	flags = CLONE_FILES | SIGCHLD;
+	tramp_data.stack = (void *) stub_stack;
+	tramp_data.clone_sp = (void *) sp;
+	tramp_data.pid = -EINVAL;
 
-	/* clone into new userspace process */
-	pid = clone(userspace_tramp, (void *) sp, flags, (void *) stub_stack);
+	/* first stage CLONE_VM clone using VFORK and no signal notification */
+	pid = clone(userspace_tramp_clone_vm, (void *) sp,
+		    CLONE_VM | CLONE_FILES | CLONE_VFORK, &tramp_data);
 	if (pid < 0) {
 		err = -errno;
 		printk(UM_KERN_ERR "%s : clone failed, errno = %d\n",
@@ -297,6 +327,21 @@ int start_userspace(unsigned long stub_stack)
 		return err;
 	}
 
+	n = waitpid(pid, &status, WUNTRACED | WNOHANG | __WCLONE);
+	if (n < 0 || !WIFEXITED(status) || WEXITSTATUS(status)) {
+		err = -errno;
+		printk(UM_KERN_ERR "%s : wait failed, errno = %d, status = %d\n",
+		       __func__, n < 0 ? errno : 0, status);
+		goto out_kill;
+	}
+
+	pid = tramp_data.pid;
+	if (pid < 0) {
+		printk(UM_KERN_ERR "%s : second clone failed, errno = %d\n",
+		       __func__, -pid);
+		return pid;
+	}
+
 	do {
 		CATCH_EINTR(n = waitpid(pid, &status, WUNTRACED | __WALL));
 		if (n < 0) {
@@ -305,7 +350,7 @@ int start_userspace(unsigned long stub_stack)
 			       __func__, errno);
 			goto out_kill;
 		}
-	} while (WIFSTOPPED(status) && (WSTOPSIG(status) == SIGALRM));
+	} while (WIFSTOPPED(status) && (WSTOPSIG(status) == SIGALRM));
 
 	if (!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGSTOP)) {
 		err = -EINVAL;
-- 
2.45.1


