From: Benjamin Berg <benjamin@sipsolutions.net>
To: Tiwei Bie <tiwei.btw@antgroup.com>, linux-um@lists.infradead.org
Subject: Re: [PATCH 3/5] um: Do a double clone to disable rseq
Date: Tue, 28 May 2024 12:30:31 +0200
Message-ID: <388619aff5c8372998992a9f28fd7c71fb36eb51.camel@sipsolutions.net>
In-Reply-To: <5f5505da-ba67-421e-b0f5-6a5c19955f27@antgroup.com>

Hi Tiwei,

On Tue, 2024-05-28 at 18:16 +0800, Tiwei Bie wrote:
> On 5/28/24 4:54 PM, benjamin@sipsolutions.net wrote:
> > From: Benjamin Berg <benjamin.berg@intel.com>
> >
> > Newer glibc versions are enabling rseq support by default. This remains
> > enabled in the cloned child process, potentially causing the host kernel
> > to write/read memory in the child.
> >
> > It appears that this was not an issue purely because the memory area
> > in use happened to be above TASK_SIZE and remained mapped.
>
> I also encountered this issue. In my case, with "Force a static link"
> (CONFIG_STATIC_LINK) enabled, UML will crash immediately every time
> it starts up. I worked around this by setting the glibc.pthread.rseq
> tunable via GLIBC_TUNABLES [1] before launching UML.
>
> So another easy way to work around this issue without introducing runtime
> overhead might be to add the GLIBC_TUNABLES=glibc.pthread.rseq=0 environment
> variable and exec /proc/self/exe in UML on startup.

I am not really worried about the overhead, but I agree that setting
GLIBC_TUNABLES is also a reasonable solution to the problem.
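
For reference, the GLIBC_TUNABLES re-exec idea could look roughly like
the sketch below. This is only an illustration and not code from this
series: the helper names (rseq_already_disabled, build_tunables,
reexec_with_rseq_disabled) and the 4096-byte buffer are made up, and it
assumes /proc is mounted so that /proc/self/exe can be executed.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define RSEQ_OFF "glibc.pthread.rseq=0"

/* Returns 1 if the tunables string already disables rseq, 0 otherwise. */
int rseq_already_disabled(const char *tunables)
{
	return tunables != NULL && strstr(tunables, RSEQ_OFF) != NULL;
}

/*
 * Build the new GLIBC_TUNABLES value in buf, keeping any existing tunables
 * (glibc separates them with ':'). Returns 0 on success, -1 if buf is too
 * small.
 */
int build_tunables(const char *old, char *buf, size_t len)
{
	int n;

	if (old != NULL && *old != '\0')
		n = snprintf(buf, len, "%s:%s", old, RSEQ_OFF);
	else
		n = snprintf(buf, len, "%s", RSEQ_OFF);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical entry point, meant to be called early in main(). Returns 0
 * if rseq is already disabled (for example after the re-exec) and -1 on
 * error. It does not return when execv() succeeds.
 */
int reexec_with_rseq_disabled(char **argv)
{
	const char *old = getenv("GLIBC_TUNABLES");
	char buf[4096];

	if (rseq_already_disabled(old))
		return 0;
	if (build_tunables(old, buf, sizeof(buf)) < 0)
		return -1;
	if (setenv("GLIBC_TUNABLES", buf, 1) < 0)
		return -1;

	execv("/proc/self/exe", argv);
	return -1; /* only reached when execv() failed */
}
```

The check at the top is what prevents an exec loop: after the re-exec
the variable is already set, so the function simply returns.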

Doing the memfd/execveat dance with an embedded static binary would
still be best in my view, but either this or GLIBC_TUNABLES seems fine
in the meantime.
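
To make the memfd/execveat idea a bit more concrete, here is a rough
sketch. It is not part of this series: blob_to_memfd and exec_memfd are
hypothetical names, the memfd name "uml-userspace-stub" is arbitrary, and
the embedded binary is assumed to be available as a plain byte array. It
uses glibc's memfd_create() wrapper and calls execveat via syscall(), as
older glibc versions do not provide an execveat() wrapper.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Copy an embedded binary image into an anonymous in-memory file.
 * Returns the file descriptor, or -1 on error.
 */
int blob_to_memfd(const void *blob, size_t len)
{
	const char *p = blob;
	int fd = memfd_create("uml-userspace-stub", MFD_CLOEXEC);

	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n <= 0) {
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	return fd;
}

/*
 * Execute the file behind fd. With AT_EMPTY_PATH and an empty path,
 * execveat() runs the file the descriptor refers to. Only returns (-1)
 * if the exec failed.
 */
int exec_memfd(int fd, char *const argv[], char *const envp[])
{
	syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
	return -1;
}
```

Passing an empty envp to exec_memfd would give the new process a clean
environment, and a static binary that links against nothing never
registers rseq in the first place.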

Do you want to submit the patch? Should I re-roll the patchset with
GLIBC_TUNABLES?

Benjamin

> [1] https://www.gnu.org/software/libc/manual/html_node/Tunables.html
>
> Regards,
> Tiwei
>
> >
> > Note that a better approach would be to exec a small static binary that
> > does not link with other libraries. Using a memfd and execveat the
> > binary could be embedded into UML itself and it would result in an
> > entirely clean execution environment for userspace.
> >
> > Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
> > ---
> > arch/um/os-Linux/skas/process.c | 54 ++++++++++++++++++++++++++++++---
> > 1 file changed, 50 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/um/os-Linux/skas/process.c b/arch/um/os-Linux/skas/process.c
> > index 41a288dcfc34..ee332a2aeea6 100644
> > --- a/arch/um/os-Linux/skas/process.c
> > +++ b/arch/um/os-Linux/skas/process.c
> > @@ -255,6 +255,31 @@ static int userspace_tramp(void *stack)
> > int userspace_pid[NR_CPUS];
> > int kill_userspace_mm[NR_CPUS];
> >
> > +struct tramp_data {
> > +	int pid;
> > +	void *clone_sp;
> > +	void *stack;
> > +};
> > +
> > +static int userspace_tramp_clone_vm(void *data)
> > +{
> > +	struct tramp_data *tramp_data = data;
> > +
> > +	/*
> > +	 * This helper exists to do a double-clone. First with CLONE_VM which
> > +	 * effectively disables things like rseq, and then the second one to
> > +	 * get a new memory space.
> > +	 */
> > +
> > +	tramp_data->pid = clone(userspace_tramp, tramp_data->clone_sp,
> > +				CLONE_PARENT | CLONE_FILES | SIGCHLD,
> > +				tramp_data->stack);
> > +	if (tramp_data->pid < 0)
> > +		tramp_data->pid = -errno;
> > +
> > +	exit(0);
> > +}
> > +
> > /**
> > * start_userspace() - prepare a new userspace process
> > * @stub_stack: pointer to the stub stack.
> > @@ -268,9 +293,10 @@ int kill_userspace_mm[NR_CPUS];
> > */
> > int start_userspace(unsigned long stub_stack)
> > {
> > +	struct tramp_data tramp_data;
> >  	void *stack;
> >  	unsigned long sp;
> > -	int pid, status, n, flags, err;
> > +	int pid, status, n, err;
> >
> >  	/* setup a temporary stack page */
> >  	stack = mmap(NULL, UM_KERN_PAGE_SIZE,
> > @@ -286,10 +312,13 @@ int start_userspace(unsigned long stub_stack)
> >  	/* set stack pointer to the end of the stack page, so it can grow downwards */
> >  	sp = (unsigned long)stack + UM_KERN_PAGE_SIZE;
> >
> > -	flags = CLONE_FILES | SIGCHLD;
> > +	tramp_data.stack = (void *) stub_stack;
> > +	tramp_data.clone_sp = (void *) sp;
> > +	tramp_data.pid = -EINVAL;
> >
> >  	/* clone into new userspace process */
> > -	pid = clone(userspace_tramp, (void *) sp, flags, (void *) stub_stack);
> > +	pid = clone(userspace_tramp_clone_vm, (void *) sp,
> > +		    CLONE_VM | CLONE_FILES | SIGCHLD, &tramp_data);
> >  	if (pid < 0) {
> >  		err = -errno;
> >  		printk(UM_KERN_ERR "%s : clone failed, errno = %d\n",
> > @@ -305,7 +334,24 @@ int start_userspace(unsigned long stub_stack)
> >  			       __func__, errno);
> >  			goto out_kill;
> >  		}
> > -	} while (WIFSTOPPED(status) && (WSTOPSIG(status) == SIGALRM));
> > +	} while (!WIFEXITED(status));
> > +
> > +	pid = tramp_data.pid;
> > +	if (pid < 0) {
> > +		printk(UM_KERN_ERR "%s : second clone failed, errno = %d\n",
> > +		       __func__, -pid);
> > +		return pid;
> > +	}
> > +
> > +	do {
> > +		CATCH_EINTR(n = waitpid(pid, &status, WUNTRACED | __WALL));
> > +		if (n < 0) {
> > +			err = -errno;
> > +			printk(UM_KERN_ERR "%s : wait failed, errno = %d\n",
> > +			       __func__, errno);
> > +			goto out_kill;
> > +		}
> > +	} while (WIFSTOPPED(status) && (WSTOPSIG(status) == SIGALRM));
> >
> >  	if (!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGSTOP)) {
> >  		err = -EINVAL;
>
>
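
For anyone who wants to try the double-clone idea from the patch outside
of UML, a small standalone sketch might look like the following. It is
not the patch itself: the names (demo_data, demo_child,
demo_intermediate, double_clone_demo), the exit code 42 and the 64 KiB
stacks are invented for the example, it uses two separate stacks rather
than reusing one page, and it skips EINTR handling and ptrace entirely.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_STACK_SIZE (64 * 1024)

struct demo_data {
	volatile int pid;	/* filled in by the intermediate process */
	void *child_sp;		/* stack top for the second clone */
};

/* Final process: own address space, forked from a task without rseq. */
int demo_child(void *arg)
{
	(void)arg;
	_exit(42);
}

/*
 * Intermediate process: shares the caller's memory (CLONE_VM), so the
 * kernel drops the rseq registration for it. It clones the final process
 * without CLONE_VM and hands it to the original parent via CLONE_PARENT.
 */
int demo_intermediate(void *arg)
{
	struct demo_data *d = arg;
	int pid = clone(demo_child, d->child_sp, CLONE_PARENT | SIGCHLD, NULL);

	d->pid = pid < 0 ? -errno : pid;
	_exit(0);
}

/* Returns the exit status of the final process, or a negative errno. */
int double_clone_demo(void)
{
	struct demo_data d = { .pid = -EINVAL };
	int pid, status, ret;
	char *stacks = mmap(NULL, 2 * DEMO_STACK_SIZE, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (stacks == MAP_FAILED)
		return -errno;

	/* lower half: intermediate stack, upper half: final child stack */
	d.child_sp = stacks + 2 * DEMO_STACK_SIZE;
	pid = clone(demo_intermediate, stacks + DEMO_STACK_SIZE,
		    CLONE_VM | SIGCHLD, &d);
	if (pid < 0) {
		ret = -errno;
		goto out;
	}

	/* the intermediate exits as soon as the second clone is done */
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
		ret = -ECHILD;
		goto out;
	}
	if (d.pid < 0) {
		ret = d.pid;
		goto out;
	}

	/* CLONE_PARENT made the final process our child, so we can wait */
	if (waitpid(d.pid, &status, 0) < 0 || !WIFEXITED(status)) {
		ret = -ECHILD;
		goto out;
	}
	ret = WEXITSTATUS(status);
out:
	munmap(stacks, 2 * DEMO_STACK_SIZE);
	return ret;
}
```

The intermediate writes the second PID into memory it shares with the
caller, which is why the caller can read d.pid after the first waitpid()
returns.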
Thread overview: 15+ messages
2024-05-28 8:54 [PATCH 0/5] Increased address space for 64 bit benjamin
2024-05-28 8:54 ` [PATCH 1/5] um: Fix stub_start address calculation benjamin
2024-05-28 8:54 ` [PATCH 2/5] um: Limit TASK_SIZE to the addressable range benjamin
2024-05-28 8:54 ` [PATCH 3/5] um: Do a double clone to disable rseq benjamin
2024-05-28 10:16 ` Tiwei Bie
2024-05-28 10:30 ` Benjamin Berg [this message]
2024-05-28 11:03 ` Tiwei Bie
2024-05-28 11:57 ` Johannes Berg
2024-05-28 14:13 ` Tiwei Bie
2024-05-30 2:54 ` Tiwei Bie
2024-05-30 8:54 ` Benjamin Berg
2024-05-30 14:05 ` Tiwei Bie
2024-05-28 8:54 ` [PATCH 4/5] um: Discover host_task_size from envp benjamin
2024-05-28 8:54 ` [PATCH 5/5] um: Add 4 level page table support benjamin
2024-05-30 3:07 ` Tiwei Bie