From: Louis Rilling <Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org>
To: Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: [RFC][PATCH] clone_with_pids()^w eclone() for x86_64
Date: Thu, 19 Nov 2009 22:26:47 +0100
Message-ID: <20091119212646.GA4767@localdomain>
In-Reply-To: <1258652929.20093.8941.camel@nimitz>
On Thu, Nov 19, 2009 at 09:48:49AM -0800, Dave Hansen wrote:
> On Thu, 2009-11-19 at 10:58 +0100, Louis Rilling wrote:
> > > int clone_with_pids(long flags_low, struct clone_args *clone_args, long args_size,
> > > int *pids)
> > > {
> > > long retval;
> > >
> > > __asm__ __volatile__(
> > > "movq %3, %%r10\n\t" /* pids in r10*/
> > > "pushq %%rbp\n\t" /* save value of ebp */
> > > :
> > > :"D" (flags_low), /* rdi */
> > > "S" (clone_args),/* rsi */
> > > "d" (args_size), /* rdx */
> > > "a" (pids) /* use rax, which gets moved to r10 */
> > > );
> >
> > 1. The fourth C arg is not in rax, but in rcx.
>
> Hey Louis,
>
> So, try as I might, I couldn't get that to work. I thought it was rcx,
> too.
>
> So, changing that instruction to:
>
> "movq %3, %%rcx\n\t" /* pids in r10*/
Hm, no.
I meant (without taking into account my other comments):
	__asm__ __volatile__(
		"movq %3, %%r10\n\t"	/* pids in r10 */
		"pushq %%rbp\n\t"	/* save value of rbp */
		:
		:"D" (flags_low),	/* rdi */
		 "S" (clone_args),	/* rsi */
		 "d" (args_size),	/* rdx */
		 "c" (pids)		/* use rcx, which gets moved to r10 */
		);
But actually this is even better :D:
	__asm__ __volatile__(
		"movq %3, %%r10\n\t"	/* pids in r10 */
		"pushq %%rbp\n\t"	/* save value of rbp */
		:
		:"D" (flags_low),	/* rdi */
		 "S" (clone_args),	/* rsi */
		 "d" (args_size),	/* rdx */
		 "r10" (pids)		/* Linux reads its fourth arg from r10 */
		);
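
A caveat on the snippet just above: as far as I know, GCC's x86 constraint letters do not include one that names r10, so the `"r10" (pids)` spelling is unlikely to do what is intended. The usual way to pin an operand to r10 is a local register variable combined with the generic "r" constraint. Below is a minimal sketch of that idea. The helper name raw_syscall4() is hypothetical and not from this thread, and the non-x86_64 branch is only a fallback so the sketch builds elsewhere.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: issue a four-argument system call.
 * On Linux/x86_64 it returns the raw kernel value (negative errno on error). */
static long raw_syscall4(long nr, long a1, long a2, long a3, long a4)
{
#if defined(__linux__) && defined(__x86_64__)
	long ret;
	/* Pin the fourth argument to r10, where the kernel reads it. */
	register long r10 __asm__("r10") = a4;

	__asm__ __volatile__(
		"syscall"
		: "=a" (ret)
		: "0" (nr), "D" (a1), "S" (a2), "d" (a3), "r" (r10)
		: "rcx", "r11", "memory");	/* syscall overwrites rcx and r11 */
	return ret;
#else
	/* Fallback for other platforms: libc wrapper, returns -1 and sets errno. */
	return syscall(nr, a1, a2, a3, a4);
#endif
}
```

With the register variable in place, no explicit "movq ..., %%r10" is needed, and the whole call fits in one asm statement.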
>
> and putting 0x11111, etc... in for the args the strace output for the
> syscall looks like this:
>
> syscall_299(0x11111, 0x22222, 0x33333, 0x1, 0x1, 0x2, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0) = -1 (errno 22)
>
> and I get -EFAULT back from the function doing the copy_from_user() of
> the pids argument, even when using good values.
>
> If I use the asm posted above, I get this:
>
> syscall_299(0x11111, 0x22222, 0x33333, 0x44444, 0x1, 0x2, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0) = -1 (errno 22)
>
> Or, this from a real call:
>
> syscall_299(0x1100011, 0x7fff19f0fd40, 0x38, 0x602070, 0x1, 0x2,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0[2992, 377]: Child:
>
> I had to find r10 basically by trial and error. I have no idea why it
> works.
r10 is used to pass the fourth arg to the kernel because the syscall
instruction itself overwrites rcx with the next rip (the return address), so
whatever userspace put in rcx is lost before the kernel can read it. Using r10
instead of rcx is defined as part of the Linux ABI for x86_64.
For all the details, read the comments in
arch/x86/kernel/entry_64.S:ENTRY(system_call).
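
The rcx behaviour can be observed from userspace. The sketch below is illustrative only: the function name rcx_survives_syscall() is made up for this example, getpid is used simply because it takes no arguments, and on anything other than Linux/x86_64 the probe does nothing and returns 0.

```c
#include <sys/syscall.h>

/* Hypothetical probe: returns 1 if rcx still holds the sentinel after a
 * getpid system call, 0 if the value was overwritten. */
static int rcx_survives_syscall(void)
{
#if defined(__linux__) && defined(__x86_64__)
	unsigned long rcx = 0x44444;	/* sentinel loaded into rcx */
	long ret = SYS_getpid;		/* syscall number in, result out */

	__asm__ __volatile__(
		"syscall"
		: "+a" (ret), "+c" (rcx)	/* rcx is read back after the call */
		:
		: "r11", "memory");
	(void)ret;
	return rcx == 0x44444;
#else
	return 0;	/* not Linux/x86_64: nothing to probe */
#endif
}
```

On Linux/x86_64 the function is expected to return 0, since rcx comes back holding the address of the instruction after syscall rather than the sentinel.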
>
> > >
> > > __asm__ __volatile__(
> > > "syscall\n\t" /* Linux/x86_64 system call */
> > > "testq %0,%0\n\t" /* check return value */
> > > "jne 1f\n\t" /* jump if parent */
> > > "popq %%rbx\n\t" /* get subthread function */
> > > "call *%%rbx\n\t" /* start subthread function */
> > > "movq %2,%0\n\t"
> > > "syscall\n" /* exit system call: exit subthread */
> > > "1:\n\t"
> > > "popq %%rbp\t" /* restore parent's ebp */
> > > :"=a" (retval)
> > > :"0" (__NR_clone3), "i" (__NR_exit)
> > > :"ebx", "ecx", "edx"
> > > );
> >
> > 2. You should probably not separate this into two asm statements. In particular,
> > the compiler has no way to know that r10 should be preserved between the two
> > statements, and may be confused by the change of rsp.
>
> Yeah, I wondered about that. Suka, we should probably fix your tests
> and the i386 code, too.
>
> > 3. r10 and r11 should be listed as clobbered.
>
> D'oh! I didn't even touch the bottom registers because it continued to
> work from the i386 version that I stole from Suka.
That's again because of the syscall instruction, which saves RFLAGS to r11
(and sysret restores RFLAGS from r11), so r11 does not survive the call.
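
The same kind of probe shows why r11 belongs in the clobber list. This is a sketch under assumptions: r11_after_syscall() is a hypothetical name, getpid is an arbitrary harmless system call, and the non-x86_64 branch merely returns a nonzero placeholder.

```c
#include <sys/syscall.h>

/* Hypothetical probe: load r11 with 0, issue a getpid system call, and
 * return whatever r11 holds afterwards. */
static unsigned long r11_after_syscall(void)
{
#if defined(__linux__) && defined(__x86_64__)
	register unsigned long r11 __asm__("r11") = 0;
	long ret = SYS_getpid;		/* syscall number in, result out */

	__asm__ __volatile__(
		"syscall"
		: "+a" (ret), "+r" (r11)	/* r11 is read back after the call */
		:
		: "rcx", "memory");
	(void)ret;
	return r11;
#else
	return ~0UL;	/* not Linux/x86_64: placeholder, nothing to probe */
#endif
}
```

On Linux/x86_64 the result should be nonzero, because r11 comes back holding the saved flags word instead of the 0 that was loaded.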
>
> > 4. I fail to see the magic that puts the subthread function pointer in the
> > stack.
> >
> > 5. Maybe rdi should contain the subthread argument before calling the subthread?
> >
> > 6. rdi, rsi, rdx, rcx, r8 and r9 should be added to the clobber list because of
> > the call to the subthread function.
> >
> > 7. rsi could be used in place of rbx to hold the function pointer, which would
> > allow you to remove ebx from the clobber list.
> >
> > 8. I don't see why rbp should be saved. The ABI says it must be saved by the
> > callee.
> >
> > 9. Before calling exit(), maybe put some exit code in rdi?
>
> Thanks for looking through this, Louis. I'll send out another version
> in a bit.
Thanks,
Louis
--
Dr Louis Rilling Kerlabs
Skype: louis.rilling Batiment Germanium
Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes
http://www.kerlabs.com/ 35700 Rennes