public inbox for linux-kernel@vger.kernel.org
From: Andrew Lutomirski <luto@MIT.EDU>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	mingo@redhat.com, Richard Weinberger <richard@nod.at>,
	user-mode-linux-devel@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
Subject: Re: SYSCALL, ptrace and syscall restart breakages (Re: [RFC] weird crap with vdso on uml/i386)
Date: Tue, 23 Aug 2011 01:10:41 -0400
Message-ID: <4E533651.8070205@mit.edu>
In-Reply-To: <4E51E325.2050502@zytor.com>

On 08/22/2011 01:03 AM, H. Peter Anvin wrote:
> On 08/21/2011 09:26 PM, Al Viro wrote:
>> On Sun, Aug 21, 2011 at 09:11:54PM -0700, H. Peter Anvin wrote:
>>>> lack of point - the *only* CPU where it would matter would be K6-2, IIRC,
>>>> and (again, IIRC) it had some differences in SYSCALL semantics compared to
>>>> K7 (which supports SYSENTER as well).  Bugger if I remember what those
>>>> differences might've been...  Some flag not cleared?
>>>
>>> The most likely reason for a binary to execute a stray SYSCALL is
>>> because they read it out of the vdso.  Totally daft, but we certainly
>>> see a lot of stupid things as evidenced by the JIT thread earlier this
>>> month.
>>
>> Um...  What, blindly, no matter what surrounds it in there?  What will
>> happen to the same eager JIT when it steps on SYSENTER?
> 
> The JIT will have had to manage SYSENTER already.  It's not a change,
> whereas SYSCALL would be.  We could just try it, and see if anything
> breaks, of course.

Here's a possible solution that works for standalone SYSCALL and vdso
SYSCALL.  The idea is to preserve the exact same SYSCALL invocation
sequence.  Logically, the SYSCALL instruction does:

push %ebp
mov %ebp,%ecx
mov 4(%esp),%ebp
call __fake_int80

and __fake_int80 is:
int 0x80
mov 4(%esp),%ebp
ret $4
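
As an illustration only (not part of the patch), the register and stack
shuffle above can be traced with a toy model. Everything here is an
assumption made for the sketch: registers are a dict, the stack is a
Python list whose last element is the top, and the symbolic values
"arg2" and "arg6" stand for the second and sixth syscall arguments as
the vdso leaves them just before SYSCALL.

```python
# Toy model of the "logical SYSCALL" expansion plus the __fake_int80 helper.
# Not kernel code: it only tracks which value ends up in which register.

def logical_syscall(regs, stack, int80):
    """Apply the four-step expansion, then the helper body, to regs/stack."""
    # push %ebp
    stack.append(regs["ebp"])
    # mov %ebp,%ecx
    regs["ecx"] = regs["ebp"]
    # mov 4(%esp),%ebp  -> the word just below the one we pushed
    regs["ebp"] = stack[-2]
    # call __fake_int80 -> pushes a return address
    stack.append("return-address")

    # __fake_int80: int 0x80 (the callback sees a snapshot of the registers)
    int80(dict(regs))
    # mov 4(%esp),%ebp  -> skip the return address, reload the pushed word
    regs["ebp"] = stack[-2]
    # ret $4 -> pop the return address, then discard one more 4-byte word
    stack.pop()
    stack.pop()


# State as the vdso leaves it: original ebp (arg6) on the stack, arg2 in ebp.
seen = {}
regs = {"ecx": "arg2", "ebp": "arg2"}
stack = ["arg6"]
logical_syscall(regs, stack, seen.update)
print(seen, regs, stack)
```

In this model the int 0x80 step sees arg2 in ecx and arg6 in ebp, which
is the int 0x80 argument convention, and on return ebp holds arg2 again
with the stack back to its starting depth.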


The entire system call sequence is then (effectively):

push %ebp
movl %ecx,%ebp

; "SYSCALL" starts here
push %ebp
mov %ebp,%ecx
mov 4(%esp),%ebp
call __fake_int80
; "SYSCALL" ends here

movl %ebp,%ecx
popl %ebp
ret

So we rearrange ebp and ecx and then immediately rearrange them back.
The landing point tweaks them again so that we preserve the old
semantics of SYSCALL.  But now the pt_regs values exactly match what
would have happened if we entered via the int 0x80 path, so there
shouldn't be any corner cases with ptrace or restart -- as far as either
one is concerned, we actually entered via int 0x80.  If we deliver a
signal, the signal handler returns to the int 0x80 instruction.

Am I missing something?  Extremely buggy, incomplete code that
implements this is:


diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index a0e866d..6cda8ce 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -291,24 +291,59 @@ ENTRY(ia32_cstar_target)
 	ENABLE_INTERRUPTS(CLBR_NONE)
 	SAVE_ARGS 8,0,0
 	movl 	%eax,%eax	/* zero extension */
-	movq	%rax,ORIG_RAX-ARGOFFSET(%rsp)
-	movq	%rcx,RIP-ARGOFFSET(%rsp)
-	CFI_REL_OFFSET rip,RIP-ARGOFFSET
-	movq	%rbp,RCX-ARGOFFSET(%rsp) /* this lies slightly to ptrace */
-	movl	%ebp,%ecx
+
+	/*
+	 * This does (from the user's point of view):
+	 * push %ebp
+	 * mov %ebp, %ecx
+	 * mov 4(%esp), %ebp
+	 * call <function that does int 0x80; mov 4(%esp),%ebp; ret 4>
+	 *
+	 * User address access does not need access_ok check as r8
+	 * has been zero-extended, so even with the offsets it cannot
+	 * exceed 2**32 + 8.
+	 */
+
+	/* XXX: need to check that vdso actually exists. */
+	/* XXX: ia32_badarg may do bad things to the user state. */
+
+	/* move ebp into place on the user stack */
+	1:	movl	%ebp,-4(%r8)
+	.section __ex_table,"a"
+	.quad 1b,ia32_badarg
+	.previous
+
+	/* move eip into place on the user stack */
+	1:	movl	%ecx,-8(%r8)  /* user eip is in ecx */
+	.section __ex_table,"a"
+	.quad 1b,ia32_badarg
+	.previous
+
+	/* move ebp to ecx in CPU registers and argument save area */
+	mov %ebp,%ecx
+	movq %rcx,RCX-ARGOFFSET(%rsp)
+
+	/*
+	 * move arg6 to ebp in CPU registers and argument save area
+	 * minor optimization: the actual value of ebp is irrelevant,
+	 * so stick it straight into r9d -- see the definition of
+	 * IA32_ARG_FIXUP.
+	 */
+1:	movl	(%r8),%r9d
+	.section __ex_table,"a"
+	.quad 1b,ia32_badarg
+	.previous	
+
+	/* Do the fake call */
+	movl [insert address of int 0x80; ret helper + 2 here],RIP-ARGOFFSET(%rsp)
+	subl $8,%r8 /* we pushed twice */
+
 	movq	$__USER32_CS,CS-ARGOFFSET(%rsp)
 	movq	$__USER32_DS,SS-ARGOFFSET(%rsp)
 	movq	%r11,EFLAGS-ARGOFFSET(%rsp)
 	/*CFI_REL_OFFSET rflags,EFLAGS-ARGOFFSET*/
 	movq	%r8,RSP-ARGOFFSET(%rsp)	
 	CFI_REL_OFFSET rsp,RSP-ARGOFFSET
-	/* no need to do an access_ok check here because r8 has been
-	   32bit zero extended */ 
-	/* hardware stack frame is complete now */	
-1:	movl	(%r8),%r9d
-	.section __ex_table,"a"
-	.quad 1b,ia32_badarg
-	.previous	
 	GET_THREAD_INFO(%r10)
 	orl   $TS_COMPAT,TI_status(%r10)
 	testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags(%r10)
diff --git a/arch/x86/vdso/vdso32/syscall.S b/arch/x86/vdso/vdso32/syscall.S
index 5415b56..a3e48b0 100644
--- a/arch/x86/vdso/vdso32/syscall.S
+++ b/arch/x86/vdso/vdso32/syscall.S
@@ -19,8 +19,8 @@ __kernel_vsyscall:
 .Lpush_ebp:
 	movl	%ecx, %ebp
 	syscall
-	movl	$__USER32_DS, %ecx
-	movl	%ecx, %ss
+	/* The ret in the fake int80 entry lands here */
+	/* ss is already correct AFAICS */
 	movl	%ebp, %ecx
 	popl	%ebp
 .Lpop_ebp:
@@ -28,6 +28,11 @@ __kernel_vsyscall:
 .LEND_vsyscall:
 	.size __kernel_vsyscall,.-.LSTART_vsyscall
 
+__kernel_vsyscall_fake_int80:
+	int $0x80
+	mov 4(%esp),%ebp
+	ret $4
+
 	.section .eh_frame,"a",@progbits
 .LSTARTFRAME:
 	.long .LENDCIE-.LSTARTCIE


This could be further simplified by checking whether any work flags are
set and, if so, bailing immediately to the right place in the int 0x80
entry.

--Andy
