public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC] [PATCH] Double syscall exit traces on x86_64
@ 2006-05-26  3:24 Jeff Dike
  2006-05-26 10:36 ` [discuss] " Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Dike @ 2006-05-26  3:24 UTC (permalink / raw)
  To: discuss, Andi Kleen
  Cc: linux-kernel, User-mode-linux-devel, Steven James, Roland McGrath,
	Blaisorblade

We are seeing double ptrace notifications of system call returns on recent
x86_64 kernels.  This breaks UML and at least one other app.

The patch below appears to fix the problem.  The bug is caused by both
syscall_trace and int_very_careful both calling syscall_trace_leave,
and the system call tracing path going through int_very_careful.

I would have liked to get rid of one or the other call to
syscall_trace_leave.  However, the syscall_trace path looks like it
can exit to userspace without going through int_very_careful, and
int_very_careful does things other than system call tracing.

So, instead, I took _TIF_SYSCALL_TRACE and _TIF_SYSCALL_AUDIT out of
the flags test on the grounds that they had already been checked in
syscall_trace.  There is possibly a preemption and call to schedule
between syscall_trace and int_very_careful, so if it can be attached
at that point, then the first return will be missed.  However, I think
that ptrace attachment requires a stopped child, not just one that has
been preempted.

I don't see signal delivery between syscall_trace and
int_very_careful, so I don't see that there can be a ptrace attach
followed by int_very_careful missing the first return.

This is an RFC - if it turns out to be actually correct, some comments
need fixing before this goes anywhere.

UML works with this applied, and it doesn't seem to break
singlestepping, either on normal instructions or across system calls,
which looks like the next most vulnerable thing.

				Jeff


Index: linux-2.6.16.x86_64/arch/x86_64/kernel/entry.S
===================================================================
--- linux-2.6.16.x86_64.orig/arch/x86_64/kernel/entry.S
+++ linux-2.6.16.x86_64/arch/x86_64/kernel/entry.S
@@ -345,7 +345,7 @@ int_very_careful:
 	sti
 	SAVE_REST
 	/* Check for syscall exit trace */	
-	testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP),%edx
+	testl $(_TIF_SINGLESTEP),%edx
 	jz int_signal
 	pushq %rdi
 	CFI_ADJUST_CFA_OFFSET 8
@@ -353,7 +353,7 @@ int_very_careful:
 	call syscall_trace_leave
 	popq %rdi
 	CFI_ADJUST_CFA_OFFSET -8
-	andl $~(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP),%edi
+	andl $~(_TIF_SINGLESTEP),%edi
 	cli
 	jmp int_restore_rest
 	

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [discuss] [RFC] [PATCH] Double syscall exit traces on x86_64
  2006-05-26  3:24 [RFC] [PATCH] Double syscall exit traces on x86_64 Jeff Dike
@ 2006-05-26 10:36 ` Andi Kleen
  2006-05-26 14:13   ` Jeff Dike
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2006-05-26 10:36 UTC (permalink / raw)
  To: discuss
  Cc: Jeff Dike, linux-kernel, User-mode-linux-devel, Steven James,
	Roland McGrath, Blaisorblade

On Friday 26 May 2006 05:24, Jeff Dike wrote:
> We are seeing double ptrace notifications of system call returns on recent
> x86_64 kernels.  This breaks UML and at least one other app.

I believe this patch is the correct fix. Can you confirm it works for you? 

-Andi

Don't do syscall exit tracing twice

int_ret_from_syscall already does syscall exit tracing, so 
no need to do it again in the caller.

This caused problems for UML and some other special programs doing
syscall interception.

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux/arch/x86_64/kernel/entry.S
===================================================================
--- linux.orig/arch/x86_64/kernel/entry.S
+++ linux/arch/x86_64/kernel/entry.S
@@ -282,12 +282,7 @@ tracesys:			 
 	ja  1f
 	movq %r10,%rcx	/* fixup for C */
 	call *sys_call_table(,%rax,8)
-	movq %rax,RAX-ARGOFFSET(%rsp)
-1:	SAVE_REST
-	movq %rsp,%rdi
-	call syscall_trace_leave
-	RESTORE_TOP_OF_STACK %rbx
-	RESTORE_REST
+1:	movq %rax,RAX-ARGOFFSET(%rsp)
 	/* Use IRET because user could have changed frame */
 	jmp int_ret_from_sys_call
 	CFI_ENDPROC

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [discuss] [RFC] [PATCH] Double syscall exit traces on x86_64
  2006-05-26 10:36 ` [discuss] " Andi Kleen
@ 2006-05-26 14:13   ` Jeff Dike
  2006-06-01 19:07     ` [uml-devel] " Blaisorblade
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Dike @ 2006-05-26 14:13 UTC (permalink / raw)
  To: Andi Kleen
  Cc: discuss, linux-kernel, User-mode-linux-devel, Steven James,
	Roland McGrath, Blaisorblade

On Fri, May 26, 2006 at 12:36:26PM +0200, Andi Kleen wrote:
> I believe this patch is the correct fix. Can you confirm it works for you? 

Looks good, thanks.

		Jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [uml-devel] Re: [discuss] [RFC] [PATCH] Double syscall exit traces on x86_64
  2006-05-26 14:13   ` Jeff Dike
@ 2006-06-01 19:07     ` Blaisorblade
  2006-06-02 15:13       ` [uml-devel] " Jeff Dike
  0 siblings, 1 reply; 7+ messages in thread
From: Blaisorblade @ 2006-06-01 19:07 UTC (permalink / raw)
  To: user-mode-linux-devel
  Cc: Jeff Dike, Andi Kleen, discuss, linux-kernel, Steven James,
	Roland McGrath

On Friday 26 May 2006 16:13, Jeff Dike wrote:
> On Fri, May 26, 2006 at 12:36:26PM +0200, Andi Kleen wrote:
> > I believe this patch is the correct fix. Can you confirm it works for
> > you?
>
> Looks good, thanks.
>
> 		Jeff
Sorry for the question, but has this been sent to -stable (since it's a 
-stable regression, it should be)? To 2.6.17 -git?

And have you tested it (somebody should have, but it's not sure)?
Sorry for not helping myself, I'll be back at work ASAP.
-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

Chiacchiera con i tuoi amici in tempo reale! 
 http://it.yahoo.com/mail_it/foot/*http://it.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [uml-devel] [discuss] [RFC] [PATCH] Double syscall exit traces on x86_64
  2006-06-01 19:07     ` [uml-devel] " Blaisorblade
@ 2006-06-02 15:13       ` Jeff Dike
  2006-06-02 15:38         ` Steven James
  2006-06-02 17:16         ` Blaisorblade
  0 siblings, 2 replies; 7+ messages in thread
From: Jeff Dike @ 2006-06-02 15:13 UTC (permalink / raw)
  To: Blaisorblade
  Cc: user-mode-linux-devel, discuss, Steven James, Jeff Dike,
	Andi Kleen, linux-kernel, Roland McGrath

On Thu, Jun 01, 2006 at 09:07:33PM +0200, Blaisorblade wrote:
> Sorry for the question, but has this been sent to -stable (since it's a 
> -stable regression, it should be)? To 2.6.17 -git?

It's in current git.

I'm having a hard time telling when the bug was introduced.  The git web
interface seems to be telling me that the double notification was around
since last year, which I don't believe, since I've run much more
recent x86_64 kernels.  If the bug existed before 2.6.16, then it's
fine -stable fodder.

> And have you tested it (somebody should have, but it's not sure)?

Yes, it's been continuously and happily running UML since Andi sent me
his patch.

				Jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [uml-devel] [discuss] [RFC] [PATCH] Double syscall exit traces on x86_64
  2006-06-02 15:13       ` [uml-devel] " Jeff Dike
@ 2006-06-02 15:38         ` Steven James
  2006-06-02 17:16         ` Blaisorblade
  1 sibling, 0 replies; 7+ messages in thread
From: Steven James @ 2006-06-02 15:38 UTC (permalink / raw)
  To: Jeff Dike
  Cc: Blaisorblade, user-mode-linux-devel, discuss, Andi Kleen,
	linux-kernel, Roland McGrath

On Fri, 2 Jun 2006, Jeff Dike wrote:

> On Thu, Jun 01, 2006 at 09:07:33PM +0200, Blaisorblade wrote:
>> Sorry for the question, but has this been sent to -stable (since it's a
>> -stable regression, it should be)? To 2.6.17 -git?
>
> It's in current git.
>
> I'm having a hard time telling when the bug was introduced.  The git web
> interface seems to be telling me that the double notification was around
> since last year, which I don't believe, since I've run much more
> recent x86_64 kernels.  If the bug existed before 2.6.16, then it's
> fine -stable fodder.

I know that the bug is NOT present in 2.6.15.7.

>
>> And have you tested it (somebody should have, but it's not sure)?
>
> Yes, it's been continuously and happily running UML since Andi sent me
> his patch.
>
> 				Jeff
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [uml-devel] [discuss] [RFC] [PATCH] Double syscall exit traces on x86_64
  2006-06-02 15:13       ` [uml-devel] " Jeff Dike
  2006-06-02 15:38         ` Steven James
@ 2006-06-02 17:16         ` Blaisorblade
  1 sibling, 0 replies; 7+ messages in thread
From: Blaisorblade @ 2006-06-02 17:16 UTC (permalink / raw)
  To: Jeff Dike
  Cc: user-mode-linux-devel, discuss, Steven James, Andi Kleen,
	linux-kernel, Roland McGrath

[-- Attachment #1: Type: text/plain, Size: 4243 bytes --]

On Friday 02 June 2006 17:13, Jeff Dike wrote:
> On Thu, Jun 01, 2006 at 09:07:33PM +0200, Blaisorblade wrote:
> > Sorry for the question, but has this been sent to -stable (since it's a
> > -stable regression, it should be)? To 2.6.17 -git?
>
> It's in current git.
The patch is likely ok for -stable, but below I express some doubts for 
inclusion in kernel tree - it probably isn't sufficient and maybe it's not 
the better fix.
> I'm having a hard time telling when the bug was introduced.  The git web
> interface seems to be telling me that the double notification was around
> since last year,

Likely you'll looking into the wrong place - int_ret_from_syscall (if this is 
the name) wasn't used till recently, ret_from_syscall was used. And it never 
does syscall exit tracing - it expects the caller to have switched away to 
the slow path.

There's a path in which it calls int_with_check, but it does "movl 
$_TIF_NEED_RESCHED,%edi", so that int_with_check only tests _TIF_NEED_RESCHED 
and no other flag.

However, there's another potential problem in current code, it seems. 
ret_from_fork. It can jump to rff_trace and then again go to 
int_ret_from_sys_call, if _TIF_IA32 is set. The check for tracing/auditing 
should be moved to when we decide to jump to ret_from_sys_call.

I've tried doing that with the attached untested patch, but it's likely I've 
got something wrong - in particular the following pass of the patch wasn't 
easy to get right (this change is buried in the middle of other things).

+       testl $3,CS(%rsp)     # from kernel_thread?
        RESTORE_REST
-       testl $3,CS-ARGOFFSET(%rsp)     # from kernel_thread?

However, I have a bigger doubt and a proposal for a simpler change. Do we need 
at all to test for syscall tracing in int_ret_from_sys_call? And why is there 
a difference with ret_from_sys_call?

Instead of removing the test from tracesys, could we remove it from 
int_ret_from_sys_call (and no, calling int_with_check with a custom %edi 
doesn't work)?

Can syscall tracing be needed on exit and not on entry?

I've the suspect that either:
a) if some other process kicks in, we'll make him trace from next syscall 
start onwards (and int_ret_from_sys_call is buggy) - this makes much more 
sense and works almost always.
b) or if we want tracer to see syscall exit, it only works in 
int_ret_from_sys_call (rare cases).

I've thought to how a process can kick in and answers were:

*) schedule() to a process which does a ptrace attach
*) the syscall is ptrace(PTRACE_TRACEME)

in both cases the ptracer must do PTRACE_SYSCALL, and in both cases we need 
behaviour a).

For singlestep on syscall exit I'm totally unsure. But I've not the time to 
look more this code - I didn't know it and it's very intricate.

> which I don't believe, since I've run much more 
> recent x86_64 kernels.  If the bug existed before 2.6.16, then it's
> fine -stable fodder.
Can you handle sending it to stable@kernel.org?

First introduced in 2.6.16.5 as first diagnosed by Chuck Ebbert:

commit 6b12095a4a0e1f21bbf83f95f13299ca99d758fe
tree 5d2a3d96f7b99a3a225c0f7a110c6631848524b0
parent 59b2832a31ae2f3279bb5b16ae9b1c4e38e40dea
author Andi Kleen <ak@suse.de> Wed, 12 Apr 2006 08:19:29 +0200
committer Greg Kroah-Hartman <gregkh@suse.de> Wed, 12 Apr 2006 13:06:54 -0700

    [PATCH] x86_64: When user could have changed RIP always force IRET 
(CVE-2006-0744)

    Intel EM64T CPUs handle uncanonical return addresses differently from
    AMD CPUs.

    The exception is reported in the SYSRET, not the next instruction.
    Thgis leads to the kernel exception handler running on the user stack
    with the wrong GS because the kernel didn't expect exceptions on this
    instruction.

    This version of the patch has the teething problems that plagued an
    earlier version fixed.

    This is CVE-2006-0744

    Thanks to Ernie Petrides and Asit B. Mallick for analysis and initial
    patches.

    Signed-off-by: Andi Kleen <ak@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

[-- Attachment #2: x86_64-fix-ret_from_fork --]
[-- Type: text/x-diff, Size: 1507 bytes --]

Index: linux-2.6.16/arch/x86_64/kernel/entry.S
===================================================================
--- linux-2.6.16.orig/arch/x86_64/kernel/entry.S
+++ linux-2.6.16/arch/x86_64/kernel/entry.S
@@ -137,15 +137,17 @@
 ENTRY(ret_from_fork)
 	CFI_DEFAULT_STACK
 	call schedule_tail
+
+	testl $3,CS(%rsp)	# from kernel_thread?
+	je   rff_int_path
 	GET_THREAD_INFO(%rcx)
+	testl $_TIF_IA32,threadinfo_flags(%rcx)
+	jnz  rff_int_path
+
 	testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),threadinfo_flags(%rcx)
 	jnz rff_trace
-rff_action:	
+rff_action:
 	RESTORE_REST
-	testl $3,CS-ARGOFFSET(%rsp)	# from kernel_thread?
-	je   int_ret_from_sys_call
-	testl $_TIF_IA32,threadinfo_flags(%rcx)
-	jnz  int_ret_from_sys_call
 	RESTORE_TOP_OF_STACK %rdi,ARGOFFSET
 	jmp ret_from_sys_call
 rff_trace:
@@ -154,6 +156,9 @@ rff_trace:
 	GET_THREAD_INFO(%rcx)	
 	jmp rff_action
 	CFI_ENDPROC
+rff_int_path:
+	RESTORE_REST
+	jmp  int_ret_from_sys_call
 
 /*
  * System call entry. Upto 6 arguments in registers are supported.
@@ -211,6 +216,7 @@ ENTRY(system_call)
 /*
  * Syscall return path ending with SYSRET (fast path)
  * Has incomplete stack frame and undefined top of stack. 
+ * Will not test _TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP
  */		
 	.globl ret_from_sys_call
 ret_from_sys_call:
@@ -316,6 +322,7 @@ ENTRY(int_ret_from_sys_call)
 	testl $3,CS-ARGOFFSET(%rsp)
 	je retint_restore_args
 	movl $_TIF_ALLWORK_MASK,%edi
+
 	/* edi:	mask to check */
 int_with_check:
 	GET_THREAD_INFO(%rcx)

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2006-06-02 18:17 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-26  3:24 [RFC] [PATCH] Double syscall exit traces on x86_64 Jeff Dike
2006-05-26 10:36 ` [discuss] " Andi Kleen
2006-05-26 14:13   ` Jeff Dike
2006-06-01 19:07     ` [uml-devel] " Blaisorblade
2006-06-02 15:13       ` [uml-devel] " Jeff Dike
2006-06-02 15:38         ` Steven James
2006-06-02 17:16         ` Blaisorblade

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox