* BUG: 2.6.8/2.6.9 register corruption with PTRACE_SYSCALL
@ 2004-09-13 12:26 Stephane Eranian
2004-09-13 18:27 ` dann frazier
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Stephane Eranian @ 2004-09-13 12:26 UTC (permalink / raw)
To: linux-ia64
To all,
David and I have tracked down a very nasty bug in the 2.6.8 and higher versions
of the Linux/ia64 kernel. The bug turned out to be due to the compiler. Here is
the description of the problem.
What is affected:
-----------------
- all usage of the PTRACE_SYSCALL facility, such as done by the strace tool.
Which kernel versions:
----------------------
- 2.6.8 and higher with CONFIG_AUDIT turned off
Symptoms:
---------
A program run under strace dies with SIGSEGV whereas it works
perfectly when run by itself.
The traced program would die upon return from system calls such
as brk() or pipe().
Which system call is affected depends on the version of libc and
whether the program is linked statically or shared. Some older libc
stubs may mask the problem unintentionally.
Why is that happening?
----------------------
When a program is traced with PTRACE_SYSCALL, a stacked register
corruption occurs on the parameters to the system call.
When returning from the system call some of the parameters to the
system call may be re-used. The kernel normally guarantees that
the parameters are preserved through the call. Because of the bug,
the guarantee is broken and r32 (in0) or other stacked registers may
contain bogus values.
If the libc stub happens to use the parameters upon return from the
system call, it may fail. This is the case, for instance, with
pipe(), where the 2 file descriptors are returned in registers
and libc copies them into the array using the address in r32.
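To illustrate the failure mode, here is a minimal user-space simulation in C.
It is only a sketch: the real libc stub is ia64 assembly, and the names
fake_regs, fake_kernel_pipe and pipe_stub are hypothetical, as are the
descriptor values 3 and 4. The field r32 stands in for in0, and ret0/ret1
for the two registers carrying the returned descriptors.

```c
#include <stdint.h>

/* Hypothetical stand-in for the registers the stub cares about. */
struct fake_regs {
	uintptr_t r32;   /* in0: address of the caller's int[2] array */
	int ret0, ret1;  /* the two descriptors returned in registers */
};

/* Pretend kernel: returns two descriptors, optionally clobbering in0. */
static void fake_kernel_pipe(struct fake_regs *regs, int corrupt)
{
	regs->ret0 = 3;
	regs->ret1 = 4;
	if (corrupt)
		regs->r32 = 0;	/* bogus value left in in0, as with the bug */
}

/* Pretend stub: after the "syscall", stores the results through in0. */
static int pipe_stub(int fds[2], int corrupt)
{
	struct fake_regs regs = { .r32 = (uintptr_t)fds };

	fake_kernel_pipe(&regs, corrupt);

	int *dst = (int *)regs.r32;	/* re-uses the parameter after return */
	if (dst != fds)
		return -1;	/* a real stub would store through the bad pointer and fault */
	dst[0] = regs.ret0;
	dst[1] = regs.ret1;
	return 0;
}
```

With corrupt set to 0 the array is filled in; with corrupt set to 1 the
simulated stub detects the bad pointer, which is the point where the real
stub would take the SIGSEGV.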
The corruption comes from the fact that the parameters to
the syscall are not preserved. Note that those parameters are
passed directly in registers without any copy. They must be
preserved such that the system call may be restarted with its
initial parameters when needed.
The constraint is enforced by a special function attribute
called syscall_linkage. In the kernel it is used via the
"asmlinkage" macro. When the compiler sees the attribute,
it treats all parameters to the function as read-only. Any
modification requires making a copy first.
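As a rough sketch of how the attribute is applied, assuming a GCC that knows
syscall_linkage (it is an ia64-specific attribute, so the guard below makes
it expand to nothing elsewhere). The macro my_asmlinkage and the function
sys_example are hypothetical names, not the kernel's own definitions.

```c
/* Hypothetical macro modelled on the kernel's "asmlinkage"; the
 * attribute is ia64-specific, so it expands to nothing elsewhere. */
#ifdef __ia64__
# define my_asmlinkage __attribute__((syscall_linkage))
#else
# define my_asmlinkage
#endif

/* Hypothetical syscall-style entry point.  With the attribute, the
 * compiler has to leave the incoming argument registers intact.  The
 * explicit copy below only mirrors what the compiler must do on its
 * own before changing a parameter. */
my_asmlinkage long sys_example(long addr, long len)
{
	long rounded = len;	/* copy first, then modify the copy */

	rounded = (rounded + 7) & ~7L;	/* round up to a multiple of 8 */
	return addr + rounded;
}
```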
In 2.6.8 new auditing code has been added to the kernel
including on the PTRACE_SYSCALL path. The call path to
the syscall_trace() function in ia64/kernel/ptrace.c has
been modified and two new functions syscall_trace_enter()
and syscall_trace_leave() have been added. Both functions
do have the asmlinkage macro because they are directly exposed
to the user level system call parameters.
When the auditing system is not configured, both enter and
leave functions are very simple and boil down to calling
the old syscall_trace() function. This function has lost its
syscall_linkage attribute because it is, in theory, never directly
exposed to the user level syscall parameters anymore. This function
has no parameter but it uses the stacked registers for locals.
The problem is that the compiler performs a sibling call
optimization between syscall_trace_leave() and syscall_trace()
because syscall_trace() is at the very end of the function.
That means that syscall_trace_leave() directly branches to
syscall_trace() using a br.many instead of the typical br.call.
This is perfectly legal because the stacked registers of
syscall_trace_leave() are now considered "dead" because we are
at the very end of the function and it has no return value.
Then syscall_trace() returns to the parent of syscall_trace_leave()
directly. With this optimization you save a br.ret.
The br.many does not cause any RSE activity, hence the user level
syscall parameters are now directly exposed to syscall_trace() which
rightfully modifies them thereby corrupting the registers for the libc
stub. The alloc instruction in that function simply resizes the frame
and that does not protect the syscall parameters.
The bug is that the compiler performs the sibling call
optimization and breaks the guarantee offered by the syscall_linkage
attribute.
For such a function, the compiler should not attempt
the optimization because it cannot guarantee that the callee
does not modify the registers.
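The call shape that triggers the optimization can be sketched in plain C.
This is not the kernel source: trace_sketch and trace_leave_sketch are
hypothetical stand-ins for syscall_trace() and syscall_trace_leave(), and
the counter exists only to make the sketch checkable.

```c
static int trace_calls;	/* counts calls, only so the sketch is testable */

/* Stand-in for syscall_trace(): no parameters, free to use its
 * registers for locals.  noinline keeps the call visible. */
__attribute__((noinline)) static void trace_sketch(void)
{
	trace_calls++;
}

/* Stand-in for syscall_trace_leave() with auditing configured out.
 * The call to trace_sketch() is the last thing the function does and
 * there is no return value, so GCC with -foptimize-sibling-calls
 * (enabled at -O2) may emit a plain branch instead of a call. */
__attribute__((noinline)) void trace_leave_sketch(long arg0, long arg1, int traced)
{
	(void)arg0;
	(void)arg1;
	if (traced)
		trace_sketch();	/* tail position: sibling-call candidate */
}
```

Comparing the assembly from "gcc -O2 -S" with and without
-fno-optimize-sibling-calls should show whether the compiler turned the
final call into a branch on a given target.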
How to fix the problem?
-----------------------
The kernel must be compiled with sibling call optimization turned off.
This is accomplished by adding -fno-optimize-sibling-calls to the
CFLAGS in arch/ia64/Makefile.
A bug has been filed for gcc. A patch for the Makefile has been submitted
to Tony Luck.
--
-Stephane
* Re: BUG: 2.6.8/2.6.9 register corruption with PTRACE_SYSCALL
From: dann frazier @ 2004-09-13 18:27 UTC (permalink / raw)
To: linux-ia64
On Mon, Sep 13, 2004 at 05:26:56AM -0700, Stephane Eranian wrote:
> To all,
>
>
> David and I have tracked down a very nasty bug in the 2.6.8 and higher versions
> of the Linux/ia64 kernel. The bug turned out to be due to the compiler. Here is
> the description of the problem.
Thanks for tracking this down. If I'm affected, should I expect something
like ls to SEGV when run under strace? If so, I'm not seeing this problem with
the Debian kernels. They are built w/ gcc 3.3.4 and have CONFIG_AUDIT
disabled. I'm running Debian/sid which uses libc6.1 2.3.2.ds1-16.
--
---------------------------
dann frazier
Hewlett-Packard
Linux and Open Source Lab
dannf@hp.com
* RE: BUG: 2.6.8/2.6.9 register corruption with PTRACE_SYSCALL
From: Luck, Tony @ 2004-09-13 19:09 UTC (permalink / raw)
To: linux-ia64
>Thanks for tracking this down. If I'm affected, should I
>expect something like ls to SEGV when run under strace?
>If so, I'm not seeing this problem with the Debian kernels.
>They are built w/ gcc 3.3.4 and have CONFIG_AUDIT disabled.
>I'm running Debian/sid which uses libc6.1 2.3.2.ds1-16.
The test program that Stephane showed to me just did:
int pv[2];
pipe(pv);
printf("%d %d\n", pv[0], pv[1]);
This coredumps when run under strace on 2.6.8 or later kernel.
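For anyone wanting to try this, a self-contained variant of that fragment
might look like the following. It is a sketch only: the function name
pipe_check, the error handling and the close() calls are additions that
were not part of the test program described above.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns 0 if pipe() filled in two distinct valid descriptors,
 * -1 otherwise.  On an affected kernel the process is expected to
 * die inside pipe() when traced, before anything is printed. */
int pipe_check(void)
{
	int pv[2];

	if (pipe(pv) != 0) {
		perror("pipe");
		return -1;
	}
	printf("%d %d\n", pv[0], pv[1]);
	close(pv[0]);
	close(pv[1]);
	return (pv[0] >= 0 && pv[1] >= 0 && pv[0] != pv[1]) ? 0 : -1;
}
```

Call pipe_check() from main() and run the resulting binary once directly
and once under strace to compare the two behaviours.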
The fix is to add -fno-optimize-sibling-calls to cflags-y
in arch/ia64/Makefile. I just put it into my tree.
-Tony
* Re: BUG: 2.6.8/2.6.9 register corruption with PTRACE_SYSCALL
From: Peter Chubb @ 2004-09-16 2:03 UTC (permalink / raw)
To: linux-ia64
>>>>> "Stephane" = Stephane Eranian <eranian@hpl.hp.com> writes:
Stephane> To all, David and I have tracked down a very nasty bug in
Stephane> the 2.6.8 and higher versions of the Linux/ia64 kernel. The
Stephane> bug turned out to be due to the compiler. Here is the
Stephane> description of the problem.
Thank you, thank you, thank you... this also fixes a very similar problem
on ARM.
--
Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au
The technical we do immediately, the political takes *forever*