From: Stephane Eranian <eranian@hpl.hp.com>
To: linux-ia64@vger.kernel.org
Subject: BUG: 2.6.8/2.6.9 register corruption with PTRACE_SYSCALL
Date: Mon, 13 Sep 2004 12:26:56 +0000
Message-ID: <20040913122656.GC30808@frankl.hpl.hp.com>
To all,
David and I have tracked down a very nasty bug in the 2.6.8 and higher versions
of the Linux/ia64 kernel. The bug turned out to be due to the compiler. Here is
the description of the problem.
What is affected:
-----------------
- all usage of the PTRACE_SYSCALL facility, such as done by the strace tool.
Which kernel versions:
----------------------
- 2.6.8 and higher with CONFIG_AUDIT turned off
Symptoms:
---------
A program run under strace dies with SIGSEGV whereas it works
perfectly when run by itself.
The traced program would die upon return from system calls such
as brk() or pipe().
Which system call is affected depends on the version of libc and
whether the program is linked statically or shared. Some older libc
stubs may unintentionally mask the problem.
Why is that happening?
----------------------
When a program is traced with PTRACE_SYSCALL, a stacked register
corruption occurs on the parameters to the system call.
When returning from the system call some of the parameters to the
system call may be re-used. The kernel normally guarantees that
the parameters are preserved through the call. Because of the bug,
the guarantee is broken and r32 (in0) or other stacked registers may
contain bogus values.
If the libc stub happens to use the parameters upon return from the
system call, it may fail. This is the case, for instance, with
pipe(), where the 2 file descriptors are returned in registers
and libc copies them into the array using the address in r32.
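
As a rough illustration of the pipe() case, here is a small user-space
model in C. It is only a sketch: struct demo_regs, demo_kernel_pipe() and
demo_pipe_stub() are hypothetical names, the descriptor values are made
up, and the real stub is ia64 assembly in libc, not this code.

```c
#include <stddef.h>

/* Hypothetical model of the register state seen by the libc stub. */
struct demo_regs {
    int *in0;   /* stands in for r32: user pointer to int[2] */
    int  ret0;  /* first descriptor, returned in a register */
    int  ret1;  /* second descriptor, returned in a register */
};

/* Model of the kernel side: fills the return registers and, when
 * 'clobber' is nonzero, overwrites in0 the way the bug does. */
void demo_kernel_pipe(struct demo_regs *regs, int clobber)
{
    regs->ret0 = 3;             /* made-up descriptor numbers */
    regs->ret1 = 4;
    if (clobber)
        regs->in0 = NULL;       /* bogus value left in the stacked register */
}

/* Model of the stub: returns 0 on success, -1 where the real stub
 * would store through a bad address and take a SIGSEGV. */
int demo_pipe_stub(int fds[2], int clobber)
{
    struct demo_regs regs = { fds, 0, 0 };

    demo_kernel_pipe(&regs, clobber);
    if (regs.in0 != fds)
        return -1;              /* in0 no longer holds the caller's pointer */
    regs.in0[0] = regs.ret0;    /* copy descriptors using the address in in0 */
    regs.in0[1] = regs.ret1;
    return 0;
}
```

The model checks the pointer instead of dereferencing it so that it can
be run safely; the real stub has no such check and simply faults.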
The corruption comes from the fact that the parameters to
the syscall are not preserved. Note that those parameters are
passed directly in registers without any copy. They must be
preserved such that the system call may be restarted with its
initial parameters when needed.
The constraint is enforced by a special function attribute
called syscall_linkage. In the kernel it is used via the
"asmlinkage" macro. When the compiler sees the attribute,
it treats all parameters to the function as read-only. Any
modification requires making a copy first.
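
For readers who have not seen the attribute, a minimal sketch of how it
is attached to a function follows. SYSCALL_LINKAGE and demo_round_up()
are hypothetical names used only for illustration; the guard is there
because the attribute is specific to GCC on ia64. The explicit local
copy only mirrors what the compiler does on its own for such functions.

```c
/* Hypothetical wrapper macro; only syscall_linkage itself is real. */
#if defined(__ia64__) && defined(__GNUC__)
#define SYSCALL_LINKAGE __attribute__((syscall_linkage))
#else
#define SYSCALL_LINKAGE /* attribute is only available with GCC on ia64 */
#endif

/* The incoming argument must survive the call untouched, so any
 * modification is done on a copy, never on 'len' itself. */
SYSCALL_LINKAGE long demo_round_up(long len)
{
    long rounded = len;             /* copy of the parameter */

    rounded = (rounded + 7) & ~7L;  /* round up to a multiple of 8 */
    return rounded;
}
```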
In 2.6.8 new auditing code has been added to the kernel
including on the PTRACE_SYSCALL path. The call path to
the syscall_trace() function in ia64/kernel/ptrace.c has
been modified and two new functions syscall_trace_enter()
and syscall_trace_leave() have been added. Both functions
do have the asmlinkage macro because they are directly exposed
to the user level system call parameters.
When the auditing system is not configured, both enter and
leave functions are very simple and boil down to calling
the old syscall_trace() function. This function has lost its
syscall_linkage attribute because it is, in theory, never directly
exposed to the user level syscall parameters anymore. This function
has no parameter but it uses the stacked registers for locals.
The problem is that the compiler performs a sibling call
optimization between syscall_trace_leave() and syscall_trace()
because syscall_trace() is at the very end of the function.
That means that syscall_trace_leave() directly branches to
syscall_trace() using a br.many instead of the typical br.call.
This is normally perfectly legal because the stacked registers of
syscall_trace_leave() are considered "dead": we are at the very
end of the function and it has no return value.
Then syscall_trace() returns to the parent of syscall_trace_leave()
directly. With this optimization you save a br.ret.
The br.many does not cause any RSE activity, hence the user level
syscall parameters are now directly exposed to syscall_trace() which
rightfully modifies them thereby corrupting the registers for the libc
stub. The alloc instruction in that function simply resizes the frame
and that does not protect the syscall parameters.
The bug is that the compiler performs the sibling call
optimization and breaks the guarantee offered by the syscall_linkage
attribute.
For such a function, the compiler should not attempt
the optimization because it cannot guarantee that the callee
does not modify the registers.
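
The call shape described above can be sketched in C as follows. This is
not the kernel source: the demo_ names, the parameter list and the
counter are assumptions made for illustration, and SYSCALL_LINKAGE is a
hypothetical wrapper that expands to nothing outside GCC on ia64.

```c
#if defined(__ia64__) && defined(__GNUC__)
#define SYSCALL_LINKAGE __attribute__((syscall_linkage))
#else
#define SYSCALL_LINKAGE /* attribute is only available with GCC on ia64 */
#endif

int demo_trace_calls;   /* hypothetical counter, for demonstration only */

/* No syscall_linkage here: the compiler is free to use the stacked
 * registers of this function for its locals. */
void demo_syscall_trace(void)
{
    demo_trace_calls++;
}

/* Directly exposed to the user level syscall parameters, hence the
 * attribute. */
SYSCALL_LINKAGE void demo_syscall_trace_leave(long arg0, long arg1)
{
    (void)arg0;
    (void)arg1;
    /* Last statement and no return value: this call is in tail
     * position, so the compiler may turn it into a plain branch
     * (sibling call) unless -fno-optimize-sibling-calls is given. */
    demo_syscall_trace();
}
```

With the optimization applied, demo_syscall_trace() runs in the frame of
its caller, which is exactly where the preserved parameters live.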
How to fix the problem?
-----------------------
The kernel must be compiled with sibling call optimization turned off.
This is accomplished by adding -fno-optimize-sibling-calls to the
CFLAGS in arch/ia64/Makefile.
A bug has been filed for gcc. A patch for the Makefile has been submitted
to Tony Luck.
--
-Stephane