From: Denys Vlasenko <dvlasenk@redhat.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: linux-kernel@vger.kernel.org, Andi Kleen <andi@firstfloor.org>,
"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH] x86: make PTRACE_GETREGSET return 32-bit regs if 64-bit process entered kernel with int 80
Date: Thu, 14 Feb 2013 17:26:50 +0100
Message-ID: <511D104A.1040204@redhat.com>
In-Reply-To: <20130214150046.GA30543@redhat.com>
On 02/14/2013 04:00 PM, Oleg Nesterov wrote:
> On 02/14, Denys Vlasenko wrote:
>> This patch makes it so that in syscall-entry-stop caused by
>> "int 80" instruction, PTRACE_GETREGSET returns 32-bit regset.
>
> Not sure...
>
> First of all, this is incompatible change. And to me, it doesn't look
> correct anyway. Say, why the debugger can't modify r15 if a 64bit tracee
> does int80?
On x86_64, PTRACE_GET/SETREGS can be used for this: they always operate
on 64-bit registers.
> Or think about PTRACE_EVENT_FORK which can be reported with
> TS_COMPAT set.
I don't see a problem. Yes, PTRACE_GETREGSET will return
32-bit regset in this ptrace-stop, which is a problem... why?
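As a rough illustration of how a tracer could cope with either regset: PTRACE_GETREGSET stores the number of bytes it wrote back into iov.iov_len, and the two x86 general-purpose layouts differ in size (17 32-bit slots for i386, 27 64-bit slots for x86_64). The helper name classify_regset() and the enum below are made up for this sketch, and it assumes the tracer passed a buffer large enough for the 64-bit layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Sizes of the two x86 general-purpose regsets:
 * 17 32-bit slots for i386, 27 64-bit slots for x86_64. */
#define I386_REGSET_SIZE   (17 * sizeof(uint32_t))   /* 68 bytes */
#define X86_64_REGSET_SIZE (27 * sizeof(uint64_t))   /* 216 bytes */

/* Hypothetical classification, not a kernel or libc type. */
enum regset_kind { REGSET_UNKNOWN, REGSET_I386, REGSET_X86_64 };

/* Hypothetical helper: decide which layout PTRACE_GETREGSET(NT_PRSTATUS)
 * filled in, based on the iov_len the kernel stored back into the iovec. */
static enum regset_kind classify_regset(const struct iovec *iov)
{
    assert(iov != NULL);
    if (iov->iov_len == I386_REGSET_SIZE)
        return REGSET_I386;
    if (iov->iov_len == X86_64_REGSET_SIZE)
        return REGSET_X86_64;
    return REGSET_UNKNOWN;
}
```

A tracer would call this right after a successful ptrace(PTRACE_GETREGSET, pid, NT_PRSTATUS, &iov) and pick its decoding path from the result.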
> Probably is_ia32_task() should be reported "explicitly" as we already
> discussed, and afaik you have other ideas.
Yes, there are a few ideas. Say, a new ptrace op
can be introduced to return a vector of longs.
To make it easily parsable, how about (type,len,data...)
records? This also may allow tracer to indicate which records
it wants: for example, not everyone wants to read syscall params.
Maybe something like
ptrace(PTRACE_GETSYSCALL, pid, list_of_elements_I_want, &iov)
where list_of_elements_I_want is long[], 0-terminated,
iov points to a buffer, and on return iov.len is updated
(a-la GETREGSET).
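To make the (type,len,data...) idea concrete, here is a minimal sketch of how a tracer might walk such a buffer. Nothing here exists in the kernel: PTRACE_GETSYSCALL is only a proposal, and the record type values, the assumption that len counts data words, and the name find_record() are all invented for illustration. Types start at 1 so that 0 stays free as the terminator of the request list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record types; the proposal does not define any values. */
enum {
    REC_FLAGS   = 1,   /* bit flags describing the stop */
    REC_SYSCALL = 2,   /* syscall number followed by its params */
};

/* Walk a vector of longs laid out as (type, len, data[len]) records and
 * return a pointer to the data of the first record of the wanted type.
 * Returns NULL if no such record exists or the buffer is malformed. */
static const long *find_record(const long *buf, size_t nwords,
                               long type, size_t *len_out)
{
    size_t pos = 0;

    assert(buf != NULL && len_out != NULL);
    while (nwords - pos >= 2) {
        long rec_type = buf[pos];
        long rec_len = buf[pos + 1];

        /* Reject negative lengths and records that run past the end. */
        if (rec_len < 0 || (size_t)rec_len > nwords - pos - 2)
            return NULL;
        if (rec_type == type) {
            *len_out = (size_t)rec_len;
            return &buf[pos + 2];
        }
        pos += 2 + (size_t)rec_len;   /* skip header and data */
    }
    return NULL;
}
```

Because every record carries its own length, a tracer can skip record types it does not understand, which is what would let new arch-specific records be added later.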
What problems can be solved here?
(1) syscall entry/exit discrimination etc. Say, a record
can contain bit flags, such as "it's a syscall entry",
"it's a syscall exit", "it's a group-stop" etc.
Currently, the tracer cannot distinguish syscall entry from
syscall exit by looking at the stop itself; it has to keep
track of their alternation.
(2) a record can supply arch-specific data, such as the x86-specific
problem I tried to address: "was it an int 80 syscall?". Variable-length
record format makes it easy to adapt to different archs' needs.
Alternatively, we can set aside a few bits in "bit flags record"
as arch dependent bits. Most arches need just a few bits.
(3) on syscall entry, a record can contain (up to) 7 words: syscall_no
and 0-6 params, making tracer's code less architecture dependent.
Today in strace, *every* architecture needs to have arch-dependent
regs-to-params conversion code. I would like to be able
to code it in C with the same code for most arches.
(4) We can read structs/data pointed to by syscall params, such as
struct stat returned by fstat, without needing additional
round-trip to kernel, *and* with kernel-supplied information
on structure's size. Currently, strace has to know the size correctly.
There were, and will be, bugs in strace where we mishandle
structures because we mis-detect process' bitness, and use wrong
struct stat layouts. If the kernel were able to tell us:
"I returned a 78-byte structure in memory pointed to by arg1",
it would help a lot. Even if it wouldn't return the result
structure itself (I imagine it's a lot of work in kernel
to access it in other process vm), knowing its size
will still be a big help.
(5) We can read several regsets: general-purpose regs, SSE regs, etc.
Maybe someone would find it useful? (strace doesn't need this).
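For point (3), this is roughly the kind of per-arch regs-to-params code a tracer has to carry today, shown for x86 only. On x86_64 the syscall instruction takes its arguments in rdi, rsi, rdx, r10, r8, r9, while int 0x80 uses the i386 convention of ebx, ecx, edx, esi, edi, ebp. The struct gp_regs type, struct syscall_info and regs_to_syscall() are illustrative names for this sketch, not the kernel's user_regs_struct or any existing API, and the via_int80 flag is exactly the piece of information the tracer currently has no clean way to obtain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset of the x86_64 register file (hypothetical type). */
struct gp_regs {
    uint64_t orig_rax;
    uint64_t rdi, rsi, rdx, r10, r8, r9;   /* used by the syscall insn */
    uint64_t rbx, rcx, rbp;                /* additionally used by int 0x80 */
};

/* Arch-independent view the tracer would like to work with. */
struct syscall_info {
    uint64_t nr;        /* syscall number (i386 and x86_64 tables differ) */
    uint64_t args[6];
};

/* Convert registers to syscall number and params. via_int80 must be
 * supplied by the caller; it cannot be derived from the registers. */
static void regs_to_syscall(const struct gp_regs *r, int via_int80,
                            struct syscall_info *out)
{
    assert(r != NULL && out != NULL);
    out->nr = r->orig_rax;
    if (via_int80) {
        /* i386 convention: only the low 32 bits are meaningful. */
        const uint64_t a[6] = { r->rbx, r->rcx, r->rdx,
                                r->rsi, r->rdi, r->rbp };
        for (int i = 0; i < 6; i++)
            out->args[i] = (uint32_t)a[i];
    } else {
        const uint64_t a[6] = { r->rdi, r->rsi, r->rdx,
                                r->r10, r->r8, r->r9 };
        for (int i = 0; i < 6; i++)
            out->args[i] = a[i];
    }
}
```

With a kernel-supplied record of syscall_no plus params, this whole function, and its equivalent for every other architecture, would not be needed in the tracer.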
How does this look?
I propose to start small, by implementing just (1); (2) in the form of
arch bits as part of (1); and (3). That will satisfy the needs I tried
to address in my patch.
--
vda