linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Denys Vlasenko <vda.linux@googlemail.com>
To: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
	Jim Keniston <jkenisto@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	Anton Arapov <aarapov@redhat.com>,
	David Long <dave.long@linaro.org>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	Jonathan Lebon <jlebon@redhat.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 4/6] uprobes/x86: Emulate rip-relative call's
Date: Thu, 10 Apr 2014 19:02:38 +0200	[thread overview]
Message-ID: <201404101902.38639.vda.linux@googlemail.com> (raw)
In-Reply-To: <5346AB07.4090909@redhat.com>

On Thursday 10 April 2014 16:30, Denys Vlasenko wrote:
> On 04/10/2014 04:18 PM, Oleg Nesterov wrote:
> > On 04/10, Denys Vlasenko wrote:
> >>
> >> There is this monstrosity, "16-bit override for branches" in 64-mode:
> >>
> >> 66 e8 nn nn       callw   <offset16>
> >>
> >> Nobody sane uses it because it truncates instruction pointer.
> >>
> >> Or rather, *I think* it should truncate it (i.e. zero-extend to full width),
> >> but conceivably some CPUs can be buggy wrt that:
> >> they can decide to modify only lower 16 bits of IP,
> >> or even they can decided to use signed <offset16> but apply it
> >> to full-width RIP.
> >>
> >> AMD manuals are not clear on what exactly should happen.
> >>
> >> I am sure no one sane uses this form of branch instructions
> >> in 32-bit and 64-bit code.
> >>
> >> I don't think we should be trying to support it "correctly"
> >> (we can just let program crash with SIGILL or something),
> >> we only need to make sure we don't overlook its existence
> >> and thus are not tricked into touching or modifying unrelated data.
> > 
> > And after the quick check it seems that lib/insn.c doesn't parse
> > "66 e8 nn nn" correctly. It treats the next 2 bytes as the part
> > of 32bit offset.
> 
> I didn't run-test it yet. By code inspection, it seems to work...
> 
> x86-opcode-map.txt:
>     e8: CALL Jz (f64)
> 
> gen-insn-attr-x86.awk:
>     imm_flag["Jz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)"
> 
> 
> insn.c:
>         case INAT_IMM_VWORD32:
>                 if (!__get_immv32(insn))
>                         goto err_out;
> ...
> static int __get_immv32(struct insn *insn)
> {
>         switch (insn->opnd_bytes) {
>         case 2:
>                 insn->immediate.value = get_next(short, insn);
>                 insn->immediate.nbytes = 2;
>                 break;
>         case 4:
>         case 8:
>                 insn->immediate.value = get_next(int, insn);
>                 insn->immediate.nbytes = 4;
>                 break;
> 
> 
> ...until I notice this code:
> 
> void insn_get_modrm(struct insn *insn)
> {
> ...
>         if (insn->x86_64 && inat_is_force64(insn->attr))
>                 insn->opnd_bytes = 8;
> 
> 
> The (f64) modifier in x86-opcode-map.txt means that inat_is_force64()
> is true for call opcode. So we won't reach "case 2:" in __get_immv32():
> insn_get_prefixes() did set insn->opnd_bytes to 2 when it saw 0x66 prefix,
> but it was before we reach this place, and here we overrode it.
> This is a bug in insn decoder.

I tested it on both Intel and AMD CPUs and my worst fears came true:
this instruction has different widths on different CPUs.

This program:

# compile with: gcc -nostartfiles -nostdlib -o int3 int3.S
_start:         .globl  _start
                .byte 0x66,0xe9,0,0
                .byte 0,0
1: jmp 1b

compiles to this:

00000000004000b0 <_start>:
  4000b0:       66 e9 00 00             jmpw   b4 <_start-0x3ffffc>
  4000b4:       00 00                   add    %al,(%rax)
  4000b6:       eb fe                   jmp    4000b6 <_start+0x6>

and it will reach 0x4000b6 on Intel CPU.
IOW, Intel SandyBridge CPU thinks that insn is in fact 66 e9 00 00 00 00,
no RIP truncation occurs.

On AMD K10 CPU, the very same binary jumps to 0x00b4
and gets SIGSEGV with MAPERR.
AMD thinks that the insn is 66 e9 00 00 as shown above.

Thus, insn.c decoder implements Intel's idea of this insn
while binutils (objdump, gdb) implement AMD decode.

This same program can be compiled to 32-bit code,
in this mode both CPUs treat insn as 66 e9 00 00.

Oleg, I'm sure you are very sympathetic by now to the idea
of just not supporting this insn at all. ;)

You can check whether insn had any prefix by checking
insn->prefixes->nbytes != 0...

..but there is a problem with that. P4 introduced branch hints,
which are implemented using segment prefixes on conditional jumps.
Meaning that some compilers produce

2e 0f 82 nn nn nn nn

as   (hint not taken) JB <offset32>   insn.

2e is CS segment prefix. insn->prefixes->nbytes == 1 for this insn.
DS prefix (3e) hints that branch is taken.

They were nearly useless on P4 anyway and ignored by all CPUs
before or since, but they can be seen in some programs.

Looks like we'll need this:

/*
 * 16-bit overrides such as CALLW (66 e8 nn nn) are not supported.
 * Intel and AMD behavior differ in 64-bit mode: Intel ignores 66 prefix.
 * No one uses these insns.
 * To filter them out, reject any branch insns with prefixes...
 */
if (insn->prefixes->nbytes > 1)
	bail_out;
/*
 * ...Except a single 3e or 2e "branch taken/not taken" hint prefix.
 * These are (rarely) used, but ignored by any CPU except P4.
 * Example: 2e 0f 82 nn nn nn nn  is JB,PN <offset32>
 */
if (insn->prefixes->nbytes == 1 && (insn->prefixes->bytes[0] | 0x10) != 0x3e))
	bail_out;



-- 
vda

  reply	other threads:[~2014-04-10 17:03 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-04 18:50 [PATCH v2 0/9] uprobes/x86: preparations to fix the reprel jmp/call handling Oleg Nesterov
2014-04-04 18:51 ` [PATCH v2 1/9] uprobes: Kill UPROBE_SKIP_SSTEP and can_skip_sstep() Oleg Nesterov
2014-04-04 18:51 ` [PATCH v2 2/9] uprobes/x86: Fold prepare_fixups() into arch_uprobe_analyze_insn() Oleg Nesterov
2014-04-04 18:51 ` [PATCH v2 3/9] uprobes/x86: Kill the "ia32_compat" check in handle_riprel_insn(), remove "mm" arg Oleg Nesterov
2014-04-04 18:51 ` [PATCH v2 4/9] uprobes/x86: Gather "riprel" functions together Oleg Nesterov
2014-04-04 18:51 ` [PATCH v2 5/9] uprobes/x86: move the UPROBE_FIX_{RIP,IP,CALL} code at the end of pre/post hooks Oleg Nesterov
2014-04-04 18:51 ` [PATCH v2 6/9] uprobes/x86: Introduce uprobe_xol_ops and arch_uprobe->ops Oleg Nesterov
2014-04-08  9:10   ` Masami Hiramatsu
2014-04-08 16:10     ` Oleg Nesterov
2014-04-08 18:10       ` Oleg Nesterov
2014-04-09 12:58       ` Masami Hiramatsu
2014-04-09 16:55         ` Oleg Nesterov
2014-04-10 13:58           ` Masami Hiramatsu
2014-04-04 18:51 ` [PATCH v2 7/9] uprobes/x86: Conditionalize the usage of handle_riprel_insn() Oleg Nesterov
2014-04-04 18:51 ` [PATCH v2 8/9] uprobes/x86: Send SIGILL if arch_uprobe_post_xol() fails Oleg Nesterov
2014-04-04 18:51 ` [PATCH v2 9/9] uprobes/x86: Teach arch_uprobe_post_xol() to restart if possible Oleg Nesterov
2014-04-04 21:56   ` Jim Keniston
2014-04-05 12:46     ` Oleg Nesterov
2014-04-04 19:32 ` [PATCH v2 0/9] uprobes/x86: preparations to fix the reprel jmp/call handling Oleg Nesterov
2014-04-04 19:52   ` Oleg Nesterov
2014-04-04 23:44   ` Jim Keniston
2014-04-06 20:15     ` [RFC PATCH 0/6] uprobes/x86: " Oleg Nesterov
2014-04-06 20:16       ` [RFC PATCH 1/6] uprobes/x86: Emulate unconditional rip-relative jmp's Oleg Nesterov
2014-04-08 20:36         ` Jim Keniston
2014-04-09 14:47           ` Oleg Nesterov
2014-04-06 20:16       ` [RFC PATCH 2/6] uprobes/x86: Emulate nop's using ops->emulate() Oleg Nesterov
2014-04-06 20:16       ` [RFC PATCH 3/6] uprobes/x86: Introduce sizeof_long(), cleanup adjust_ret_addr() and arch_uretprobe_hijack_return_addr() Oleg Nesterov
2014-04-07 20:34         ` Jim Keniston
2014-04-07 20:42           ` Jim Keniston
2014-04-06 20:16       ` [RFC PATCH 4/6] uprobes/x86: Emulate rip-relative call's Oleg Nesterov
2014-04-08 22:26         ` Jim Keniston
2014-04-09 15:43           ` Oleg Nesterov
2014-04-09 21:25             ` Jim Keniston
2014-04-10  4:05               ` Jim Keniston
2014-04-10 13:41             ` Denys Vlasenko
2014-04-10 13:57               ` Masami Hiramatsu
2014-04-10 14:20                 ` Denys Vlasenko
2014-04-11  3:03                   ` Masami Hiramatsu
2014-04-11 12:23                     ` Denys Vlasenko
2014-04-14 14:22                       ` Masami Hiramatsu
2014-04-18 15:17                         ` Denys Vlasenko
2014-04-10 14:28                 ` Oleg Nesterov
2014-04-10 17:00                   ` Oleg Nesterov
2014-04-11  2:38                     ` Masami Hiramatsu
2014-04-11  1:29                   ` Masami Hiramatsu
2014-04-10 14:18               ` Oleg Nesterov
2014-04-10 14:30                 ` Denys Vlasenko
2014-04-10 17:02                   ` Denys Vlasenko [this message]
2014-04-14  5:14                     ` Masami Hiramatsu
2014-04-14 12:24                       ` Denys Vlasenko
2014-04-14 14:05                         ` Masami Hiramatsu
2014-04-06 20:16       ` [RFC PATCH 5/6] uprobes/x86: Emulate rip-relative conditional "short" jmp's Oleg Nesterov
2014-04-07 14:27         ` [RFC PATCH v2 " Oleg Nesterov
2014-04-07 16:41           ` Oleg Nesterov
2014-04-08 22:53           ` Jim Keniston
2014-04-09 16:42             ` Oleg Nesterov
2014-04-06 20:16       ` [RFC PATCH 6/6] uprobes/x86: Emulate rip-relative conditional "near" jmp's Oleg Nesterov
2014-04-07 14:28         ` Oleg Nesterov
2014-04-08 23:07           ` Jim Keniston
2014-04-09 16:50             ` Oleg Nesterov
2014-04-07 18:54       ` [RFC PATCH 0/6] uprobes/x86: fix the reprel jmp/call handling Jim Keniston
2014-04-08 11:43       ` Masami Hiramatsu
2014-04-08 16:28         ` Oleg Nesterov
2014-04-08 19:26           ` Oleg Nesterov
2014-04-09 19:44       ` [RFC PATCH v2 " Oleg Nesterov
2014-04-09 19:44         ` [RFC PATCH v2 1/6] uprobes/x86: Introduce sizeof_long(), cleanup adjust_ret_addr() and arch_uretprobe_hijack_return_addr() Oleg Nesterov
2014-04-09 19:44         ` [RFC PATCH v2 2/6] uprobes/x86: Emulate unconditional rip-relative jmp's Oleg Nesterov
2014-04-10 12:37           ` Denys Vlasenko
2014-04-10 13:47             ` Oleg Nesterov
2014-04-09 19:44         ` [RFC PATCH v2 3/6] uprobes/x86: Emulate nop's using ops->emulate() Oleg Nesterov
2014-04-09 19:44         ` [RFC PATCH v2 4/6] uprobes/x86: Emulate rip-relative call's Oleg Nesterov
2014-04-10 12:53           ` Denys Vlasenko
2014-04-10 13:15             ` Masami Hiramatsu
2014-04-10 13:41               ` Oleg Nesterov
2014-04-09 19:44         ` [RFC PATCH v2 5/6] uprobes/x86: Emulate rip-relative conditional "short" jmp's Oleg Nesterov
2014-04-09 19:44         ` [RFC PATCH v2 6/6] uprobes/x86: Emulate rip-relative conditional "near" jmp's Oleg Nesterov
2014-04-10 12:49           ` Denys Vlasenko
2014-04-09 21:34         ` [RFC PATCH v2 0/6] uprobes/x86: fix the reprel jmp/call handling Jim Keniston
2014-04-10 12:28         ` Denys Vlasenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201404101902.38639.vda.linux@googlemail.com \
    --to=vda.linux@googlemail.com \
    --cc=aarapov@redhat.com \
    --cc=ananth@in.ibm.com \
    --cc=dave.long@linaro.org \
    --cc=dvlasenk@redhat.com \
    --cc=fche@redhat.com \
    --cc=jkenisto@linux.vnet.ibm.com \
    --cc=jlebon@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mingo@elte.hu \
    --cc=oleg@redhat.com \
    --cc=srikar@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).