From: Masami Hiramatsu <mhiramat@redhat.com>
To: "H. Peter Anvin" <hpa@zytor.com>, Jim Keniston <jkenisto@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	Andi Kleen <andi@firstfloor.org>,
	kvm@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	systemtap-ml <systemtap@sources.redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Vegard Nossum <vegard.nossum@gmail.com>,
	Avi Kivity <avi@redhat.com>, Roland McGrath <roland@redhat.com>
Subject: [RFC] x86 instruction decoder with userspace test code
Date: Mon, 04 May 2009 12:38:47 -0400
Message-ID: <49FF1A17.5040706@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2306 bytes --]

Hi,

I've rewritten the x86(-64) instruction decoder to use an instruction
attribute table and a generator, according to Peter's comments.

Currently, the opcode map file (x86-opcode-map.txt) is based on the opcode
maps in the Intel(R) Software Developer's Manual Vol.2, Appendix A, and it
contains the following two types of opcode tables.

1-byte/2-byte/3-byte opcode tables, which have 256 elements each, are
written as below;
---
Table: table-name
Referrer: escaped-name
opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...]] [| 2nd-mnemonic ...]
 (or)
opcode: escape # escaped-name
EndTable
---

Group opcode tables, which have 8 elements each, are written as below;
---
GrpTable: GrpXXX
reg:  mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...]] [| 2nd-mnemonic ...]
EndTable
---
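
A group table has 8 entries because the entry is selected by the 3-bit
reg field (bits 5:3) of the ModRM byte, as REGBITS() in the attached
inat.c does. A minimal sketch of that index computation follows; the
names modrm_reg() and modrm_reg_example() are hypothetical and not part
of the attached files.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract the 3-bit reg field (bits 5:3) of a
 * ModRM byte. This value (0..7) is the "reg:" index of a GrpTable
 * entry. It mirrors REGBITS() in the attached inat.c.
 */
unsigned int modrm_reg(uint8_t modrm)
{
	return (modrm >> 3) & 0x7;
}

/* ModRM 0xd8 is 11 011 000 in binary: mod = 3, reg = 3, r/m = 0. */
void modrm_reg_example(void)
{
	assert(modrm_reg(0xd8) == 3);
}
```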

These opcode maps do NOT include SSE and most FP opcodes,
because those opcodes are not used in the kernel.

The generator (gen-insn-attr-x86.awk) translates the opcode maps
into a file which defines the instruction attribute tables. The instruction
attributes are defined in inat.h and inat.c.
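
As a rough illustration of the generator's output, the sketch below
hand-writes three entries in the shape that print_table() in the
attached awk script emits (a designated-initializer array indexed by
opcode). The macro values are copied from the attached inat.h with the
arguments parenthesized; the table name example_primary_table, the
choice of entries, and the helper example_attr() are hypothetical. A
real inat-tables.c is produced from x86-opcode-map.txt, not written by
hand, and may differ.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t insn_attr_t;

/* Bit layout copied from the attached inat.h. */
#define INAT_OPCODE_TABLE_SIZE	256
#define INAT_PFX_OPNDSZ		1	/* 0x66 */
#define INAT_IMM_BYTE		1
#define INAT_PFX_OFFS	0
#define INAT_PFX_BITS	4
#define INAT_PFX_MASK	(((1 << INAT_PFX_BITS) - 1) << INAT_PFX_OFFS)
#define INAT_ESC_OFFS	(INAT_PFX_OFFS + INAT_PFX_BITS)
#define INAT_ESC_BITS	2
#define INAT_GRP_OFFS	(INAT_ESC_OFFS + INAT_ESC_BITS)
#define INAT_GRP_BITS	5
#define INAT_IMM_OFFS	(INAT_GRP_OFFS + INAT_GRP_BITS)
#define INAT_IMM_BITS	3
#define INAT_IMM_MASK	(((1 << INAT_IMM_BITS) - 1) << INAT_IMM_OFFS)
#define INAT_FLAG_OFFS	(INAT_IMM_OFFS + INAT_IMM_BITS)
#define INAT_MODRM	(1 << (INAT_FLAG_OFFS + 1))
#define INAT_MAKE_PREFIX(pfx)	((pfx) << INAT_PFX_OFFS)
#define INAT_MAKE_IMM(imm)	((imm) << INAT_IMM_OFFS)

/* Hypothetical excerpt; unlisted opcodes default to 0 (no attributes). */
const insn_attr_t example_primary_table[INAT_OPCODE_TABLE_SIZE] = {
	[0x00] = INAT_MODRM,				/* ADD Eb,Gb */
	[0x04] = INAT_MAKE_IMM(INAT_IMM_BYTE),		/* ADD AL,Ib */
	[0x66] = INAT_MAKE_PREFIX(INAT_PFX_OPNDSZ),	/* Operand-Size (Prefix) */
};

/* Hypothetical lookup, analogous to inat_get_opcode_attribute(). */
insn_attr_t example_attr(unsigned int opcode)
{
	assert(opcode < INAT_OPCODE_TABLE_SIZE);
	return example_primary_table[opcode];
}
```

With this layout, an "Eb,Gb" operand list yields only INAT_MODRM, while
an "Ib" operand sets the immediate-size field to INAT_IMM_BYTE.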


I've attached the insn decoder with a user-space test, which was originally
written by Jim. You can check that the decoder decodes instruction lengths
correctly as follows:

> Pull all the attached files into a directory and have a go -- e.g.,
> $ make
> $ objdump -d vmlinux | awk -f distill.awk | ./test_get_len [x86_64]

Known issues:
- 0x9b is an instruction (fwait), but objdump treats it as a
  prefix.  For example, 9b df ... can be disassembled as
	fstsw ...	// wait, then store status word
  or
	fwait		// wait
	fnstsw ...	// store status word without waiting
  and this instruction decoder decodes 0x9b as an instruction.

  Anyway, according to Jim's investigation, single-stepping stopped
  after the fwait, so this is not a problem.

- Illegal instruction sequences (in some data/note sections), such
  as an x86_64 instruction that starts with 0x40, or a misplaced
  0x65 prefix. We can filter out those instructions, whose disassembly
  starts with "rex" or includes "(bad)".
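
The filtering mentioned in the last item could look like the sketch
below, which decides from the mnemonic column of a distilled objdump
line whether the line should be skipped. should_skip_insn() is a
hypothetical helper, not part of the attached test code, and the
spellings it matches ("rex" at the start, "(bad)" anywhere) are assumed
from the description above rather than checked against a particular
objdump version.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical filter: return true if the disassembled text looks like
 * one of the illegal sequences described above, i.e. it starts with
 * "rex" (a bare REX byte) or contains "(bad)".
 */
bool should_skip_insn(const char *mnemonic)
{
	assert(mnemonic != NULL);

	/* Skip leading blanks left over from the objdump columns. */
	while (*mnemonic == ' ' || *mnemonic == '\t')
		mnemonic++;

	if (strncmp(mnemonic, "rex", 3) == 0)
		return true;
	return strstr(mnemonic, "(bad)") != NULL;
}
```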


I'll put x86-opcode-map.txt under arch/x86/lib, gen-insn-attr-x86.awk
under arch/x86/scripts/ and generate attribute tables at build time.

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


[-- Attachment #2: Makefile --]
[-- Type: text/plain, Size: 328 bytes --]

test_get_len: test_get_len.c insn.c inat.c inat.h insn.h insn_x86_user.h inat-tables.c
	$(CC) -Wall -g test_get_len.c insn.c inat.c -o test_get_len

inat-tables.c: gen-insn-attr-x86.awk x86-opcode-map.txt
	awk -f gen-insn-attr-x86.awk x86-opcode-map.txt > $@

clean:
	rm -f *.o

clobber: clean
	rm -f test_get_len inat-tables.c

[-- Attachment #3: distill.awk --]
[-- Type: text/plain, Size: 714 bytes --]

# Usage: objdump -d a.out | awk -f distill.awk | ./test_get_len
# Distills the disassembly as follows:
# - Removes all lines except the disassembled instructions.
# - For instructions that exceed 1 line (7 bytes), crams all the hex bytes
# into a single line.

BEGIN {
	prev_addr = ""
	prev_hex = ""
	prev_mnemonic = ""
}

/^ *[0-9a-f]+:/ {
	if (split($0, field, "\t") < 3) {
		# This is a continuation of the same insn.
		prev_hex = prev_hex field[2]
	} else {
		if (prev_addr != "")
			printf "%s\t%s\t%s\n", prev_addr, prev_hex, prev_mnemonic
		prev_addr = field[1]
		prev_hex = field[2]
		prev_mnemonic = field[3]
	}
}

END {
	if (prev_addr != "")
		printf "%s\t%s\t%s\n", prev_addr, prev_hex, prev_mnemonic
}

[-- Attachment #4: gen-insn-attr-x86.awk --]
[-- Type: text/plain, Size: 7208 bytes --]

#!/bin/gawk -f

BEGIN {
	print "/* x86 opcode map generated from x86-opcode-map.txt */"
	print "/* Do not change this code. */"
	ggid = 1
	geid = 1

	opnd_expr = "^[A-Za-z]"
	ext_expr = "^\\("
	sep_expr = "^\\|$"
	group_expr = "^Grp[0-9]+A*"

	imm_expr = "^[IJAO][a-z]"
	imm_flag["Ib"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)"
	imm_flag["Jb"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)"
	imm_flag["Iw"] = "INAT_MAKE_IMM(INAT_IMM_WORD)"
	imm_flag["Id"] = "INAT_MAKE_IMM(INAT_IMM_DWORD)"
	imm_flag["Iq"] = "INAT_MAKE_IMM(INAT_IMM_QWORD)"
	imm_flag["Ap"] = "INAT_MAKE_IMM(INAT_IMM_PTR)"
	imm_flag["Iz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)"
	imm_flag["Jz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)"
	imm_flag["Iv"] = "INAT_MAKE_IMM(INAT_IMM_VWORD)"
	imm_flag["Ob"] = "INAT_MOFFSET"
	imm_flag["Ov"] = "INAT_MOFFSET"

	modrm_expr = "^([CDEGMNPQRSUVW][a-z]+|NTA|T[0-2])"
	force64_expr = "\\([df]64\\)"
	rex_expr = "^REX(\\.[XRWB]+)*"
	fpu_expr = "^ESC" # TODO

	lprefix1_expr = "\\(66\\)"
	delete lptable1
	lprefix2_expr = "\\(F2\\)"
	delete lptable2
	lprefix3_expr = "\\(F3\\)"
	delete lptable3
	max_lprefix = 4

	prefix_expr = "\\(Prefix\\)"
	prefix_num["Operand-Size"] = "INAT_PFX_OPNDSZ"
	prefix_num["REPNE"] = "INAT_PFX_REPNE"
	prefix_num["REP/REPE"] = "INAT_PFX_REPE"
	prefix_num["LOCK"] = "INAT_PFX_LOCK"
	prefix_num["SEG=CS"] = "INAT_PFX_CS"
	prefix_num["SEG=DS"] = "INAT_PFX_DS"
	prefix_num["SEG=ES"] = "INAT_PFX_ES"
	prefix_num["SEG=FS"] = "INAT_PFX_FS"
	prefix_num["SEG=GS"] = "INAT_PFX_GS"
	prefix_num["SEG=SS"] = "INAT_PFX_SS"
	prefix_num["Address-Size"] = "INAT_PFX_ADDRSZ"

	delete table
	delete etable
	delete gtable
	eid = -1
	gid = -1
}

function semantic_error(msg) {
	print "Semantic error at " NR ": " msg > "/dev/stderr"
	exit 1
}

function debug(msg) {
	print "DEBUG: " msg
}

function array_size(arr,   i,c) {
	c = 0
	for (i in arr)
		c++
	return c
}

/^Table:/ {
	print "/* " $0 " */"
}

/^Referrer:/ {
	if (NF == 1) {
		# primary opcode table
		tname = "inat_primary_table"
		eid = -1
	} else {
		# escape opcode table
		ref = ""
		for (i = 2; i <= NF; i++)
			ref = ref $i
		eid = escape[ref]
		tname = sprintf("inat_escape_table_%d", eid)
	}
}

/^GrpTable:/ {
	print "/* " $0 " */"
	if (!($2 in group))
		semantic_error("No group: " $2 )
	gid = group[$2]
	tname = "inat_group_table_" gid
}

function print_table(tbl,name,fmt,n)
{
	print "const insn_attr_t " name " = {"
	for (i = 0; i < n; i++) {
		id = sprintf(fmt, i)
		if (tbl[id])
			print "	[" id "] = " tbl[id] ","
	}
	print "};"
}

/^EndTable/ {
	if (gid != -1) {
		# print group tables
		if (array_size(table) != 0) {
			print_table(table, tname "[INAT_GROUP_TABLE_SIZE]",
				    "0x%x", 8)
			gtable[gid,0] = tname
		}
		if (array_size(lptable1) != 0) {
			print_table(lptable1, tname "_1[INAT_GROUP_TABLE_SIZE]",
				    "0x%x", 8)
			gtable[gid,1] = tname "_1"
		}
		if (array_size(lptable2) != 0) {
			print_table(lptable2, tname "_2[INAT_GROUP_TABLE_SIZE]",
				    "0x%x", 8)
			gtable[gid,2] = tname "_2"
		}
		if (array_size(lptable3) != 0) {
			print_table(lptable3, tname "_3[INAT_GROUP_TABLE_SIZE]",
				    "0x%x", 8)
			gtable[gid,3] = tname "_3"
		}
	} else {
		# print primary/escaped tables
		if (array_size(table) != 0) {
			print_table(table, tname "[INAT_OPCODE_TABLE_SIZE]",
				    "0x%02x", 256)
			etable[eid,0] = tname
		}
		if (array_size(lptable1) != 0) {
			print_table(lptable1,tname "_1[INAT_OPCODE_TABLE_SIZE]",
				    "0x%02x", 256)
			etable[eid,1] = tname "_1"
		}
		if (array_size(lptable2) != 0) {
			print_table(lptable2,tname "_2[INAT_OPCODE_TABLE_SIZE]",
				    "0x%02x", 256)
			etable[eid,2] = tname "_2"
		}
		if (array_size(lptable3) != 0) {
			print_table(lptable3,tname "_3[INAT_OPCODE_TABLE_SIZE]",
				    "0x%02x", 256)
			etable[eid,3] = tname "_3"
		}
	}
	print ""
	delete table
	delete lptable1
	delete lptable2
	delete lptable3
	gid = -1
	eid = -1
}

function add_flags(old,new) {
	if (old && new)
		return old " | " new
	else if (old)
		return old
	else
		return new
}

function convert_operands(opnd,       i,imm,mod)
{
	imm = null
	mod = null
	for (i in opnd) {
		i  = opnd[i]
		if (match(i, imm_expr) == 1) {
			if (!imm_flag[i])
				semantic_error("Unknown imm opnd: " i)
			if (imm) {
				if (i != "Ib")
					semantic_error("ADDIMM error")
				imm = add_flags(imm, "INAT_ADDIMM")
			} else
				imm = imm_flag[i] 
		} else if (match(i, modrm_expr))
			mod = "INAT_MODRM" 
	}
	return add_flags(imm, mod)
}

/^[0-9a-f]+\:/ {
	if (NR == 1)
		next
	# get index
	idx = "0x" substr($1, 1, index($1,":") - 1)
	if (idx in table)
		semantic_error("Redefine " idx " in " tname)

	# check if escaped opcode
	if ("escape" == $2) {
		if ($3 != "#")
			semantic_error("No escaped name")
		ref = ""
		for (i = 4; i <= NF; i++)
			ref = ref $i
		if (ref in escape)
			semantic_error("Redefine escape (" ref ")")
		escape[ref] = geid
		geid++
		table[idx] = "INAT_MAKE_ESCAPE(" escape[ref] ")"
		next
	}

	variant = null
	# converts
	i = 2
	while (i <= NF) {
		opcode = $(i++)
		delete opnds
		ext = null
		flags = null
		opnd = null
		# parse one opcode
		if (match($i, opnd_expr)) {
			opnd = $i
			split($(i++), opnds, ",")
			flags = convert_operands(opnds)
		}
		if (match($i, ext_expr))
			ext = $(i++)
		if (match($i, sep_expr))
			i++
		else if (i < NF)
			semantic_error($i " is not a separator")

		# check if group opcode
		if (match(opcode, group_expr)) {
			if (!(opcode in group)) {
				group[opcode] = ggid
				ggid++
			}
			flags = add_flags(flags, "INAT_MAKE_GROUP(" group[opcode] ")")
		}
		# check force(or default) 64bit
		if (match(ext, force64_expr))
			flags = add_flags(flags, "INAT_FORCE64")

		# check REX prefix
		if (match(opcode, rex_expr))
			flags = add_flags(flags, "INAT_REXPFX")

		# check coprocessor escape : TODO
		if (match(opcode, fpu_expr))
			flags = add_flags(flags, "INAT_MODRM")

		# check prefixes
		if (match(ext, prefix_expr)) {
			if (!prefix_num[opcode])
				semantic_error("Unknown prefix: " opcode)
			flags = add_flags(flags, "INAT_MAKE_PREFIX(" prefix_num[opcode] ")")
		}
		if (length(flags) == 0)
			continue
		# check if last prefix
		if (match(ext, lprefix1_expr)) {
			lptable1[idx] = add_flags(lptable1[idx],flags)
			variant = "INAT_VARIANT"
		} else if (match(ext, lprefix2_expr)) {
			lptable2[idx] = add_flags(lptable2[idx],flags)
			variant = "INAT_VARIANT"
		} else if (match(ext, lprefix3_expr)) {
			lptable3[idx] = add_flags(lptable3[idx],flags)
			variant = "INAT_VARIANT"
		} else {
			table[idx] = add_flags(table[idx],flags)
		}
	}
	if (variant)
		table[idx] = add_flags(table[idx],variant)
}

END {
	# print escape opcode map's array
	print "/* Escape opcode map array */"
	print "const insn_attr_t * const inat_escape_tables[INAT_ESC_MAX + 1]" \
	      "[INAT_LPREFIX_MAX + 1] = {"
	for (i = 0; i < geid; i++)
		for (j = 0; j < max_lprefix; j++)
			if (etable[i,j])
				print "	["i"]["j"] = "etable[i,j]","
	print "};\n"
	# print group opcode map's array
	print "/* Group opcode map array */"
	print "const insn_attr_t * const inat_group_tables[INAT_GRP_MAX + 1]"\
	      "[INAT_LPREFIX_MAX + 1] = {"
	for (i = 0; i < ggid; i++)
		for (j = 0; j < max_lprefix; j++)
			if (gtable[i,j])
				print "	["i"]["j"] = "gtable[i,j]","
	print "};"
}

[-- Attachment #5: inat.c --]
[-- Type: text/plain, Size: 2218 bytes --]

/*
 * x86 instruction attribute tables
 *
 * Written by Masami Hiramatsu <mhiramat@redhat.com>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 */
#ifdef __KERNEL__
#include <linux/string.h>
#include <linux/module.h>
#include <asm/insn.h>
#else
#include "insn.h"
#include "inat.h"
#endif

/* Attribute tables are generated from opcode map */
#include "inat-tables.c"

/* Attribute search APIs */
insn_attr_t inat_get_opcode_attribute(u8 opcode)
{
	return inat_primary_table[opcode];
}

insn_attr_t inat_get_escape_attribute(u8 opcode, u8 last_pfx,
				      insn_attr_t esc_attr)
{
	const insn_attr_t *table;
	insn_attr_t lpfx_attr = inat_get_opcode_attribute(last_pfx);
	int n, m;

	n = INAT_ESCAPE_NUM(esc_attr);
	m = INAT_LPREFIX_NUM(lpfx_attr);
	table = inat_escape_tables[n][0];
	if (!table)
		return 0;
	if (INAT_HAS_VARIANT(table[opcode]) && m) {
		table = inat_escape_tables[n][m];
		if (!table)
			return 0;
	}
	return table[opcode];
}

#define REGBITS(modrm) (((modrm) >> 3) & 0x7)

insn_attr_t inat_get_group_attribute(u8 modrm, u8 last_pfx,
				     insn_attr_t grp_attr)
{
	const insn_attr_t *table;
	insn_attr_t lpfx_attr = inat_get_opcode_attribute(last_pfx);
	int n, m;

	n = INAT_GROUP_NUM(grp_attr);
	m = INAT_LPREFIX_NUM(lpfx_attr);
	table = inat_group_tables[n][0];
	if (!table)
		return INAT_GROUP_COMMON(grp_attr);
	if (INAT_HAS_VARIANT(table[REGBITS(modrm)]) && m) {
		table = inat_group_tables[n][m];
		if (!table)
			return INAT_GROUP_COMMON(grp_attr);
	}
	return table[REGBITS(modrm)] | INAT_GROUP_COMMON(grp_attr);
}


[-- Attachment #6: inat.h --]
[-- Type: text/plain, Size: 4633 bytes --]

#ifndef _ASM_INAT_INAT_H
#define _ASM_INAT_INAT_H
/*
 * x86 instruction attributes
 *
 * Written by Masami Hiramatsu <mhiramat@redhat.com>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 */
#ifdef __KERNEL__
#include <linux/types.h>
#else
#include "insn_x86_user.h"
#endif

/* Instruction attributes */
typedef u32 insn_attr_t;

/*
 * Internal bits. Don't use the bitmasks directly, because these bits are
 * unstable. You should add checking macros and use those macros in
 * your code.
 */

#define INAT_OPCODE_TABLE_SIZE 256
#define INAT_GROUP_TABLE_SIZE 8

/* Legacy instruction prefixes */
#define INAT_PFX_OPNDSZ	1	/* 0x66 */ /* LPFX1 */
#define INAT_PFX_REPNE	2	/* 0xF2 */ /* LPFX2 */
#define INAT_PFX_REPE	3	/* 0xF3 */ /* LPFX3 */
#define INAT_PFX_LOCK	4	/* 0xF0 */
#define INAT_PFX_CS	5	/* 0x2E */
#define INAT_PFX_DS	6	/* 0x3E */
#define INAT_PFX_ES	7	/* 0x26 */
#define INAT_PFX_FS	8	/* 0x64 */
#define INAT_PFX_GS	9	/* 0x65 */
#define INAT_PFX_SS	10	/* 0x36 */
#define INAT_PFX_ADDRSZ	11	/* 0x67 */

#define INAT_LPREFIX_MAX	3

/* Immediate size */
#define INAT_IMM_BYTE		1
#define INAT_IMM_WORD		2
#define INAT_IMM_DWORD		3
#define INAT_IMM_QWORD		4
#define INAT_IMM_PTR		5
#define INAT_IMM_VWORD32	6
#define INAT_IMM_VWORD		7

/* Legacy prefix */
#define INAT_PFX_OFFS	0
#define INAT_PFX_BITS	4
#define INAT_PFX_MAX    ((1 << INAT_PFX_BITS) - 1)
#define INAT_PFX_MASK	(INAT_PFX_MAX << INAT_PFX_OFFS)
/* Escape opcodes */
#define INAT_ESC_OFFS	(INAT_PFX_OFFS + INAT_PFX_BITS)
#define INAT_ESC_BITS	2
#define INAT_ESC_MAX	((1 << INAT_ESC_BITS) - 1)
#define INAT_ESC_MASK	(INAT_ESC_MAX << INAT_ESC_OFFS)
/* Group opcodes (1-16) */
#define INAT_GRP_OFFS	(INAT_ESC_OFFS + INAT_ESC_BITS)
#define INAT_GRP_BITS	5
#define INAT_GRP_MAX	((1 << INAT_GRP_BITS) - 1)
#define INAT_GRP_MASK	(INAT_GRP_MAX << INAT_GRP_OFFS)
/* Immediates */
#define INAT_IMM_OFFS	(INAT_GRP_OFFS + INAT_GRP_BITS)
#define INAT_IMM_BITS	3
#define INAT_IMM_MASK	(((1 << INAT_IMM_BITS) - 1) << INAT_IMM_OFFS)
/* Flags */
#define INAT_FLAG_OFFS	(INAT_IMM_OFFS + INAT_IMM_BITS)
#define INAT_REXPFX	(1 << INAT_FLAG_OFFS)
#define INAT_MODRM	(1 << (INAT_FLAG_OFFS + 1))
#define INAT_FORCE64	(1 << (INAT_FLAG_OFFS + 2))
#define INAT_ADDIMM	(1 << (INAT_FLAG_OFFS + 3))
#define INAT_MOFFSET	(1 << (INAT_FLAG_OFFS + 4))
#define INAT_VARIANT	(1 << (INAT_FLAG_OFFS + 5))

/* Attribute search APIs */
extern insn_attr_t inat_get_opcode_attribute(u8 opcode);
extern insn_attr_t inat_get_escape_attribute(u8 opcode, u8 last_pfx,
					     insn_attr_t esc_attr);
extern insn_attr_t inat_get_group_attribute(u8 modrm, u8 last_pfx,
					    insn_attr_t grp_attr);

/* Attribute checking macros. Use these macros in your code */
#define INAT_IS_PREFIX(attr)	(attr & INAT_PFX_MASK)
#define INAT_IS_ADDRSZ(attr)	((attr & INAT_PFX_MASK) == INAT_PFX_ADDRSZ)
#define INAT_IS_OPNDSZ(attr)	((attr & INAT_PFX_MASK) == INAT_PFX_OPNDSZ)
#define INAT_LPREFIX_NUM(attr)	\
	(((attr & INAT_PFX_MASK) > INAT_LPREFIX_MAX) ? 0 :\
	 (attr & INAT_PFX_MASK))
#define INAT_MAKE_PREFIX(pfx)	(pfx << INAT_PFX_OFFS)

#define INAT_IS_ESCAPE(attr)	(attr & INAT_ESC_MASK)
#define INAT_ESCAPE_NUM(attr)	((attr & INAT_ESC_MASK) >> INAT_ESC_OFFS)
#define INAT_MAKE_ESCAPE(esc)	(esc << INAT_ESC_OFFS)

#define INAT_IS_GROUP(attr)	(attr & INAT_GRP_MASK)
#define INAT_GROUP_NUM(attr)	((attr & INAT_GRP_MASK) >> INAT_GRP_OFFS)
#define INAT_GROUP_COMMON(attr)	(attr & ~INAT_GRP_MASK)
#define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)

#define INAT_HAS_IMM(attr)	(attr & INAT_IMM_MASK)
#define INAT_IMM_SIZE(attr)	((attr & INAT_IMM_MASK) >> INAT_IMM_OFFS)
#define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)

#define INAT_IS_REX_PREFIX(attr)	(attr & INAT_REXPFX)
#define INAT_HAS_MODRM(attr)	(attr & INAT_MODRM)
#define INAT_IS_FORCE64(attr)	(attr & INAT_FORCE64)
#define INAT_HAS_ADDIMM(attr)	(attr & INAT_ADDIMM)
#define INAT_HAS_MOFFSET(attr)	(attr & INAT_MOFFSET)
#define INAT_HAS_VARIANT(attr)	(attr & INAT_VARIANT)

#endif

[-- Attachment #7: insn.c --]
[-- Type: text/plain, Size: 11601 bytes --]

/*
 * x86 instruction analysis
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2002, 2004, 2009
 */

#ifdef __KERNEL__
#include <linux/string.h>
#include <linux/module.h>
#include <asm/insn.h>
#include <asm/inat.h>
#else
#include <string.h>
#include "insn.h"
#endif

#define get_next(t, insn)	\
	({t r; r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; })

#define peek_next(t, insn)	\
	({t r; r = *(t*)insn->next_byte; r; })

/**
 * insn_init() - initialize struct insn
 * @insn:	&struct insn to be initialized
 * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
 * @x86_64:	true for 64-bit kernel or 64-bit app
 */
void insn_init(struct insn *insn, const u8 *kaddr, bool x86_64)
{
	memset(insn, 0, sizeof(*insn));
	insn->kaddr = kaddr;
	insn->next_byte = kaddr;
	insn->x86_64 = x86_64;
	insn->opnd_bytes = 4;
	if (x86_64)
		insn->addr_bytes = 8;
	else
		insn->addr_bytes = 4;
}
EXPORT_SYMBOL_GPL(insn_init);

/**
 * insn_get_prefixes - scan x86 instruction prefix bytes
 * @insn:	&struct insn containing instruction
 *
 * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
 * to point to the (first) opcode.  No effect if @insn->prefixes.got
 * is already true.
 */
void insn_get_prefixes(struct insn *insn)
{
	struct insn_field *prefixes = &insn->prefixes;
	insn_attr_t attr;
	u8 b;

	if (prefixes->got)
		return;

	prefixes->nbytes = 0;
	while (prefixes->nbytes < 4) {
		b = peek_next(u8, insn);
		attr = inat_get_opcode_attribute(b);
		if (!INAT_IS_PREFIX(attr))
			break;
		prefixes->bytes[prefixes->nbytes] = b;
		prefixes->nbytes++;
		insn->next_byte++;
		if (INAT_IS_ADDRSZ(attr)) {
			/* address size switches 2/4 or 4/8 */
			if (insn->x86_64)
				insn->addr_bytes ^= 12;
			else
				insn->addr_bytes ^= 6;
		} else if (INAT_IS_OPNDSZ(attr)) {
			/* operand size switches 2/4 */
			insn->opnd_bytes ^= 6;
		}
	}
	if (insn->x86_64) {
		b = peek_next(u8, insn);
		attr = inat_get_opcode_attribute(b);
		if (INAT_IS_REX_PREFIX(attr)) {
			insn->rex_prefix.value = b;
			insn->rex_prefix.nbytes = 1;
			insn->rex_prefix.got = true;
			insn->next_byte++;
			if (REX_W(insn))
				/* REX.W overrides opnd_size */
				insn->opnd_bytes = 8;
		}
	}
	prefixes->got = true;
	return;
}
EXPORT_SYMBOL_GPL(insn_get_prefixes);

/**
 * insn_get_opcode - collect opcode(s)
 * @insn:	&struct insn containing instruction
 *
 * Populates @insn->opcode, updates @insn->next_byte to point past the
 * opcode byte(s), and sets @insn->attr (except for groups).
 * If necessary, first collects any preceding (prefix) bytes.
 * Sets @insn->opcode.value = opcode1.  No effect if @insn->opcode.got
 * is already true.
 *
 */
void insn_get_opcode(struct insn *insn)
{
	struct insn_field *opcode = &insn->opcode;
	u8 op, pfx;
	if (opcode->got)
		return;
	if (!insn->prefixes.got)
		insn_get_prefixes(insn);

	/* Get first opcode */
	op = get_next(u8, insn);
	OPCODE1(insn) = op;
	opcode->nbytes = 1;
	insn->attr = inat_get_opcode_attribute(op);
	while (INAT_IS_ESCAPE(insn->attr)) {
		/* Get escaped opcode */
		op = get_next(u8, insn);
		opcode->bytes[opcode->nbytes++] = op;
		pfx = insn_last_prefix(insn);
		insn->attr = inat_get_escape_attribute(op, pfx, insn->attr);
	}
	opcode->got = true;
}
EXPORT_SYMBOL_GPL(insn_get_opcode);

/**
 * insn_get_modrm - collect ModRM byte, if any
 * @insn:	&struct insn containing instruction
 *
 * Populates @insn->modrm and updates @insn->next_byte to point past the
 * ModRM byte, if any.  If necessary, first collects the preceding bytes
 * (prefixes and opcode(s)).  No effect if @insn->modrm.got is already true.
 */
void insn_get_modrm(struct insn *insn)
{
	struct insn_field *modrm = &insn->modrm;
	u8 pfx, mod;
	if (modrm->got)
		return;
	if (!insn->opcode.got)
		insn_get_opcode(insn);

	if (INAT_HAS_MODRM(insn->attr)) {
		mod = get_next(u8, insn);
		modrm->value = mod;
		modrm->nbytes = 1;
		if (INAT_IS_GROUP(insn->attr)) {
			pfx = insn_last_prefix(insn);
			insn->attr = inat_get_group_attribute(mod, pfx,
							      insn->attr);
		}
	}

	if (insn->x86_64 && INAT_IS_FORCE64(insn->attr))
		insn->opnd_bytes = 8;
	modrm->got = true;
}
EXPORT_SYMBOL_GPL(insn_get_modrm);


/**
 * insn_rip_relative() - Does instruction use RIP-relative addressing mode?
 * @insn:	&struct insn containing instruction
 *
 * If necessary, first collects the instruction up to and including the
 * ModRM byte.  No effect if @insn->x86_64 is false.
 */
bool insn_rip_relative(struct insn *insn)
{
	struct insn_field *modrm = &insn->modrm;

	if (!insn->x86_64)
		return false;
	if (!modrm->got)
		insn_get_modrm(insn);
	/*
	 * For rip-relative instructions, the mod field (top 2 bits)
	 * is zero and the r/m field (bottom 3 bits) is 0x5.
	 */
	return (insn_field_exists(modrm) && (modrm->value & 0xc7) == 0x5);
}
EXPORT_SYMBOL_GPL(insn_rip_relative);

/**
 *
 * insn_get_sib() - Get the SIB byte of instruction
 * @insn:	&struct insn containing instruction
 *
 * If necessary, first collects the instruction up to and including the
 * ModRM byte.
 */
void insn_get_sib(struct insn *insn)
{
	if (insn->sib.got)
		return;
	if (!insn->modrm.got)
		insn_get_modrm(insn);
	if (insn->modrm.nbytes)
		if (insn->addr_bytes != 2 &&
		    MODRM_MOD(insn) != 3 && MODRM_RM(insn) == 4) {
			insn->sib.value = get_next(u8, insn);
			insn->sib.nbytes = 1;
		}
	insn->sib.got = true;
}
EXPORT_SYMBOL_GPL(insn_get_sib);


/**
 *
 * insn_get_displacement() - Get the displacement of instruction
 * @insn:	&struct insn containing instruction
 *
 * If necessary, first collects the instruction up to and including the
 * SIB byte.
 * Displacement value is sign-extended.
 */
void insn_get_displacement(struct insn *insn)
{
	u8 mod;
	if (insn->displacement.got)
		return;
	if (!insn->sib.got)
		insn_get_sib(insn);
	if (insn->modrm.nbytes) {
		/*
		 * Interpreting the modrm byte:
		 * mod = 00 - no displacement fields (exceptions below)
		 * mod = 01 - 1-byte displacement field
		 * mod = 10 - displacement field is 4 bytes, or 2 bytes if
		 * 	address size = 2 (0x67 prefix in 32-bit mode)
		 * mod = 11 - no memory operand
		 *
		 * If address size = 2...
		 * mod = 00, r/m = 110 - displacement field is 2 bytes
		 *
		 * If address size != 2...
		 * mod != 11, r/m = 100 - SIB byte exists
		 * mod = 00, SIB base = 101 - displacement field is 4 bytes
		 * mod = 00, r/m = 101 - rip-relative addressing, displacement
		 * 	field is 4 bytes
		 */
		mod = MODRM_MOD(insn);
		if (mod == 3)
			goto out;
		if (mod == 1) {
			insn->displacement.value = get_next(s8, insn);
			insn->displacement.nbytes = 1;
		} else if (insn->addr_bytes == 2) {
			if ((mod == 0 && MODRM_RM(insn) == 6) || mod == 2) {
				insn->displacement.value = get_next(s16, insn);
				insn->displacement.nbytes = 2;
			}
		} else {
			if ((mod == 0 && MODRM_RM(insn) == 5) || mod == 2 ||
			    (mod == 0 && SIB_BASE(insn) == 5)) {
				insn->displacement.value = get_next(s32, insn);
				insn->displacement.nbytes = 4;
			}
		}
	}
out:
	insn->displacement.got = true;
}
EXPORT_SYMBOL_GPL(insn_get_displacement);

/* Decode moffset16/32/64 */
static void __get_moffset(struct insn *insn)
{
	switch (insn->addr_bytes) {
	case 2:
		insn->moffset1.value = get_next(s16, insn);
		insn->moffset1.nbytes = 2;
		break;
	case 4:
		insn->moffset1.value = get_next(s32, insn);
		insn->moffset1.nbytes = 4;
		break;
	case 8:
		insn->moffset1.value = get_next(s32, insn);
		insn->moffset1.nbytes = 4;
		insn->moffset2.value = get_next(s32, insn);
		insn->moffset2.nbytes = 4;
		break;
	}
	insn->moffset1.got = insn->moffset2.got = true;
}

/* Decode imm v32(Iz) */
static void __get_immv32(struct insn *insn)
{
	switch (insn->opnd_bytes) {
	case 2:
		insn->immediate.value = get_next(s16, insn);
		insn->immediate.nbytes = 2;
		break;
	case 4:
	case 8:
		insn->immediate.value = get_next(s32, insn);
		insn->immediate.nbytes = 4;
		break;
	}
}

/* Decode imm v64(Iv/Ov) */
static void __get_immv(struct insn *insn)
{
	switch (insn->opnd_bytes) {
	case 2:
		insn->immediate1.value = get_next(s16, insn);
		insn->immediate1.nbytes = 2;
		break;
	case 4:
		insn->immediate1.value = get_next(s32, insn);
		insn->immediate1.nbytes = 4;
		break;
	case 8:
		insn->immediate1.value = get_next(s32, insn);
		insn->immediate1.nbytes = 4;
		insn->immediate2.value = get_next(s32, insn);
		insn->immediate2.nbytes = 4;
		break;
	}
	insn->immediate1.got = insn->immediate2.got = true;
}

/* Decode ptr16:16/32(Ap) */
static void __get_immptr(struct insn *insn)
{
	switch (insn->opnd_bytes) {
	case 2:
		insn->immediate1.value = get_next(s16, insn);
		insn->immediate1.nbytes = 2;
		break;
	case 4:
		insn->immediate1.value = get_next(s32, insn);
		insn->immediate1.nbytes = 4;
		break;
	case 8:
		/* ptr16:64 is not supported (no segment) */
		WARN_ON(1);
		return;
	}
	insn->immediate2.value = get_next(u16, insn);
	insn->immediate2.nbytes = 2;
	insn->immediate1.got = insn->immediate2.got = true;
}

/**
 *
 * insn_get_immediate() - Get the immediates of instruction
 * @insn:	&struct insn containing instruction
 *
 * If necessary, first collects the instruction up to and including the
 * displacement bytes.
 * Most immediates are sign-extended. The unsigned value can be
 * obtained by masking with ((1ULL << (nbytes * 8)) - 1)
 */
void insn_get_immediate(struct insn *insn)
{
	if (insn->immediate.got)
		return;
	if (!insn->displacement.got)
		insn_get_displacement(insn);

	if (INAT_HAS_MOFFSET(insn->attr)) {
		__get_moffset(insn);
		goto done;
	}

	if (!INAT_HAS_IMM(insn->attr))
		/* no immediates */
		goto done;

	switch (INAT_IMM_SIZE(insn->attr)) {
	case INAT_IMM_BYTE:
		insn->immediate.value = get_next(s8, insn);
		insn->immediate.nbytes = 1;
		break;
	case INAT_IMM_WORD:
		insn->immediate.value = get_next(s16, insn);
		insn->immediate.nbytes = 2;
		break;
	case INAT_IMM_DWORD:
		insn->immediate.value = get_next(s32, insn);
		insn->immediate.nbytes = 4;
		break;
	case INAT_IMM_QWORD:
		insn->immediate1.value = get_next(s32, insn);
		insn->immediate1.nbytes = 4;
		insn->immediate2.value = get_next(s32, insn);
		insn->immediate2.nbytes = 4;
		break;
	case INAT_IMM_PTR:
		__get_immptr(insn);
		break;
	case INAT_IMM_VWORD32:
		__get_immv32(insn);
		break;
	case INAT_IMM_VWORD:
		__get_immv(insn);
		break;
	default:
		break;
	}
	if (INAT_HAS_ADDIMM(insn->attr)) {
		insn->immediate2.value = get_next(s8, insn);
		insn->immediate2.nbytes = 1;
	}
done:
	insn->immediate.got = true;
}
EXPORT_SYMBOL_GPL(insn_get_immediate);

/**
 *
 * insn_get_length() - Get the length of instruction
 * @insn:	&struct insn containing instruction
 *
 * If necessary, first collects the instruction up to and including the
 * immediate bytes.
 */
void insn_get_length(struct insn *insn)
{
	if (insn->length)
		return;
	if (!insn->immediate.got)
		insn_get_immediate(insn);
	insn->length = (u8)((unsigned long)insn->next_byte
			    - (unsigned long)insn->kaddr);
}
EXPORT_SYMBOL_GPL(insn_get_length);

[-- Attachment #8: insn.h --]
[-- Type: text/plain, Size: 4083 bytes --]

#ifndef _ASM_X86_INSN_H
#define _ASM_X86_INSN_H
/*
 * x86 instruction analysis
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2009
 */

#ifdef __KERNEL__
#include <linux/types.h>
/* insn_attr_t is defined in inat.h */
#include <asm/inat.h>
#else
#include "insn_x86_user.h"
#include "inat.h"
#endif

struct insn_field {
	union {
		s32 value;
		u8 bytes[4];
	};
	bool got;	/* true if we've run insn_get_xxx() for this field */
	u8 nbytes;
};

struct insn {
	struct insn_field prefixes;	/* 4 prefixes */
	struct insn_field rex_prefix;	/* REX prefix */
	struct insn_field opcode;	/*
					 * opcode.bytes[0]: opcode1
					 * opcode.bytes[1]: opcode2
					 * opcode.bytes[2]: opcode3
					 */
	struct insn_field modrm;
	struct insn_field sib;
	struct insn_field displacement;
	union {
		struct insn_field immediate;
		struct insn_field moffset1;	/* for 64bit MOV */
		struct insn_field immediate1;	/* for 64bit imm or off16/32 */
	};
	union {
		struct insn_field moffset2;	/* for 64bit MOV */
		struct insn_field immediate2;	/* for 64bit imm or seg16 */
	};

	insn_attr_t attr;
	u8 opnd_bytes;
	u8 addr_bytes;
	u8 length;
	bool x86_64;

	const u8 *kaddr;	/* kernel address of insn (copy) to analyze */
	const u8 *next_byte;
};


#define OPCODE1(insn) ((insn)->opcode.bytes[0])
#define OPCODE2(insn) ((insn)->opcode.bytes[1])
#define OPCODE3(insn) ((insn)->opcode.bytes[2])

#define MODRM_MOD(insn) (((insn)->modrm.value & 0xc0) >> 6)
#define MODRM_REG(insn) (((insn)->modrm.value & 0x38) >> 3)
#define MODRM_RM(insn) ((insn)->modrm.value & 0x07)

#define SIB_SCALE(insn) (((insn)->sib.value & 0xc0) >> 6)
#define SIB_INDEX(insn) (((insn)->sib.value & 0x38) >> 3)
#define SIB_BASE(insn) ((insn)->sib.value & 0x07)

#define REX_W(insn) ((insn)->rex_prefix.value & 8)
#define REX_R(insn) ((insn)->rex_prefix.value & 4)
#define REX_X(insn) ((insn)->rex_prefix.value & 2)
#define REX_B(insn) ((insn)->rex_prefix.value & 1)

#define MOFFSET64(insn)	(((u64)((insn)->moffset2.value) << 32) | \
			  (u32)((insn)->moffset1.value))

#define IMMEDIATE64(insn)	(((u64)((insn)->immediate2.value) << 32) | \
				  (u32)((insn)->immediate1.value))

extern void insn_init(struct insn *insn, const u8 *kaddr, bool x86_64);
extern void insn_get_prefixes(struct insn *insn);
extern void insn_get_opcode(struct insn *insn);
extern void insn_get_modrm(struct insn *insn);
extern void insn_get_sib(struct insn *insn);
extern void insn_get_displacement(struct insn *insn);
extern void insn_get_immediate(struct insn *insn);
extern void insn_get_length(struct insn *insn);

/* Attribute will be determined after getting ModRM (for opcode groups) */
static inline void insn_get_attr(struct insn *insn)
{
	insn_get_modrm(insn);
}

/* The last prefix is needed for two-byte and three-byte opcodes */
static inline u8 insn_last_prefix(struct insn *insn)
{
	if (insn->prefixes.nbytes == 0)
		return 0;
	return (insn)->prefixes.bytes[(insn)->prefixes.nbytes - 1];
}

/* Instruction uses RIP-relative addressing */
extern bool insn_rip_relative(struct insn *insn);

#ifdef CONFIG_X86_64
/* Init insn for kernel text */
#define insn_init_kernel(insn, kaddr) insn_init(insn, kaddr, 1)
#else /* CONFIG_X86_32 */
#define insn_init_kernel(insn, kaddr) insn_init(insn, kaddr, 0)
#endif

static inline bool insn_field_exists(const struct insn_field *field)
{
	return (field->nbytes > 0);
}

#endif /* _ASM_X86_INSN_H */
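
To make the bit-field macros in insn.h easier to check by hand, here is a small standalone sketch of the same masks and shifts written as functions. The function names (`modrm_mod`, `modrm_reg`, `modrm_rm`, `rex_w`, `join64`) are made up for this note. `join64()` mirrors what MOFFSET64/IMMEDIATE64 appear to do: the low half is cast to an unsigned 32-bit value first so that a negative low half does not sign-extend into the high half.

```c
#include <stdint.h>

/* ModRM layout: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
static unsigned modrm_mod(uint8_t modrm) { return (modrm & 0xc0) >> 6; }
static unsigned modrm_reg(uint8_t modrm) { return (modrm & 0x38) >> 3; }
static unsigned modrm_rm(uint8_t modrm)  { return modrm & 0x07; }

/* REX.W is bit 3 of the REX prefix; normalized to 0/1 here. */
static int rex_w(uint8_t rex) { return (rex & 8) != 0; }

/* Join two signed 32-bit halves into one 64-bit value. */
static uint64_t join64(int32_t lo, int32_t hi)
{
	return ((uint64_t)hi << 32) | (uint32_t)lo;
}
```

For example, the ModRM byte 0x45 splits into mod=1, reg=0, rm=5, and `join64(-1, 1)` gives 0x1ffffffff rather than all ones.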

[-- Attachment #9: insn_x86_user.h --]
[-- Type: text/plain, Size: 1595 bytes --]

#ifndef __INSN_X86_USER_H
#define __INSN_X86_USER_H

/*
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2009
 */

#ifdef __x86_64__
#define CONFIG_X86_64
#else
#define CONFIG_X86_32
#endif
typedef unsigned char u8;
typedef unsigned short u16;
typedef unsigned int u32;
typedef unsigned long long u64;

typedef signed char s8;
typedef short s16;
typedef int s32;
typedef long long s64;

typedef enum bool { false, true } bool;

/* any harmless file-scope decl */
#define NOP_DECL struct __nop
#define EXPORT_SYMBOL_GPL(symbol) NOP_DECL
#define MODULE_LICENSE(gpl) NOP_DECL

#define WARN_ON(cond) do{}while(0)

#define BITS_PER_LONG (8*sizeof(long))
/* from arch/x86/include/asm/bitops.h */
static inline int test_bit(int nr, const volatile unsigned long *addr)
{
	return ((1UL << (nr % BITS_PER_LONG)) &
		(((unsigned long *)addr)[nr / BITS_PER_LONG])) != 0;
}

#endif /* __INSN_X86_USER_H */
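
For readers unfamiliar with the bitmap helper stubbed above: bit number nr lives in word nr / BITS_PER_LONG at position nr % BITS_PER_LONG. Below is a minimal userspace sketch of that indexing, assuming nothing beyond standard C. The `toy_` names and the `TOY_NLONGS` macro are hypothetical and exist only for this illustration.

```c
#include <limits.h>

#define TOY_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))
/* Number of unsigned longs needed to hold nbits bits. */
#define TOY_NLONGS(nbits) \
	(((nbits) + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* Set bit nr in the bitmap. */
static void toy_set_bit(unsigned nr, unsigned long *map)
{
	map[nr / TOY_BITS_PER_LONG] |= 1UL << (nr % TOY_BITS_PER_LONG);
}

/* Return 1 if bit nr is set in the bitmap, 0 otherwise. */
static int toy_test_bit(unsigned nr, const unsigned long *map)
{
	return (int)((map[nr / TOY_BITS_PER_LONG] >>
		      (nr % TOY_BITS_PER_LONG)) & 1UL);
}
```

A 256-bit map declared as `unsigned long map[TOY_NLONGS(256)]` can then mark individual opcode bytes, for instance 0x66 and 0xf3, and query them later.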

[-- Attachment #10: test_get_len.c --]
[-- Type: text/plain, Size: 1907 bytes --]

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include "insn.h"

/*
 * Test of instruction analysis in general and insn_get_length() in
 * particular.  See if insn_get_length() and the disassembler agree
 * on the length of each instruction in an elf disassembly.
 *
 * usage: test_get_len [x86_64] < distilled_disassembly
 */

const char *prog;

static void usage(void)
{
	fprintf(stderr, "usage: %s [x86_64] < distilled_disassembly\n", prog);
	exit(1);
}

static void malformed_line(const char *line, int line_nr)
{
	fprintf(stderr, "%s: malformed line %d:\n%s", prog, line_nr, line);
	exit(3);
}

int main(int argc, char **argv)
{
	char line[200];
	unsigned char insn_buf[16];
	struct insn insn;
	bool x86_64 = false;
	int errors = 0, insns = 0;
#define MAX_ERRORS 10

	prog = argv[0];
	if (argc == 2) {
		if (!strcmp(argv[1], "x86_64"))
			x86_64 = true;
		else
			usage();
	} else if (argc > 2)
		usage();

	while (fgets(line, 200, stdin)) {
		char copy[200], *s, *tab1, *tab2;
		int nb = 0;
		unsigned b;

		insns++;
		memset(insn_buf, 0, 16);
		strcpy(copy, line);
		tab1 = strchr(copy, '\t');
		if (!tab1)
			malformed_line(line, insns);
		s = tab1 + 1;
		s += strspn(s, " ");
		tab2 = strchr(s, '\t');
		if (!tab2)
			malformed_line(line, insns);
		*tab2 = '\0';	/* so characters beyond tab2 aren't examined */
		while (s < tab2) {
			/* Stop at the first non-hex token or when insn_buf is full */
			if (nb < (int)sizeof(insn_buf) &&
			    sscanf(s, "%x", &b) == 1) {
				insn_buf[nb++] = (unsigned char) b;
				s += 3;
			} else
				break;
		}

		insn_init(&insn, insn_buf, x86_64);
		insn_get_length(&insn);
		if (insn.length != nb) {
			fprintf(stderr, "%s", line);
			fprintf(stderr, "objdump says %d bytes, but "
				"insn_get_length() says %d (attr:%x)\n", nb,
				insn.length, insn.attr);
			if (++errors >= MAX_ERRORS) {
				fprintf(stderr, "Stopping after %d errors "
					"and %d instructions.\n",
					MAX_ERRORS, insns);
				exit(2);
			}
		}
	}
	return 0;
}
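
The byte-extraction step in the test program can also be written as a standalone helper, which makes the input format easier to see. This is only a sketch under an assumed line layout (some text, a tab, space-separated hex bytes, a second tab, then the mnemonic), inferred from the parsing code in test_get_len.c. `parse_hex_bytes()` is a hypothetical name and uses strtoul() instead of sscanf().

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse the hex byte column between the first and second tab of line
 * into buf. Returns the number of bytes parsed, or -1 if the line is
 * malformed or holds more than cap bytes.
 */
static int parse_hex_bytes(const char *line, unsigned char *buf, int cap)
{
	const char *s, *end;
	int nb = 0;

	s = strchr(line, '\t');
	if (!s)
		return -1;
	s++;
	end = strchr(s, '\t');
	if (!end)
		return -1;

	while (s < end) {
		char *stop;
		unsigned long v;

		while (s < end && *s == ' ')
			s++;
		if (s == end)
			break;
		v = strtoul(s, &stop, 16);
		/* Reject non-hex tokens, values over one byte, overruns. */
		if (stop == s || stop > end || v > 0xff)
			return -1;
		if (nb >= cap)
			return -1;
		buf[nb++] = (unsigned char)v;
		s = stop;
	}
	return nb;
}
```

Unlike the fixed `s += 3` stride, this version does not depend on each byte being exactly two hex digits plus one space, and it refuses to write past the end of the buffer.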

[-- Attachment #11: x86-opcode-map.txt --]
[-- Type: text/plain, Size: 9933 bytes --]

# x86 Opcode Maps
#
#<Opcode maps>
# Table: table-name
# Referrer: escaped-name
# opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
# (or)
# opcode: escape # escaped-name
# EndTable
#
#<group maps>
# GrpTable: GrpXXX
# reg:  mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
# EndTable
#

Table: one byte opcode
Referrer:
# 0x00 - 0x0f
00: ADD Eb,Gb
01: ADD Ev,Gv
02: ADD Gb,Eb
03: ADD Gv,Ev
04: ADD AL,Ib
05: ADD rAX,Iz
06: PUSH ES (i64)
07: POP ES (i64)
08: OR Eb,Gb
09: OR Ev,Gv
0a: OR Gb,Eb
0b: OR Gv,Ev
0c: OR AL,Ib
0d: OR rAX,Iz
0e: PUSH CS (i64)
0f: escape # 2-byte escape
# 0x10 - 0x1f
10: ADC Eb,Gb
11: ADC Ev,Gv
12: ADC Gb,Eb
13: ADC Gv,Ev
14: ADC AL,Ib
15: ADC rAX,Iz
16: PUSH SS (i64)
17: POP SS (i64)
18: SBB Eb,Gb
19: SBB Ev,Gv
1a: SBB Gb,Eb
1b: SBB Gv,Ev
1c: SBB AL,Ib
1d: SBB rAX,Iz
1e: PUSH DS (i64)
1f: POP DS (i64)
# 0x20 - 0x2f
20: AND Eb,Gb
21: AND Ev,Gv
22: AND Gb,Eb
23: AND Gv,Ev
24: AND AL,Ib
25: AND rAX,Iz
26: SEG=ES (Prefix)
27: DAA (i64)
28: SUB Eb,Gb
29: SUB Ev,Gv
2a: SUB Gb,Eb
2b: SUB Gv,Ev
2c: SUB AL,Ib
2d: SUB rAX,Iz
2e: SEG=CS (Prefix)
2f: DAS (i64)
# 0x30 - 0x3f
30: XOR Eb,Gb
31: XOR Ev,Gv
32: XOR Gb,Eb
33: XOR Gv,Ev
34: XOR AL,Ib
35: XOR rAX,Iz
36: SEG=SS (Prefix)
37: AAA (i64)
38: CMP Eb,Gb
39: CMP Ev,Gv
3a: CMP Gb,Eb
3b: CMP Gv,Ev
3c: CMP AL,Ib
3d: CMP rAX,Iz
3e: SEG=DS (Prefix)
3f: AAS (i64)
# 0x40 - 0x4f
40: INC eAX (i64) | REX (o64)
41: INC eCX (i64) | REX.B (o64)
42: INC eDX (i64) | REX.X (o64)
43: INC eBX (i64) | REX.XB (o64)
44: INC eSP (i64) | REX.R (o64)
45: INC eBP (i64) | REX.RB (o64)
46: INC eSI (i64) | REX.RX (o64)
47: INC eDI (i64) | REX.RXB (o64)
48: DEC eAX (i64) | REX.W (o64)
49: DEC eCX (i64) | REX.WB (o64)
4a: DEC eDX (i64) | REX.WX (o64)
4b: DEC eBX (i64) | REX.WXB (o64)
4c: DEC eSP (i64) | REX.WR (o64)
4d: DEC eBP (i64) | REX.WRB (o64)
4e: DEC eSI (i64) | REX.WRX (o64)
4f: DEC eDI (i64) | REX.WRXB (o64)
# 0x50 - 0x5f
50: PUSH rAX/r8 (d64)
51: PUSH rCX/r9 (d64)
52: PUSH rDX/r10 (d64)
53: PUSH rBX/r11 (d64)
54: PUSH rSP/r12 (d64)
55: PUSH rBP/r13 (d64)
56: PUSH rSI/r14 (d64)
57: PUSH rDI/r15 (d64)
58: POP rAX/r8 (d64)
59: POP rCX/r9 (d64)
5a: POP rDX/r10 (d64)
5b: POP rBX/r11 (d64)
5c: POP rSP/r12 (d64)
5d: POP rBP/r13 (d64)
5e: POP rSI/r14 (d64)
5f: POP rDI/r15 (d64)
# 0x60 - 0x6f
60: PUSHA/PUSHAD (i64)
61: POPA/POPAD (i64)
62: BOUND Gv,Ma (i64)
63: ARPL Ew,Gw (i64) | MOVSXD Gv,Ev (o64)
64: SEG=FS (Prefix)
65: SEG=GS (Prefix)
66: Operand-Size (Prefix)
67: Address-Size (Prefix)
68: PUSH Iz (d64)
69: IMUL Gv,Ev,Iz
6a: PUSH Ib (d64)
6b: IMUL Gv,Ev,Ib
6c: INS/INSB Yb,DX
6d: INS/INSW/INSD Yz,DX
6e: OUTS/OUTSB DX,Xb
6f: OUTS/OUTSW/OUTSD DX,Xz
# 0x70 - 0x7f
70: JO Jb
71: JNO Jb
72: JB/JNAE/JC Jb
73: JNB/JAE/JNC Jb
74: JZ/JE Jb
75: JNZ/JNE Jb
76: JBE/JNA Jb
77: JNBE/JA Jb
78: JS Jb
79: JNS Jb
7a: JP/JPE Jb
7b: JNP/JPO Jb
7c: JL/JNGE Jb
7d: JNL/JGE Jb
7e: JLE/JNG Jb
7f: JNLE/JG Jb
# 0x80 - 0x8f
80: Grp1 Eb,Ib (1A)
81: Grp1 Ev,Iz (1A)
82: Grp1 Eb,Ib (1A),(i64)
83: Grp1 Ev,Ib (1A)
84: TEST Eb,Gb
85: TEST Ev,Gv
86: XCHG Eb,Gb
87: XCHG Ev,Gv
88: MOV Eb,Gb
89: MOV Ev,Gv
8a: MOV Gb,Eb
8b: MOV Gv,Ev
8c: MOV Ev,Sw
8d: LEA Gv,M
8e: MOV Sw,Ew
8f: Grp1A (1A) | POP Ev (d64)
# 0x90 - 0x9f
90: NOP | PAUSE (F3) | XCHG r8,rAX
91: XCHG rCX/r9,rAX
92: XCHG rDX/r10,rAX
93: XCHG rBX/r11,rAX
94: XCHG rSP/r12,rAX
95: XCHG rBP/r13,rAX
96: XCHG rSI/r14,rAX
97: XCHG rDI/r15,rAX
98: CBW/CWDE/CDQE
99: CWD/CDQ/CQO
9a: CALLF Ap (i64)
9b: FWAIT/WAIT
9c: PUSHF/D/Q Fv (d64)
9d: POPF/D/Q Fv (d64)
9e: SAHF
9f: LAHF
# 0xa0 - 0xaf
a0: MOV AL,Ob
a1: MOV rAX,Ov
a2: MOV Ob,AL
a3: MOV Ov,rAX
a4: MOVS/B Xb,Yb
a5: MOVS/W/D/Q Xv,Yv
a6: CMPS/B Xb,Yb
a7: CMPS/W/D Xv,Yv
a8: TEST AL,Ib
a9: TEST rAX,Iz
aa: STOS/B Yb,AL
ab: STOS/W/D/Q Yv,rAX
ac: LODS/B AL,Xb
ad: LODS/W/D/Q rAX,Xv
ae: SCAS/B AL,Yb
af: SCAS/W/D/Q rAX,Yv
# 0xb0 - 0xbf
b0: MOV AL/R8L,Ib
b1: MOV CL/R9L,Ib
b2: MOV DL/R10L,Ib
b3: MOV BL/R11L,Ib
b4: MOV AH/R12L,Ib
b5: MOV CH/R13L,Ib
b6: MOV DH/R14L,Ib
b7: MOV BH/R15L,Ib
b8: MOV rAX/r8,Iv
b9: MOV rCX/r9,Iv
ba: MOV rDX/r10,Iv
bb: MOV rBX/r11,Iv
bc: MOV rSP/r12,Iv
bd: MOV rBP/r13,Iv
be: MOV rSI/r14,Iv
bf: MOV rDI/r15,Iv
# 0xc0 - 0xcf
c0: Grp2 Eb,Ib (1A)
c1: Grp2 Ev,Ib (1A)
c2: RETN Iw (f64)
c3: RETN
c4: LES Gz,Mp (i64)
c5: LDS Gz,Mp (i64)
c6: Grp11 Eb,Ib (1A)
c7: Grp11 Ev,Iz (1A)
c8: ENTER Iw,Ib
c9: LEAVE (d64)
ca: RETF Iw
cb: RETF
cc: INT3
cd: INT Ib
ce: INTO (i64)
cf: IRET/D/Q
# 0xd0 - 0xdf
d0: Grp2 Eb,1 (1A)
d1: Grp2 Ev,1 (1A)
d2: Grp2 Eb,CL (1A)
d3: Grp2 Ev,CL (1A)
d4: AAM Ib (i64)
d5: AAD Ib (i64)
d6:
d7: XLAT/XLATB
d8: ESC
d9: ESC
da: ESC
db: ESC
dc: ESC
dd: ESC
de: ESC
df: ESC
# 0xe0 - 0xef
e0: LOOPNE/LOOPNZ Jb (f64)
e1: LOOPE/LOOPZ Jb (f64)
e2: LOOP Jb (f64)
e3: JrCXZ Jb (f64)
e4: IN AL,Ib
e5: IN eAX,Ib
e6: OUT Ib,AL
e7: OUT Ib,eAX
e8: CALL Jz (f64)
e9: JMP-near Jz (f64)
ea: JMP-far Ap (i64)
eb: JMP-short Jb (f64)
ec: IN AL,DX
ed: IN eAX,DX
ee: OUT DX,AL
ef: OUT DX,eAX
# 0xf0 - 0xff
f0: LOCK (Prefix)
f1:
f2: REPNE (Prefix)
f3: REP/REPE (Prefix)
f4: HLT
f5: CMC
f6: Grp3_1 Eb (1A)
f7: Grp3_2 Ev (1A)
f8: CLC
f9: STC
fa: CLI
fb: STI
fc: CLD
fd: STD
fe: Grp4 (1A)
ff: Grp5 (1A)
EndTable

Table: 2-byte opcode # First Byte is 0x0f
Referrer: 2-byte escape
# 0x0f 0x00-0x0f
00: Grp6 (1A)
01: Grp7 (1A)
02: LAR Gv,Ew
03: LSL Gv,Ew
04:
05: SYSCALL (o64)
06: CLTS
07: SYSRET (o64)
08: INVD
09: WBINVD
0a:
0b: UD2 (1B)
0c:
0d: NOP Ev
0e:
0f:
# 0x0f 0x10-0x1f
10:
11:
12:
13:
14:
15:
16:
17:
18: Grp16 (1A)
19:
1a:
1b:
1c:
1d:
1e:
1f: NOP Ev
# 0x0f 0x20-0x2f
20: MOV Rd,Cd
21: MOV Rd,Dd
22: MOV Cd,Rd
23: MOV Dd,Rd
24:
25:
26:
27:
28:
29:
2a:
2b:
2c:
2d:
2e:
2f:
# 0x0f 0x30-0x3f
30: WRMSR
31: RDTSC
32: RDMSR
33: RDPMC
34: SYSENTER
35: SYSEXIT
36:
37: GETSEC
38: escape # 3-byte escape 1
39:
3a: escape # 3-byte escape 2
3b:
3c:
3d:
3e:
3f:
# 0x0f 0x40-0x4f
40: CMOVO Gv,Ev
41: CMOVNO Gv,Ev
42: CMOVB/C/NAE Gv,Ev
43: CMOVAE/NB/NC Gv,Ev
44: CMOVE/Z Gv,Ev
45: CMOVNE/NZ Gv,Ev
46: CMOVBE/NA Gv,Ev
47: CMOVA/NBE Gv,Ev
48: CMOVS Gv,Ev
49: CMOVNS Gv,Ev
4a: CMOVP/PE Gv,Ev
4b: CMOVNP/PO Gv,Ev
4c: CMOVL/NGE Gv,Ev
4d: CMOVNL/GE Gv,Ev
4e: CMOVLE/NG Gv,Ev
4f: CMOVNLE/G Gv,Ev
# 0x0f 0x50-0x5f
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
5a:
5b:
5c:
5d:
5e:
5f:
# 0x0f 0x60-0x6f
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
6a:
6b:
6c:
6d:
6e:
6f:
# 0x0f 0x70-0x7f
70:
71: Grp12 (1A)
72: Grp13 (1A)
73: Grp14 (1A)
74:
75:
76:
77:
78: VMREAD Ed/q,Gd/q
79: VMWRITE Gd/q,Ed/q
7a:
7b:
7c:
7d:
7e:
7f:
# 0x0f 0x80-0x8f
80: JO Jz (f64)
81: JNO Jz (f64)
82: JB/JNAE/JC Jz (f64)
83: JNB/JAE/JNC Jz (f64)
84: JZ/JE Jz (f64)
85: JNZ/JNE Jz (f64)
86: JBE/JNA Jz (f64)
87: JNBE/JA Jz (f64)
88: JS Jz (f64)
89: JNS Jz (f64)
8a: JP/JPE Jz (f64)
8b: JNP/JPO Jz (f64)
8c: JL/JNGE Jz (f64)
8d: JNL/JGE Jz (f64)
8e: JLE/JNG Jz (f64)
8f: JNLE/JG Jz (f64)
# 0x0f 0x90-0x9f
90: SETO Eb
91: SETNO Eb
92: SETB/C/NAE Eb
93: SETAE/NB/NC Eb
94: SETE/Z Eb
95: SETNE/NZ Eb
96: SETBE/NA Eb
97: SETA/NBE Eb
98: SETS Eb
99: SETNS Eb
9a: SETP/PE Eb
9b: SETNP/PO Eb
9c: SETL/NGE Eb
9d: SETNL/GE Eb
9e: SETLE/NG Eb
9f: SETNLE/G Eb
# 0x0f 0xa0-0xaf
a0: PUSH FS (d64)
a1: POP FS (d64)
a2: CPUID
a3: BT Ev,Gv
a4: SHLD Ev,Gv,Ib
a5: SHLD Ev,Gv,CL
a6:
a7:
a8: PUSH GS (d64)
a9: POP GS (d64)
aa: RSM
ab: BTS Ev,Gv
ac: SHRD Ev,Gv,Ib
ad: SHRD Ev,Gv,CL
ae: Grp15 (1A),(1C)
af: IMUL Gv,Ev
# 0x0f 0xb0-0xbf
b0: CMPXCHG Eb,Gb
b1: CMPXCHG Ev,Gv
b2: LSS Gv,Mp
b3: BTR Ev,Gv
b4: LFS Gv,Mp
b5: LGS Gv,Mp
b6: MOVZX Gv,Eb
b7: MOVZX Gv,Ew
b8: JMPE | POPCNT Gv,Ev (F3)
b9: Grp10 (1A)
ba: Grp8 Ev,Ib (1A)
bb: BTC Ev,Gv
bc: BSF Gv,Ev
bd: BSR Gv,Ev
be: MOVSX Gv,Eb
bf: MOVSX Gv,Ew
# 0x0f 0xc0-0xcf
c0: XADD Eb,Gb
c1: XADD Ev,Gv
c2:
c3: movnti Md/q,Gd/q
c4:
c5:
c6:
c7: Grp9 (1A)
c8: BSWAP RAX/EAX/R8/R8D
c9: BSWAP RCX/ECX/R9/R9D
ca: BSWAP RDX/EDX/R10/R10D
cb: BSWAP RBX/EBX/R11/R11D
cc: BSWAP RSP/ESP/R12/R12D
cd: BSWAP RBP/EBP/R13/R13D
ce: BSWAP RSI/ESI/R14/R14D
cf: BSWAP RDI/EDI/R15/R15D
# 0x0f 0xd0-0xdf
d0:
d1:
d2:
d3:
d4:
d5:
d6:
d7:
d8:
d9:
da:
db:
dc:
dd:
de:
df:
# 0x0f 0xe0-0xef
e0:
e1:
e2:
e3:
e4:
e5:
e6:
e7:
e8:
e9:
ea:
eb:
ec:
ed:
ee:
ef:
# 0x0f 0xf0-0xff
f0:
f1:
f2:
f3:
f4:
f5:
f6:
f7:
f8:
f9:
fa:
fb:
fc:
fd:
fe:
ff:
EndTable

Table: 3-byte opcode 1
Referrer: 3-byte escape 1
80: INVEPT Gd/q,Mdq (66)
81: INVPID Gd/q,Mdq (66)
f0: MOVBE Gv,Mv | CRC32 Gd,Eb (F2)
f1: MOVBE Mv,Gv | CRC32 Gd,Ev (F2)
EndTable

Table: 3-byte opcode 2
Referrer: 3-byte escape 2
# all opcodes are for SSE
EndTable

GrpTable: Grp1
0: ADD
1: OR
2: ADC
3: SBB
4: AND
5: SUB
6: XOR
7: CMP
EndTable

GrpTable: Grp1A
0: POP
EndTable

GrpTable: Grp2
0: ROL
1: ROR
2: RCL
3: RCR
4: SHL/SAL
5: SHR
6:
7: SAR
EndTable

GrpTable: Grp3_1
0: TEST Eb,Ib
1:
2: NOT Eb
3: NEG Eb
4: MUL AL,Eb
5: IMUL AL,Eb
6: DIV AL,Eb
7: IDIV AL,Eb
EndTable

GrpTable: Grp3_2
0: TEST Ev,Iz
1:
2: NOT Ev
3: NEG Ev
4: MUL rAX,Ev
5: IMUL rAX,Ev
6: DIV rAX,Ev
7: IDIV rAX,Ev
EndTable

GrpTable: Grp4
0: INC Eb
1: DEC Eb
EndTable

GrpTable: Grp5
0: INC Ev
1: DEC Ev
2: CALLN Ev (f64)
3: CALLF Ep
4: JMPN Ev (f64)
5: JMPF Ep
6: PUSH Ev (d64)
7:
EndTable

GrpTable: Grp6
0: SLDT Rv/Mw
1: STR Rv/Mw
2: LLDT Ew
3: LTR Ew
4: VERR Ew
5: VERW Ew
EndTable

GrpTable: Grp7
0: SGDT Ms | VMCALL (11B),(001) | VMLAUNCH (11B),(010) | VMRESUME (011),(11B) | VMXOFF (100),(11B)
1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B)
2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B)
3: LIDT Ms
4: SMSW Mw/Rv
5:
6: LMSW Ew
7: INVLPG Mb | SWAPGS (000),(o64),(11B) | RDTSCP (001),(11B)
EndTable

GrpTable: Grp8
4: BT
5: BTS
6: BTR
7: BTC
EndTable

GrpTable: Grp9
1: CMPXCHG8B/16B Mq/Mdq
6: VMPTRLD Mq | VMCLEAR Mq (66) | VMXON Mq (F3)
7: VMPTRST Mq
EndTable

GrpTable: Grp10
EndTable

GrpTable: Grp11
0: MOV
EndTable

GrpTable: Grp12
EndTable

GrpTable: Grp13
EndTable

GrpTable: Grp14
EndTable

GrpTable: Grp15
0: fxsave
1: fxrstor
2: ldmxcsr
3: stmxcsr
4: XSAVE
5: XRSTOR | lfence (11B)
6: mfence (11B)
7: clflush | sfence (11B)
EndTable

GrpTable: Grp16
0: prefetch NTA
1: prefetch T0
2: prefetch T1
3: prefetch T2
EndTable
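
As an illustration of how a group table is meant to be indexed: each GrpTable has up to eight entries, selected by the reg field (bits 5-3) of the ModRM byte that follows the opcode. The sketch below hard-codes Grp1 from the map above. `grp1_names` and `grp1_mnemonic()` are hypothetical names for this note, not output of gen-insn-attr-x86.awk.

```c
#include <stdint.h>

/* Grp1 entries in reg order, copied from the GrpTable above. */
static const char *const grp1_names[8] = {
	"ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP",
};

/* Select the Grp1 mnemonic from the ModRM reg field (bits 5-3). */
static const char *grp1_mnemonic(uint8_t modrm)
{
	return grp1_names[(modrm & 0x38) >> 3];
}
```

For the byte sequence 83 e8 01, the ModRM byte 0xe8 has reg=5, so the lookup yields SUB. With 83 c0 01 it yields ADD.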
