* [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
@ 2015-08-31 13:58 Adrian Hunter
  2015-08-31 13:58 ` [PATCH 1/4] perf tools: Add a test for decoding of " Adrian Hunter
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Adrian Hunter @ 2015-08-31 13:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Hi

perf tools has a copy of the x86 instruction decoder which it uses for
decoding Intel PT.  This patch set adds a perf tools test that uses it
to test new instructions.  Subsequent patches add a few new x86
instructions, or, in the case of MPX, very slightly modify existing
ones.  Those changes affect both perf tools and x86/insn.

I suggest Arnaldo take all these patches, as they mainly affect
perf tools, at least in terms of lines of code.


Adrian Hunter (4):
      perf tools: Add a test for decoding of new x86 instructions
      x86/insn: perf tools: Pedantically tweak opcode map for MPX instructions
      x86/insn: perf tools: Add new SHA instructions
      x86/insn: perf tools: Add new memory instructions

 arch/x86/lib/x86-opcode-map.txt                    |  19 +-
 tools/perf/tests/Build                             |   3 +
 tools/perf/tests/builtin-test.c                    |   8 +
 tools/perf/tests/gen-insn-x86-dat.awk              |  75 ++
 tools/perf/tests/gen-insn-x86-dat.sh               |  43 ++
 tools/perf/tests/insn-x86-dat-32.c                 | 640 ++++++++++++++++
 tools/perf/tests/insn-x86-dat-64.c                 | 738 ++++++++++++++++++
 tools/perf/tests/insn-x86-dat-src.c                | 835 +++++++++++++++++++++
 tools/perf/tests/insn-x86.c                        | 180 +++++
 tools/perf/tests/tests.h                           |   1 +
 .../perf/util/intel-pt-decoder/x86-opcode-map.txt  |  19 +-
 11 files changed, 2553 insertions(+), 8 deletions(-)
 create mode 100644 tools/perf/tests/gen-insn-x86-dat.awk
 create mode 100755 tools/perf/tests/gen-insn-x86-dat.sh
 create mode 100644 tools/perf/tests/insn-x86-dat-32.c
 create mode 100644 tools/perf/tests/insn-x86-dat-64.c
 create mode 100644 tools/perf/tests/insn-x86-dat-src.c
 create mode 100644 tools/perf/tests/insn-x86.c


Regards
Adrian


* [PATCH 1/4] perf tools: Add a test for decoding of new x86 instructions
  2015-08-31 13:58 [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Adrian Hunter
@ 2015-08-31 13:58 ` Adrian Hunter
  2015-09-01  0:18   ` 平松雅巳 / HIRAMATU,MASAMI
  2015-08-31 13:58 ` [PATCH 2/4] x86/insn: perf tools: Pedantically tweak opcode map for MPX instructions Adrian Hunter
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Adrian Hunter @ 2015-08-31 13:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Add a new test titled:

	Test x86 instruction decoder - new instructions

The purpose of this test is to check the instruction decoder
after new instructions have been added.  Initially, the test
covers MPX instructions, which are already supported, but whose
definitions in x86-opcode-map.txt will be tweaked in a
subsequent patch, after which this test can be run to verify
those changes.

The data for the test comes from assembly language instructions
in insn-x86-dat-src.c which are converted into bytes by the
scripts gen-insn-x86-dat.sh and gen-insn-x86-dat.awk, and
included into the test program insn-x86.c as insn-x86-dat-32.c
and insn-x86-dat-64.c.  The conversion is not done as part of
the perf tools build because the test data must be under (git)
change control in order for the test to be repeatably correct.
Also, the conversion may require a recent version of binutils.
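
For reference, each generated entry has the form:

	{{<bytes>, }, <len>, <rel>, "<op>", "<branch>", "<objdump line>",},

so insn-x86.c can consume the data with a structure along the lines
of the sketch below.  This is a minimal illustration only; the field
names are illustrative rather than necessarily those used in
insn-x86.c, and u8 / MAX_INSN_SIZE are assumed to come from the
decoder headers:

	struct test_data {
		u8 data[MAX_INSN_SIZE];		/* instruction bytes */
		int expected_length;		/* number of bytes */
		int expected_rel;		/* relative offset, for branches */
		const char *expected_op;	/* "call", "ret", "jmp", "jcc" or "" */
		const char *expected_branch;	/* "indirect", "conditional", ... */
		const char *asm_rep;		/* objdump text, for diagnostics */
	};

	static struct test_data test_data_64[] = {
	#include "insn-x86-dat-64.c"
		{{0}, 0, 0, NULL, NULL, NULL},
	};

The test can then decode each entry's bytes with the instruction
decoder and compare the decoded length (and, for branches, the
op/branch classification) against the expected values.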

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/tests/Build                |   3 +
 tools/perf/tests/builtin-test.c       |   8 +
 tools/perf/tests/gen-insn-x86-dat.awk |  75 ++++++
 tools/perf/tests/gen-insn-x86-dat.sh  |  43 ++++
 tools/perf/tests/insn-x86-dat-32.c    | 324 ++++++++++++++++++++++++++
 tools/perf/tests/insn-x86-dat-64.c    | 340 +++++++++++++++++++++++++++
 tools/perf/tests/insn-x86-dat-src.c   | 416 ++++++++++++++++++++++++++++++++++
 tools/perf/tests/insn-x86.c           | 180 +++++++++++++++
 tools/perf/tests/tests.h              |   1 +
 9 files changed, 1390 insertions(+)
 create mode 100644 tools/perf/tests/gen-insn-x86-dat.awk
 create mode 100755 tools/perf/tests/gen-insn-x86-dat.sh
 create mode 100644 tools/perf/tests/insn-x86-dat-32.c
 create mode 100644 tools/perf/tests/insn-x86-dat-64.c
 create mode 100644 tools/perf/tests/insn-x86-dat-src.c
 create mode 100644 tools/perf/tests/insn-x86.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index c1518bdd0f1b..51fb737f82fc 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -35,6 +35,9 @@ perf-y += thread-map.o
 perf-y += llvm.o
 
 perf-$(CONFIG_X86) += perf-time-to-tsc.o
+ifdef CONFIG_AUXTRACE
+perf-$(CONFIG_X86) += insn-x86.o
+endif
 
 ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
 perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 136cd934be66..69a77f71d594 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -178,6 +178,14 @@ static struct test {
 		.desc = "Test LLVM searching and compiling",
 		.func = test__llvm,
 	},
+#ifdef HAVE_AUXTRACE_SUPPORT
+#if defined(__x86_64__) || defined(__i386__)
+	{
+		.desc = "Test x86 instruction decoder - new instructions",
+		.func = test__insn_x86,
+	},
+#endif
+#endif
 	{
 		.func = NULL,
 	},
diff --git a/tools/perf/tests/gen-insn-x86-dat.awk b/tools/perf/tests/gen-insn-x86-dat.awk
new file mode 100644
index 000000000000..a21454835cd4
--- /dev/null
+++ b/tools/perf/tests/gen-insn-x86-dat.awk
@@ -0,0 +1,75 @@
+#!/bin/awk -f
+# gen-insn-x86-dat.awk: script to convert data for the insn-x86 test
+# Copyright (c) 2015, Intel Corporation.
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+# This program is distributed in the hope it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+
+BEGIN {
+	print "/*"
+	print " * Generated by gen-insn-x86-dat.sh and gen-insn-x86-dat.awk"
+	print " * from insn-x86-dat-src.c for inclusion by insn-x86.c"
+	print " * Do not change this code."
+	print "*/\n"
+	op = ""
+	branch = ""
+	rel = 0
+	going = 0
+}
+
+/ Start here / {
+	going = 1
+}
+
+/ Stop here / {
+	going = 0
+}
+
+/^\s*[0-9a-fA-F]+\:/ {
+	if (going) {
+		colon_pos = index($0, ":")
+		useful_line = substr($0, colon_pos + 1)
+		first_pos = match(useful_line, "[0-9a-fA-F]")
+		useful_line = substr(useful_line, first_pos)
+		gsub("\t", "\\t", useful_line)
+		printf "{{"
+		len = 0
+		for (i = 2; i <= NF; i++) {
+			if (match($i, "^[0-9a-fA-F][0-9a-fA-F]$")) {
+				printf "0x%s, ", $i
+				len += 1
+			} else {
+				break
+			}
+		}
+		printf "}, %d, %s, \"%s\", \"%s\",", len, rel, op, branch
+		printf "\n\"%s\",},\n", useful_line
+		op = ""
+		branch = ""
+		rel = 0
+	}
+}
+
+/ Expecting: / {
+	expecting_str = " Expecting: "
+	expecting_len = length(expecting_str)
+	expecting_pos = index($0, expecting_str)
+	useful_line = substr($0, expecting_pos + expecting_len)
+	for (i = 1; i <= NF; i++) {
+		if ($i == "Expecting:") {
+			i++
+			op = $i
+			i++
+			branch = $i
+			i++
+			rel = $i
+			break
+		}
+	}
+}
diff --git a/tools/perf/tests/gen-insn-x86-dat.sh b/tools/perf/tests/gen-insn-x86-dat.sh
new file mode 100755
index 000000000000..2d4ef94cff98
--- /dev/null
+++ b/tools/perf/tests/gen-insn-x86-dat.sh
@@ -0,0 +1,43 @@
+#!/bin/sh
+# gen-insn-x86-dat: generate data for the insn-x86 test
+# Copyright (c) 2015, Intel Corporation.
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+# This program is distributed in the hope it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+
+set -e
+
+if [ "$(uname -m)" != "x86_64" ]; then
+	echo "ERROR: This script only works on x86_64"
+	exit 1
+fi
+
+cd $(dirname $0)
+
+trap 'echo "Might need a more recent version of binutils"' EXIT
+
+echo "Compiling insn-x86-dat-src.c to 64-bit object"
+
+gcc -g -c insn-x86-dat-src.c
+
+objdump -dSw insn-x86-dat-src.o | awk -f gen-insn-x86-dat.awk > insn-x86-dat-64.c
+
+rm -f insn-x86-dat-src.o
+
+echo "Compiling insn-x86-dat-src.c to 32-bit object"
+
+gcc -g -c -m32 insn-x86-dat-src.c
+
+objdump -dSw insn-x86-dat-src.o | awk -f gen-insn-x86-dat.awk > insn-x86-dat-32.c
+
+rm -f insn-x86-dat-src.o
+
+trap - EXIT
+
+echo "Done (use git diff to see the changes)"
diff --git a/tools/perf/tests/insn-x86-dat-32.c b/tools/perf/tests/insn-x86-dat-32.c
new file mode 100644
index 000000000000..6a38a34a5a49
--- /dev/null
+++ b/tools/perf/tests/insn-x86-dat-32.c
@@ -0,0 +1,324 @@
+/*
+ * Generated by gen-insn-x86-dat.sh and gen-insn-x86-dat.awk
+ * from insn-x86-dat-src.c for inclusion by insn-x86.c
+ * Do not change this code.
+*/
+
+{{0x0f, 0x31, }, 2, 0, "", "",
+"0f 31                \trdtsc  ",},
+{{0xf3, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
+"f3 0f 1b 00          \tbndmk  (%eax),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1b 05 78 56 34 12 \tbndmk  0x12345678,%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
+"f3 0f 1b 18          \tbndmk  (%eax),%bnd3",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
+"f3 0f 1b 04 01       \tbndmk  (%ecx,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 04 05 78 56 34 12 \tbndmk  0x12345678(,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
+"f3 0f 1b 04 08       \tbndmk  (%eax,%ecx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
+"f3 0f 1b 04 c8       \tbndmk  (%eax,%ecx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
+"f3 0f 1b 40 12       \tbndmk  0x12(%eax),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
+"f3 0f 1b 45 12       \tbndmk  0x12(%ebp),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"f3 0f 1b 44 01 12    \tbndmk  0x12(%ecx,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"f3 0f 1b 44 05 12    \tbndmk  0x12(%ebp,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"f3 0f 1b 44 08 12    \tbndmk  0x12(%eax,%ecx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"f3 0f 1b 44 c8 12    \tbndmk  0x12(%eax,%ecx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1b 80 78 56 34 12 \tbndmk  0x12345678(%eax),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1b 85 78 56 34 12 \tbndmk  0x12345678(%ebp),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 84 01 78 56 34 12 \tbndmk  0x12345678(%ecx,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 84 05 78 56 34 12 \tbndmk  0x12345678(%ebp,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 84 08 78 56 34 12 \tbndmk  0x12345678(%eax,%ecx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 84 c8 78 56 34 12 \tbndmk  0x12345678(%eax,%ecx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
+"f3 0f 1a 00          \tbndcl  (%eax),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1a 05 78 56 34 12 \tbndcl  0x12345678,%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
+"f3 0f 1a 18          \tbndcl  (%eax),%bnd3",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
+"f3 0f 1a 04 01       \tbndcl  (%ecx,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 04 05 78 56 34 12 \tbndcl  0x12345678(,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
+"f3 0f 1a 04 08       \tbndcl  (%eax,%ecx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
+"f3 0f 1a 04 c8       \tbndcl  (%eax,%ecx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
+"f3 0f 1a 40 12       \tbndcl  0x12(%eax),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
+"f3 0f 1a 45 12       \tbndcl  0x12(%ebp),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"f3 0f 1a 44 01 12    \tbndcl  0x12(%ecx,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"f3 0f 1a 44 05 12    \tbndcl  0x12(%ebp,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"f3 0f 1a 44 08 12    \tbndcl  0x12(%eax,%ecx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"f3 0f 1a 44 c8 12    \tbndcl  0x12(%eax,%ecx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1a 80 78 56 34 12 \tbndcl  0x12345678(%eax),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1a 85 78 56 34 12 \tbndcl  0x12345678(%ebp),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 84 01 78 56 34 12 \tbndcl  0x12345678(%ecx,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 84 05 78 56 34 12 \tbndcl  0x12345678(%ebp,%eax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 84 08 78 56 34 12 \tbndcl  0x12345678(%eax,%ecx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 84 c8 78 56 34 12 \tbndcl  0x12345678(%eax,%ecx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0xc0, }, 4, 0, "", "",
+"f3 0f 1a c0          \tbndcl  %eax,%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
+"f2 0f 1a 00          \tbndcu  (%eax),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1a 05 78 56 34 12 \tbndcu  0x12345678,%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
+"f2 0f 1a 18          \tbndcu  (%eax),%bnd3",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
+"f2 0f 1a 04 01       \tbndcu  (%ecx,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 04 05 78 56 34 12 \tbndcu  0x12345678(,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
+"f2 0f 1a 04 08       \tbndcu  (%eax,%ecx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
+"f2 0f 1a 04 c8       \tbndcu  (%eax,%ecx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
+"f2 0f 1a 40 12       \tbndcu  0x12(%eax),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
+"f2 0f 1a 45 12       \tbndcu  0x12(%ebp),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"f2 0f 1a 44 01 12    \tbndcu  0x12(%ecx,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"f2 0f 1a 44 05 12    \tbndcu  0x12(%ebp,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"f2 0f 1a 44 08 12    \tbndcu  0x12(%eax,%ecx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"f2 0f 1a 44 c8 12    \tbndcu  0x12(%eax,%ecx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1a 80 78 56 34 12 \tbndcu  0x12345678(%eax),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1a 85 78 56 34 12 \tbndcu  0x12345678(%ebp),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 84 01 78 56 34 12 \tbndcu  0x12345678(%ecx,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 84 05 78 56 34 12 \tbndcu  0x12345678(%ebp,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 84 08 78 56 34 12 \tbndcu  0x12345678(%eax,%ecx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 84 c8 78 56 34 12 \tbndcu  0x12345678(%eax,%ecx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0xc0, }, 4, 0, "", "",
+"f2 0f 1a c0          \tbndcu  %eax,%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
+"f2 0f 1b 00          \tbndcn  (%eax),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1b 05 78 56 34 12 \tbndcn  0x12345678,%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
+"f2 0f 1b 18          \tbndcn  (%eax),%bnd3",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
+"f2 0f 1b 04 01       \tbndcn  (%ecx,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 04 05 78 56 34 12 \tbndcn  0x12345678(,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
+"f2 0f 1b 04 08       \tbndcn  (%eax,%ecx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
+"f2 0f 1b 04 c8       \tbndcn  (%eax,%ecx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
+"f2 0f 1b 40 12       \tbndcn  0x12(%eax),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
+"f2 0f 1b 45 12       \tbndcn  0x12(%ebp),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"f2 0f 1b 44 01 12    \tbndcn  0x12(%ecx,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"f2 0f 1b 44 05 12    \tbndcn  0x12(%ebp,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"f2 0f 1b 44 08 12    \tbndcn  0x12(%eax,%ecx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"f2 0f 1b 44 c8 12    \tbndcn  0x12(%eax,%ecx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1b 80 78 56 34 12 \tbndcn  0x12345678(%eax),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1b 85 78 56 34 12 \tbndcn  0x12345678(%ebp),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 84 01 78 56 34 12 \tbndcn  0x12345678(%ecx,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 84 05 78 56 34 12 \tbndcn  0x12345678(%ebp,%eax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 84 08 78 56 34 12 \tbndcn  0x12345678(%eax,%ecx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 84 c8 78 56 34 12 \tbndcn  0x12345678(%eax,%ecx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0xc0, }, 4, 0, "", "",
+"f2 0f 1b c0          \tbndcn  %eax,%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
+"66 0f 1a 00          \tbndmov (%eax),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1a 05 78 56 34 12 \tbndmov 0x12345678,%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
+"66 0f 1a 18          \tbndmov (%eax),%bnd3",},
+{{0x66, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
+"66 0f 1a 04 01       \tbndmov (%ecx,%eax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 04 05 78 56 34 12 \tbndmov 0x12345678(,%eax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
+"66 0f 1a 04 08       \tbndmov (%eax,%ecx,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
+"66 0f 1a 04 c8       \tbndmov (%eax,%ecx,8),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
+"66 0f 1a 40 12       \tbndmov 0x12(%eax),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
+"66 0f 1a 45 12       \tbndmov 0x12(%ebp),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"66 0f 1a 44 01 12    \tbndmov 0x12(%ecx,%eax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"66 0f 1a 44 05 12    \tbndmov 0x12(%ebp,%eax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"66 0f 1a 44 08 12    \tbndmov 0x12(%eax,%ecx,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"66 0f 1a 44 c8 12    \tbndmov 0x12(%eax,%ecx,8),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1a 80 78 56 34 12 \tbndmov 0x12345678(%eax),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1a 85 78 56 34 12 \tbndmov 0x12345678(%ebp),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 84 01 78 56 34 12 \tbndmov 0x12345678(%ecx,%eax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 84 05 78 56 34 12 \tbndmov 0x12345678(%ebp,%eax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 84 08 78 56 34 12 \tbndmov 0x12345678(%eax,%ecx,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 84 c8 78 56 34 12 \tbndmov 0x12345678(%eax,%ecx,8),%bnd0",},
+{{0x66, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
+"66 0f 1b 00          \tbndmov %bnd0,(%eax)",},
+{{0x66, 0x0f, 0x1b, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1b 05 78 56 34 12 \tbndmov %bnd0,0x12345678",},
+{{0x66, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
+"66 0f 1b 18          \tbndmov %bnd3,(%eax)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
+"66 0f 1b 04 01       \tbndmov %bnd0,(%ecx,%eax,1)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 04 05 78 56 34 12 \tbndmov %bnd0,0x12345678(,%eax,1)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
+"66 0f 1b 04 08       \tbndmov %bnd0,(%eax,%ecx,1)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
+"66 0f 1b 04 c8       \tbndmov %bnd0,(%eax,%ecx,8)",},
+{{0x66, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
+"66 0f 1b 40 12       \tbndmov %bnd0,0x12(%eax)",},
+{{0x66, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
+"66 0f 1b 45 12       \tbndmov %bnd0,0x12(%ebp)",},
+{{0x66, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"66 0f 1b 44 01 12    \tbndmov %bnd0,0x12(%ecx,%eax,1)",},
+{{0x66, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"66 0f 1b 44 05 12    \tbndmov %bnd0,0x12(%ebp,%eax,1)",},
+{{0x66, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"66 0f 1b 44 08 12    \tbndmov %bnd0,0x12(%eax,%ecx,1)",},
+{{0x66, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"66 0f 1b 44 c8 12    \tbndmov %bnd0,0x12(%eax,%ecx,8)",},
+{{0x66, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1b 80 78 56 34 12 \tbndmov %bnd0,0x12345678(%eax)",},
+{{0x66, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1b 85 78 56 34 12 \tbndmov %bnd0,0x12345678(%ebp)",},
+{{0x66, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 84 01 78 56 34 12 \tbndmov %bnd0,0x12345678(%ecx,%eax,1)",},
+{{0x66, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 84 05 78 56 34 12 \tbndmov %bnd0,0x12345678(%ebp,%eax,1)",},
+{{0x66, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 84 08 78 56 34 12 \tbndmov %bnd0,0x12345678(%eax,%ecx,1)",},
+{{0x66, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 84 c8 78 56 34 12 \tbndmov %bnd0,0x12345678(%eax,%ecx,8)",},
+{{0x66, 0x0f, 0x1a, 0xc8, }, 4, 0, "", "",
+"66 0f 1a c8          \tbndmov %bnd0,%bnd1",},
+{{0x66, 0x0f, 0x1a, 0xc1, }, 4, 0, "", "",
+"66 0f 1a c1          \tbndmov %bnd1,%bnd0",},
+{{0x0f, 0x1a, 0x00, }, 3, 0, "", "",
+"0f 1a 00             \tbndldx (%eax),%bnd0",},
+{{0x0f, 0x1a, 0x05, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1a 05 78 56 34 12 \tbndldx 0x12345678,%bnd0",},
+{{0x0f, 0x1a, 0x18, }, 3, 0, "", "",
+"0f 1a 18             \tbndldx (%eax),%bnd3",},
+{{0x0f, 0x1a, 0x04, 0x01, }, 4, 0, "", "",
+"0f 1a 04 01          \tbndldx (%ecx,%eax,1),%bnd0",},
+{{0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 04 05 78 56 34 12 \tbndldx 0x12345678(,%eax,1),%bnd0",},
+{{0x0f, 0x1a, 0x04, 0x08, }, 4, 0, "", "",
+"0f 1a 04 08          \tbndldx (%eax,%ecx,1),%bnd0",},
+{{0x0f, 0x1a, 0x40, 0x12, }, 4, 0, "", "",
+"0f 1a 40 12          \tbndldx 0x12(%eax),%bnd0",},
+{{0x0f, 0x1a, 0x45, 0x12, }, 4, 0, "", "",
+"0f 1a 45 12          \tbndldx 0x12(%ebp),%bnd0",},
+{{0x0f, 0x1a, 0x44, 0x01, 0x12, }, 5, 0, "", "",
+"0f 1a 44 01 12       \tbndldx 0x12(%ecx,%eax,1),%bnd0",},
+{{0x0f, 0x1a, 0x44, 0x05, 0x12, }, 5, 0, "", "",
+"0f 1a 44 05 12       \tbndldx 0x12(%ebp,%eax,1),%bnd0",},
+{{0x0f, 0x1a, 0x44, 0x08, 0x12, }, 5, 0, "", "",
+"0f 1a 44 08 12       \tbndldx 0x12(%eax,%ecx,1),%bnd0",},
+{{0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1a 80 78 56 34 12 \tbndldx 0x12345678(%eax),%bnd0",},
+{{0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1a 85 78 56 34 12 \tbndldx 0x12345678(%ebp),%bnd0",},
+{{0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 84 01 78 56 34 12 \tbndldx 0x12345678(%ecx,%eax,1),%bnd0",},
+{{0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 84 05 78 56 34 12 \tbndldx 0x12345678(%ebp,%eax,1),%bnd0",},
+{{0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 84 08 78 56 34 12 \tbndldx 0x12345678(%eax,%ecx,1),%bnd0",},
+{{0x0f, 0x1b, 0x00, }, 3, 0, "", "",
+"0f 1b 00             \tbndstx %bnd0,(%eax)",},
+{{0x0f, 0x1b, 0x05, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1b 05 78 56 34 12 \tbndstx %bnd0,0x12345678",},
+{{0x0f, 0x1b, 0x18, }, 3, 0, "", "",
+"0f 1b 18             \tbndstx %bnd3,(%eax)",},
+{{0x0f, 0x1b, 0x04, 0x01, }, 4, 0, "", "",
+"0f 1b 04 01          \tbndstx %bnd0,(%ecx,%eax,1)",},
+{{0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 04 05 78 56 34 12 \tbndstx %bnd0,0x12345678(,%eax,1)",},
+{{0x0f, 0x1b, 0x04, 0x08, }, 4, 0, "", "",
+"0f 1b 04 08          \tbndstx %bnd0,(%eax,%ecx,1)",},
+{{0x0f, 0x1b, 0x40, 0x12, }, 4, 0, "", "",
+"0f 1b 40 12          \tbndstx %bnd0,0x12(%eax)",},
+{{0x0f, 0x1b, 0x45, 0x12, }, 4, 0, "", "",
+"0f 1b 45 12          \tbndstx %bnd0,0x12(%ebp)",},
+{{0x0f, 0x1b, 0x44, 0x01, 0x12, }, 5, 0, "", "",
+"0f 1b 44 01 12       \tbndstx %bnd0,0x12(%ecx,%eax,1)",},
+{{0x0f, 0x1b, 0x44, 0x05, 0x12, }, 5, 0, "", "",
+"0f 1b 44 05 12       \tbndstx %bnd0,0x12(%ebp,%eax,1)",},
+{{0x0f, 0x1b, 0x44, 0x08, 0x12, }, 5, 0, "", "",
+"0f 1b 44 08 12       \tbndstx %bnd0,0x12(%eax,%ecx,1)",},
+{{0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1b 80 78 56 34 12 \tbndstx %bnd0,0x12345678(%eax)",},
+{{0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1b 85 78 56 34 12 \tbndstx %bnd0,0x12345678(%ebp)",},
+{{0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 84 01 78 56 34 12 \tbndstx %bnd0,0x12345678(%ecx,%eax,1)",},
+{{0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 84 05 78 56 34 12 \tbndstx %bnd0,0x12345678(%ebp,%eax,1)",},
+{{0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 84 08 78 56 34 12 \tbndstx %bnd0,0x12345678(%eax,%ecx,1)",},
+{{0xf2, 0xe8, 0xfc, 0xff, 0xff, 0xff, }, 6, 0xfffffffc, "call", "unconditional",
+"f2 e8 fc ff ff ff    \tbnd call 3c3 <main+0x3c3>",},
+{{0xf2, 0xff, 0x10, }, 3, 0, "call", "indirect",
+"f2 ff 10             \tbnd call *(%eax)",},
+{{0xf2, 0xc3, }, 2, 0, "ret", "indirect",
+"f2 c3                \tbnd ret ",},
+{{0xf2, 0xe9, 0xfc, 0xff, 0xff, 0xff, }, 6, 0xfffffffc, "jmp", "unconditional",
+"f2 e9 fc ff ff ff    \tbnd jmp 3ce <main+0x3ce>",},
+{{0xf2, 0xe9, 0xfc, 0xff, 0xff, 0xff, }, 6, 0xfffffffc, "jmp", "unconditional",
+"f2 e9 fc ff ff ff    \tbnd jmp 3d4 <main+0x3d4>",},
+{{0xf2, 0xff, 0x21, }, 3, 0, "jmp", "indirect",
+"f2 ff 21             \tbnd jmp *(%ecx)",},
+{{0xf2, 0x0f, 0x85, 0xfc, 0xff, 0xff, 0xff, }, 7, 0xfffffffc, "jcc", "conditional",
+"f2 0f 85 fc ff ff ff \tbnd jne 3de <main+0x3de>",},
diff --git a/tools/perf/tests/insn-x86-dat-64.c b/tools/perf/tests/insn-x86-dat-64.c
new file mode 100644
index 000000000000..01122421a776
--- /dev/null
+++ b/tools/perf/tests/insn-x86-dat-64.c
@@ -0,0 +1,340 @@
+/*
+ * Generated by gen-insn-x86-dat.sh and gen-insn-x86-dat.awk
+ * from insn-x86-dat-src.c for inclusion by insn-x86.c
+ * Do not change this code.
+*/
+
+{{0x0f, 0x31, }, 2, 0, "", "",
+"0f 31                \trdtsc  ",},
+{{0xf3, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
+"f3 0f 1b 00          \tbndmk  (%rax),%bnd0",},
+{{0xf3, 0x41, 0x0f, 0x1b, 0x00, }, 5, 0, "", "",
+"f3 41 0f 1b 00       \tbndmk  (%r8),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 04 25 78 56 34 12 \tbndmk  0x12345678,%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
+"f3 0f 1b 18          \tbndmk  (%rax),%bnd3",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
+"f3 0f 1b 04 01       \tbndmk  (%rcx,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 04 05 78 56 34 12 \tbndmk  0x12345678(,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
+"f3 0f 1b 04 08       \tbndmk  (%rax,%rcx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
+"f3 0f 1b 04 c8       \tbndmk  (%rax,%rcx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
+"f3 0f 1b 40 12       \tbndmk  0x12(%rax),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
+"f3 0f 1b 45 12       \tbndmk  0x12(%rbp),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"f3 0f 1b 44 01 12    \tbndmk  0x12(%rcx,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"f3 0f 1b 44 05 12    \tbndmk  0x12(%rbp,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"f3 0f 1b 44 08 12    \tbndmk  0x12(%rax,%rcx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"f3 0f 1b 44 c8 12    \tbndmk  0x12(%rax,%rcx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1b 80 78 56 34 12 \tbndmk  0x12345678(%rax),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1b 85 78 56 34 12 \tbndmk  0x12345678(%rbp),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 84 01 78 56 34 12 \tbndmk  0x12345678(%rcx,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 84 05 78 56 34 12 \tbndmk  0x12345678(%rbp,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 84 08 78 56 34 12 \tbndmk  0x12345678(%rax,%rcx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1b 84 c8 78 56 34 12 \tbndmk  0x12345678(%rax,%rcx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
+"f3 0f 1a 00          \tbndcl  (%rax),%bnd0",},
+{{0xf3, 0x41, 0x0f, 0x1a, 0x00, }, 5, 0, "", "",
+"f3 41 0f 1a 00       \tbndcl  (%r8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 04 25 78 56 34 12 \tbndcl  0x12345678,%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
+"f3 0f 1a 18          \tbndcl  (%rax),%bnd3",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
+"f3 0f 1a 04 01       \tbndcl  (%rcx,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 04 05 78 56 34 12 \tbndcl  0x12345678(,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
+"f3 0f 1a 04 08       \tbndcl  (%rax,%rcx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
+"f3 0f 1a 04 c8       \tbndcl  (%rax,%rcx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
+"f3 0f 1a 40 12       \tbndcl  0x12(%rax),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
+"f3 0f 1a 45 12       \tbndcl  0x12(%rbp),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"f3 0f 1a 44 01 12    \tbndcl  0x12(%rcx,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"f3 0f 1a 44 05 12    \tbndcl  0x12(%rbp,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"f3 0f 1a 44 08 12    \tbndcl  0x12(%rax,%rcx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"f3 0f 1a 44 c8 12    \tbndcl  0x12(%rax,%rcx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1a 80 78 56 34 12 \tbndcl  0x12345678(%rax),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f3 0f 1a 85 78 56 34 12 \tbndcl  0x12345678(%rbp),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 84 01 78 56 34 12 \tbndcl  0x12345678(%rcx,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 84 05 78 56 34 12 \tbndcl  0x12345678(%rbp,%rax,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 84 08 78 56 34 12 \tbndcl  0x12345678(%rax,%rcx,1),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f3 0f 1a 84 c8 78 56 34 12 \tbndcl  0x12345678(%rax,%rcx,8),%bnd0",},
+{{0xf3, 0x0f, 0x1a, 0xc0, }, 4, 0, "", "",
+"f3 0f 1a c0          \tbndcl  %rax,%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
+"f2 0f 1a 00          \tbndcu  (%rax),%bnd0",},
+{{0xf2, 0x41, 0x0f, 0x1a, 0x00, }, 5, 0, "", "",
+"f2 41 0f 1a 00       \tbndcu  (%r8),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 04 25 78 56 34 12 \tbndcu  0x12345678,%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
+"f2 0f 1a 18          \tbndcu  (%rax),%bnd3",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
+"f2 0f 1a 04 01       \tbndcu  (%rcx,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 04 05 78 56 34 12 \tbndcu  0x12345678(,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
+"f2 0f 1a 04 08       \tbndcu  (%rax,%rcx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
+"f2 0f 1a 04 c8       \tbndcu  (%rax,%rcx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
+"f2 0f 1a 40 12       \tbndcu  0x12(%rax),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
+"f2 0f 1a 45 12       \tbndcu  0x12(%rbp),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"f2 0f 1a 44 01 12    \tbndcu  0x12(%rcx,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"f2 0f 1a 44 05 12    \tbndcu  0x12(%rbp,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"f2 0f 1a 44 08 12    \tbndcu  0x12(%rax,%rcx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"f2 0f 1a 44 c8 12    \tbndcu  0x12(%rax,%rcx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1a 80 78 56 34 12 \tbndcu  0x12345678(%rax),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1a 85 78 56 34 12 \tbndcu  0x12345678(%rbp),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 84 01 78 56 34 12 \tbndcu  0x12345678(%rcx,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 84 05 78 56 34 12 \tbndcu  0x12345678(%rbp,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 84 08 78 56 34 12 \tbndcu  0x12345678(%rax,%rcx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1a 84 c8 78 56 34 12 \tbndcu  0x12345678(%rax,%rcx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1a, 0xc0, }, 4, 0, "", "",
+"f2 0f 1a c0          \tbndcu  %rax,%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
+"f2 0f 1b 00          \tbndcn  (%rax),%bnd0",},
+{{0xf2, 0x41, 0x0f, 0x1b, 0x00, }, 5, 0, "", "",
+"f2 41 0f 1b 00       \tbndcn  (%r8),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 04 25 78 56 34 12 \tbndcn  0x12345678,%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
+"f2 0f 1b 18          \tbndcn  (%rax),%bnd3",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
+"f2 0f 1b 04 01       \tbndcn  (%rcx,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 04 05 78 56 34 12 \tbndcn  0x12345678(,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
+"f2 0f 1b 04 08       \tbndcn  (%rax,%rcx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
+"f2 0f 1b 04 c8       \tbndcn  (%rax,%rcx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
+"f2 0f 1b 40 12       \tbndcn  0x12(%rax),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
+"f2 0f 1b 45 12       \tbndcn  0x12(%rbp),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"f2 0f 1b 44 01 12    \tbndcn  0x12(%rcx,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"f2 0f 1b 44 05 12    \tbndcn  0x12(%rbp,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"f2 0f 1b 44 08 12    \tbndcn  0x12(%rax,%rcx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"f2 0f 1b 44 c8 12    \tbndcn  0x12(%rax,%rcx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1b 80 78 56 34 12 \tbndcn  0x12345678(%rax),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"f2 0f 1b 85 78 56 34 12 \tbndcn  0x12345678(%rbp),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 84 01 78 56 34 12 \tbndcn  0x12345678(%rcx,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 84 05 78 56 34 12 \tbndcn  0x12345678(%rbp,%rax,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 84 08 78 56 34 12 \tbndcn  0x12345678(%rax,%rcx,1),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"f2 0f 1b 84 c8 78 56 34 12 \tbndcn  0x12345678(%rax,%rcx,8),%bnd0",},
+{{0xf2, 0x0f, 0x1b, 0xc0, }, 4, 0, "", "",
+"f2 0f 1b c0          \tbndcn  %rax,%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
+"66 0f 1a 00          \tbndmov (%rax),%bnd0",},
+{{0x66, 0x41, 0x0f, 0x1a, 0x00, }, 5, 0, "", "",
+"66 41 0f 1a 00       \tbndmov (%r8),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 04 25 78 56 34 12 \tbndmov 0x12345678,%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
+"66 0f 1a 18          \tbndmov (%rax),%bnd3",},
+{{0x66, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
+"66 0f 1a 04 01       \tbndmov (%rcx,%rax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 04 05 78 56 34 12 \tbndmov 0x12345678(,%rax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
+"66 0f 1a 04 08       \tbndmov (%rax,%rcx,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
+"66 0f 1a 04 c8       \tbndmov (%rax,%rcx,8),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
+"66 0f 1a 40 12       \tbndmov 0x12(%rax),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
+"66 0f 1a 45 12       \tbndmov 0x12(%rbp),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"66 0f 1a 44 01 12    \tbndmov 0x12(%rcx,%rax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"66 0f 1a 44 05 12    \tbndmov 0x12(%rbp,%rax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"66 0f 1a 44 08 12    \tbndmov 0x12(%rax,%rcx,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"66 0f 1a 44 c8 12    \tbndmov 0x12(%rax,%rcx,8),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1a 80 78 56 34 12 \tbndmov 0x12345678(%rax),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1a 85 78 56 34 12 \tbndmov 0x12345678(%rbp),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 84 01 78 56 34 12 \tbndmov 0x12345678(%rcx,%rax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 84 05 78 56 34 12 \tbndmov 0x12345678(%rbp,%rax,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 84 08 78 56 34 12 \tbndmov 0x12345678(%rax,%rcx,1),%bnd0",},
+{{0x66, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1a 84 c8 78 56 34 12 \tbndmov 0x12345678(%rax,%rcx,8),%bnd0",},
+{{0x66, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
+"66 0f 1b 00          \tbndmov %bnd0,(%rax)",},
+{{0x66, 0x41, 0x0f, 0x1b, 0x00, }, 5, 0, "", "",
+"66 41 0f 1b 00       \tbndmov %bnd0,(%r8)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 04 25 78 56 34 12 \tbndmov %bnd0,0x12345678",},
+{{0x66, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
+"66 0f 1b 18          \tbndmov %bnd3,(%rax)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
+"66 0f 1b 04 01       \tbndmov %bnd0,(%rcx,%rax,1)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 04 05 78 56 34 12 \tbndmov %bnd0,0x12345678(,%rax,1)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
+"66 0f 1b 04 08       \tbndmov %bnd0,(%rax,%rcx,1)",},
+{{0x66, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
+"66 0f 1b 04 c8       \tbndmov %bnd0,(%rax,%rcx,8)",},
+{{0x66, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
+"66 0f 1b 40 12       \tbndmov %bnd0,0x12(%rax)",},
+{{0x66, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
+"66 0f 1b 45 12       \tbndmov %bnd0,0x12(%rbp)",},
+{{0x66, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"66 0f 1b 44 01 12    \tbndmov %bnd0,0x12(%rcx,%rax,1)",},
+{{0x66, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"66 0f 1b 44 05 12    \tbndmov %bnd0,0x12(%rbp,%rax,1)",},
+{{0x66, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"66 0f 1b 44 08 12    \tbndmov %bnd0,0x12(%rax,%rcx,1)",},
+{{0x66, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"66 0f 1b 44 c8 12    \tbndmov %bnd0,0x12(%rax,%rcx,8)",},
+{{0x66, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1b 80 78 56 34 12 \tbndmov %bnd0,0x12345678(%rax)",},
+{{0x66, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f 1b 85 78 56 34 12 \tbndmov %bnd0,0x12345678(%rbp)",},
+{{0x66, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 84 01 78 56 34 12 \tbndmov %bnd0,0x12345678(%rcx,%rax,1)",},
+{{0x66, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 84 05 78 56 34 12 \tbndmov %bnd0,0x12345678(%rbp,%rax,1)",},
+{{0x66, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 84 08 78 56 34 12 \tbndmov %bnd0,0x12345678(%rax,%rcx,1)",},
+{{0x66, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f 1b 84 c8 78 56 34 12 \tbndmov %bnd0,0x12345678(%rax,%rcx,8)",},
+{{0x66, 0x0f, 0x1a, 0xc8, }, 4, 0, "", "",
+"66 0f 1a c8          \tbndmov %bnd0,%bnd1",},
+{{0x66, 0x0f, 0x1a, 0xc1, }, 4, 0, "", "",
+"66 0f 1a c1          \tbndmov %bnd1,%bnd0",},
+{{0x0f, 0x1a, 0x00, }, 3, 0, "", "",
+"0f 1a 00             \tbndldx (%rax),%bnd0",},
+{{0x41, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
+"41 0f 1a 00          \tbndldx (%r8),%bnd0",},
+{{0x0f, 0x1a, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 04 25 78 56 34 12 \tbndldx 0x12345678,%bnd0",},
+{{0x0f, 0x1a, 0x18, }, 3, 0, "", "",
+"0f 1a 18             \tbndldx (%rax),%bnd3",},
+{{0x0f, 0x1a, 0x04, 0x01, }, 4, 0, "", "",
+"0f 1a 04 01          \tbndldx (%rcx,%rax,1),%bnd0",},
+{{0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 04 05 78 56 34 12 \tbndldx 0x12345678(,%rax,1),%bnd0",},
+{{0x0f, 0x1a, 0x04, 0x08, }, 4, 0, "", "",
+"0f 1a 04 08          \tbndldx (%rax,%rcx,1),%bnd0",},
+{{0x0f, 0x1a, 0x40, 0x12, }, 4, 0, "", "",
+"0f 1a 40 12          \tbndldx 0x12(%rax),%bnd0",},
+{{0x0f, 0x1a, 0x45, 0x12, }, 4, 0, "", "",
+"0f 1a 45 12          \tbndldx 0x12(%rbp),%bnd0",},
+{{0x0f, 0x1a, 0x44, 0x01, 0x12, }, 5, 0, "", "",
+"0f 1a 44 01 12       \tbndldx 0x12(%rcx,%rax,1),%bnd0",},
+{{0x0f, 0x1a, 0x44, 0x05, 0x12, }, 5, 0, "", "",
+"0f 1a 44 05 12       \tbndldx 0x12(%rbp,%rax,1),%bnd0",},
+{{0x0f, 0x1a, 0x44, 0x08, 0x12, }, 5, 0, "", "",
+"0f 1a 44 08 12       \tbndldx 0x12(%rax,%rcx,1),%bnd0",},
+{{0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1a 80 78 56 34 12 \tbndldx 0x12345678(%rax),%bnd0",},
+{{0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1a 85 78 56 34 12 \tbndldx 0x12345678(%rbp),%bnd0",},
+{{0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 84 01 78 56 34 12 \tbndldx 0x12345678(%rcx,%rax,1),%bnd0",},
+{{0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 84 05 78 56 34 12 \tbndldx 0x12345678(%rbp,%rax,1),%bnd0",},
+{{0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1a 84 08 78 56 34 12 \tbndldx 0x12345678(%rax,%rcx,1),%bnd0",},
+{{0x0f, 0x1b, 0x00, }, 3, 0, "", "",
+"0f 1b 00             \tbndstx %bnd0,(%rax)",},
+{{0x41, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
+"41 0f 1b 00          \tbndstx %bnd0,(%r8)",},
+{{0x0f, 0x1b, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 04 25 78 56 34 12 \tbndstx %bnd0,0x12345678",},
+{{0x0f, 0x1b, 0x18, }, 3, 0, "", "",
+"0f 1b 18             \tbndstx %bnd3,(%rax)",},
+{{0x0f, 0x1b, 0x04, 0x01, }, 4, 0, "", "",
+"0f 1b 04 01          \tbndstx %bnd0,(%rcx,%rax,1)",},
+{{0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 04 05 78 56 34 12 \tbndstx %bnd0,0x12345678(,%rax,1)",},
+{{0x0f, 0x1b, 0x04, 0x08, }, 4, 0, "", "",
+"0f 1b 04 08          \tbndstx %bnd0,(%rax,%rcx,1)",},
+{{0x0f, 0x1b, 0x40, 0x12, }, 4, 0, "", "",
+"0f 1b 40 12          \tbndstx %bnd0,0x12(%rax)",},
+{{0x0f, 0x1b, 0x45, 0x12, }, 4, 0, "", "",
+"0f 1b 45 12          \tbndstx %bnd0,0x12(%rbp)",},
+{{0x0f, 0x1b, 0x44, 0x01, 0x12, }, 5, 0, "", "",
+"0f 1b 44 01 12       \tbndstx %bnd0,0x12(%rcx,%rax,1)",},
+{{0x0f, 0x1b, 0x44, 0x05, 0x12, }, 5, 0, "", "",
+"0f 1b 44 05 12       \tbndstx %bnd0,0x12(%rbp,%rax,1)",},
+{{0x0f, 0x1b, 0x44, 0x08, 0x12, }, 5, 0, "", "",
+"0f 1b 44 08 12       \tbndstx %bnd0,0x12(%rax,%rcx,1)",},
+{{0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1b 80 78 56 34 12 \tbndstx %bnd0,0x12345678(%rax)",},
+{{0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
+"0f 1b 85 78 56 34 12 \tbndstx %bnd0,0x12345678(%rbp)",},
+{{0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 84 01 78 56 34 12 \tbndstx %bnd0,0x12345678(%rcx,%rax,1)",},
+{{0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 84 05 78 56 34 12 \tbndstx %bnd0,0x12345678(%rbp,%rax,1)",},
+{{0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 1b 84 08 78 56 34 12 \tbndstx %bnd0,0x12345678(%rax,%rcx,1)",},
+{{0xf2, 0xe8, 0x00, 0x00, 0x00, 0x00, }, 6, 0, "call", "unconditional",
+"f2 e8 00 00 00 00    \tbnd callq 3f6 <main+0x3f6>",},
+{{0x67, 0xf2, 0xff, 0x10, }, 4, 0, "call", "indirect",
+"67 f2 ff 10          \tbnd callq *(%eax)",},
+{{0xf2, 0xc3, }, 2, 0, "ret", "indirect",
+"f2 c3                \tbnd retq ",},
+{{0xf2, 0xe9, 0x00, 0x00, 0x00, 0x00, }, 6, 0, "jmp", "unconditional",
+"f2 e9 00 00 00 00    \tbnd jmpq 402 <main+0x402>",},
+{{0xf2, 0xe9, 0x00, 0x00, 0x00, 0x00, }, 6, 0, "jmp", "unconditional",
+"f2 e9 00 00 00 00    \tbnd jmpq 408 <main+0x408>",},
+{{0x67, 0xf2, 0xff, 0x21, }, 4, 0, "jmp", "indirect",
+"67 f2 ff 21          \tbnd jmpq *(%ecx)",},
+{{0xf2, 0x0f, 0x85, 0x00, 0x00, 0x00, 0x00, }, 7, 0, "jcc", "conditional",
+"f2 0f 85 00 00 00 00 \tbnd jne 413 <main+0x413>",},
diff --git a/tools/perf/tests/insn-x86-dat-src.c b/tools/perf/tests/insn-x86-dat-src.c
new file mode 100644
index 000000000000..b506830f33a8
--- /dev/null
+++ b/tools/perf/tests/insn-x86-dat-src.c
@@ -0,0 +1,416 @@
+/*
+ * This file contains instructions for testing by the test titled:
+ *
+ *         "Test x86 instruction decoder - new instructions"
+ *
+ * Note that the 'Expecting' comment lines are consumed by the
+ * gen-insn-x86-dat.awk script and have the format:
+ *
+ *         Expecting: <op> <branch> <rel>
+ *
+ * If this file is changed, remember to run the gen-insn-x86-dat.sh
+ * script and commit the result.
+ *
+ * Refer to insn-x86.c for more details.
+ */
+
+int main(void)
+{
+	/* Following line is a marker for the awk script - do not change */
+	asm volatile("rdtsc"); /* Start here */
+
+#ifdef __x86_64__
+
+	/* bndmk m64, bnd */
+
+	asm volatile("bndmk (%rax), %bnd0");
+	asm volatile("bndmk (%r8), %bnd0");
+	asm volatile("bndmk (0x12345678), %bnd0");
+	asm volatile("bndmk (%rax), %bnd3");
+	asm volatile("bndmk (%rcx,%rax,1), %bnd0");
+	asm volatile("bndmk 0x12345678(,%rax,1), %bnd0");
+	asm volatile("bndmk (%rax,%rcx,1), %bnd0");
+	asm volatile("bndmk (%rax,%rcx,8), %bnd0");
+	asm volatile("bndmk 0x12(%rax), %bnd0");
+	asm volatile("bndmk 0x12(%rbp), %bnd0");
+	asm volatile("bndmk 0x12(%rcx,%rax,1), %bnd0");
+	asm volatile("bndmk 0x12(%rbp,%rax,1), %bnd0");
+	asm volatile("bndmk 0x12(%rax,%rcx,1), %bnd0");
+	asm volatile("bndmk 0x12(%rax,%rcx,8), %bnd0");
+	asm volatile("bndmk 0x12345678(%rax), %bnd0");
+	asm volatile("bndmk 0x12345678(%rbp), %bnd0");
+	asm volatile("bndmk 0x12345678(%rcx,%rax,1), %bnd0");
+	asm volatile("bndmk 0x12345678(%rbp,%rax,1), %bnd0");
+	asm volatile("bndmk 0x12345678(%rax,%rcx,1), %bnd0");
+	asm volatile("bndmk 0x12345678(%rax,%rcx,8), %bnd0");
+
+	/* bndcl r/m64, bnd */
+
+	asm volatile("bndcl (%rax), %bnd0");
+	asm volatile("bndcl (%r8), %bnd0");
+	asm volatile("bndcl (0x12345678), %bnd0");
+	asm volatile("bndcl (%rax), %bnd3");
+	asm volatile("bndcl (%rcx,%rax,1), %bnd0");
+	asm volatile("bndcl 0x12345678(,%rax,1), %bnd0");
+	asm volatile("bndcl (%rax,%rcx,1), %bnd0");
+	asm volatile("bndcl (%rax,%rcx,8), %bnd0");
+	asm volatile("bndcl 0x12(%rax), %bnd0");
+	asm volatile("bndcl 0x12(%rbp), %bnd0");
+	asm volatile("bndcl 0x12(%rcx,%rax,1), %bnd0");
+	asm volatile("bndcl 0x12(%rbp,%rax,1), %bnd0");
+	asm volatile("bndcl 0x12(%rax,%rcx,1), %bnd0");
+	asm volatile("bndcl 0x12(%rax,%rcx,8), %bnd0");
+	asm volatile("bndcl 0x12345678(%rax), %bnd0");
+	asm volatile("bndcl 0x12345678(%rbp), %bnd0");
+	asm volatile("bndcl 0x12345678(%rcx,%rax,1), %bnd0");
+	asm volatile("bndcl 0x12345678(%rbp,%rax,1), %bnd0");
+	asm volatile("bndcl 0x12345678(%rax,%rcx,1), %bnd0");
+	asm volatile("bndcl 0x12345678(%rax,%rcx,8), %bnd0");
+	asm volatile("bndcl %rax, %bnd0");
+
+	/* bndcu r/m64, bnd */
+
+	asm volatile("bndcu (%rax), %bnd0");
+	asm volatile("bndcu (%r8), %bnd0");
+	asm volatile("bndcu (0x12345678), %bnd0");
+	asm volatile("bndcu (%rax), %bnd3");
+	asm volatile("bndcu (%rcx,%rax,1), %bnd0");
+	asm volatile("bndcu 0x12345678(,%rax,1), %bnd0");
+	asm volatile("bndcu (%rax,%rcx,1), %bnd0");
+	asm volatile("bndcu (%rax,%rcx,8), %bnd0");
+	asm volatile("bndcu 0x12(%rax), %bnd0");
+	asm volatile("bndcu 0x12(%rbp), %bnd0");
+	asm volatile("bndcu 0x12(%rcx,%rax,1), %bnd0");
+	asm volatile("bndcu 0x12(%rbp,%rax,1), %bnd0");
+	asm volatile("bndcu 0x12(%rax,%rcx,1), %bnd0");
+	asm volatile("bndcu 0x12(%rax,%rcx,8), %bnd0");
+	asm volatile("bndcu 0x12345678(%rax), %bnd0");
+	asm volatile("bndcu 0x12345678(%rbp), %bnd0");
+	asm volatile("bndcu 0x12345678(%rcx,%rax,1), %bnd0");
+	asm volatile("bndcu 0x12345678(%rbp,%rax,1), %bnd0");
+	asm volatile("bndcu 0x12345678(%rax,%rcx,1), %bnd0");
+	asm volatile("bndcu 0x12345678(%rax,%rcx,8), %bnd0");
+	asm volatile("bndcu %rax, %bnd0");
+
+	/* bndcn r/m64, bnd */
+
+	asm volatile("bndcn (%rax), %bnd0");
+	asm volatile("bndcn (%r8), %bnd0");
+	asm volatile("bndcn (0x12345678), %bnd0");
+	asm volatile("bndcn (%rax), %bnd3");
+	asm volatile("bndcn (%rcx,%rax,1), %bnd0");
+	asm volatile("bndcn 0x12345678(,%rax,1), %bnd0");
+	asm volatile("bndcn (%rax,%rcx,1), %bnd0");
+	asm volatile("bndcn (%rax,%rcx,8), %bnd0");
+	asm volatile("bndcn 0x12(%rax), %bnd0");
+	asm volatile("bndcn 0x12(%rbp), %bnd0");
+	asm volatile("bndcn 0x12(%rcx,%rax,1), %bnd0");
+	asm volatile("bndcn 0x12(%rbp,%rax,1), %bnd0");
+	asm volatile("bndcn 0x12(%rax,%rcx,1), %bnd0");
+	asm volatile("bndcn 0x12(%rax,%rcx,8), %bnd0");
+	asm volatile("bndcn 0x12345678(%rax), %bnd0");
+	asm volatile("bndcn 0x12345678(%rbp), %bnd0");
+	asm volatile("bndcn 0x12345678(%rcx,%rax,1), %bnd0");
+	asm volatile("bndcn 0x12345678(%rbp,%rax,1), %bnd0");
+	asm volatile("bndcn 0x12345678(%rax,%rcx,1), %bnd0");
+	asm volatile("bndcn 0x12345678(%rax,%rcx,8), %bnd0");
+	asm volatile("bndcn %rax, %bnd0");
+
+	/* bndmov m128, bnd */
+
+	asm volatile("bndmov (%rax), %bnd0");
+	asm volatile("bndmov (%r8), %bnd0");
+	asm volatile("bndmov (0x12345678), %bnd0");
+	asm volatile("bndmov (%rax), %bnd3");
+	asm volatile("bndmov (%rcx,%rax,1), %bnd0");
+	asm volatile("bndmov 0x12345678(,%rax,1), %bnd0");
+	asm volatile("bndmov (%rax,%rcx,1), %bnd0");
+	asm volatile("bndmov (%rax,%rcx,8), %bnd0");
+	asm volatile("bndmov 0x12(%rax), %bnd0");
+	asm volatile("bndmov 0x12(%rbp), %bnd0");
+	asm volatile("bndmov 0x12(%rcx,%rax,1), %bnd0");
+	asm volatile("bndmov 0x12(%rbp,%rax,1), %bnd0");
+	asm volatile("bndmov 0x12(%rax,%rcx,1), %bnd0");
+	asm volatile("bndmov 0x12(%rax,%rcx,8), %bnd0");
+	asm volatile("bndmov 0x12345678(%rax), %bnd0");
+	asm volatile("bndmov 0x12345678(%rbp), %bnd0");
+	asm volatile("bndmov 0x12345678(%rcx,%rax,1), %bnd0");
+	asm volatile("bndmov 0x12345678(%rbp,%rax,1), %bnd0");
+	asm volatile("bndmov 0x12345678(%rax,%rcx,1), %bnd0");
+	asm volatile("bndmov 0x12345678(%rax,%rcx,8), %bnd0");
+
+	/* bndmov bnd, m128 */
+
+	asm volatile("bndmov %bnd0, (%rax)");
+	asm volatile("bndmov %bnd0, (%r8)");
+	asm volatile("bndmov %bnd0, (0x12345678)");
+	asm volatile("bndmov %bnd3, (%rax)");
+	asm volatile("bndmov %bnd0, (%rcx,%rax,1)");
+	asm volatile("bndmov %bnd0, 0x12345678(,%rax,1)");
+	asm volatile("bndmov %bnd0, (%rax,%rcx,1)");
+	asm volatile("bndmov %bnd0, (%rax,%rcx,8)");
+	asm volatile("bndmov %bnd0, 0x12(%rax)");
+	asm volatile("bndmov %bnd0, 0x12(%rbp)");
+	asm volatile("bndmov %bnd0, 0x12(%rcx,%rax,1)");
+	asm volatile("bndmov %bnd0, 0x12(%rbp,%rax,1)");
+	asm volatile("bndmov %bnd0, 0x12(%rax,%rcx,1)");
+	asm volatile("bndmov %bnd0, 0x12(%rax,%rcx,8)");
+	asm volatile("bndmov %bnd0, 0x12345678(%rax)");
+	asm volatile("bndmov %bnd0, 0x12345678(%rbp)");
+	asm volatile("bndmov %bnd0, 0x12345678(%rcx,%rax,1)");
+	asm volatile("bndmov %bnd0, 0x12345678(%rbp,%rax,1)");
+	asm volatile("bndmov %bnd0, 0x12345678(%rax,%rcx,1)");
+	asm volatile("bndmov %bnd0, 0x12345678(%rax,%rcx,8)");
+
+	/* bndmov bnd2, bnd1 */
+
+	asm volatile("bndmov %bnd0, %bnd1");
+	asm volatile("bndmov %bnd1, %bnd0");
+
+	/* bndldx mib, bnd */
+
+	asm volatile("bndldx (%rax), %bnd0");
+	asm volatile("bndldx (%r8), %bnd0");
+	asm volatile("bndldx (0x12345678), %bnd0");
+	asm volatile("bndldx (%rax), %bnd3");
+	asm volatile("bndldx (%rcx,%rax,1), %bnd0");
+	asm volatile("bndldx 0x12345678(,%rax,1), %bnd0");
+	asm volatile("bndldx (%rax,%rcx,1), %bnd0");
+	asm volatile("bndldx 0x12(%rax), %bnd0");
+	asm volatile("bndldx 0x12(%rbp), %bnd0");
+	asm volatile("bndldx 0x12(%rcx,%rax,1), %bnd0");
+	asm volatile("bndldx 0x12(%rbp,%rax,1), %bnd0");
+	asm volatile("bndldx 0x12(%rax,%rcx,1), %bnd0");
+	asm volatile("bndldx 0x12345678(%rax), %bnd0");
+	asm volatile("bndldx 0x12345678(%rbp), %bnd0");
+	asm volatile("bndldx 0x12345678(%rcx,%rax,1), %bnd0");
+	asm volatile("bndldx 0x12345678(%rbp,%rax,1), %bnd0");
+	asm volatile("bndldx 0x12345678(%rax,%rcx,1), %bnd0");
+
+	/* bndstx bnd, mib */
+
+	asm volatile("bndstx %bnd0, (%rax)");
+	asm volatile("bndstx %bnd0, (%r8)");
+	asm volatile("bndstx %bnd0, (0x12345678)");
+	asm volatile("bndstx %bnd3, (%rax)");
+	asm volatile("bndstx %bnd0, (%rcx,%rax,1)");
+	asm volatile("bndstx %bnd0, 0x12345678(,%rax,1)");
+	asm volatile("bndstx %bnd0, (%rax,%rcx,1)");
+	asm volatile("bndstx %bnd0, 0x12(%rax)");
+	asm volatile("bndstx %bnd0, 0x12(%rbp)");
+	asm volatile("bndstx %bnd0, 0x12(%rcx,%rax,1)");
+	asm volatile("bndstx %bnd0, 0x12(%rbp,%rax,1)");
+	asm volatile("bndstx %bnd0, 0x12(%rax,%rcx,1)");
+	asm volatile("bndstx %bnd0, 0x12345678(%rax)");
+	asm volatile("bndstx %bnd0, 0x12345678(%rbp)");
+	asm volatile("bndstx %bnd0, 0x12345678(%rcx,%rax,1)");
+	asm volatile("bndstx %bnd0, 0x12345678(%rbp,%rax,1)");
+	asm volatile("bndstx %bnd0, 0x12345678(%rax,%rcx,1)");
+
+	/* bnd prefix on call, ret, jmp and all jcc */
+
+	asm volatile("bnd call label1");  /* Expecting: call unconditional 0 */
+	asm volatile("bnd call *(%eax)"); /* Expecting: call indirect      0 */
+	asm volatile("bnd ret");          /* Expecting: ret  indirect      0 */
+	asm volatile("bnd jmp label1");   /* Expecting: jmp  unconditional 0 */
+	asm volatile("bnd jmp label1");   /* Expecting: jmp  unconditional 0 */
+	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
+	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0 */
+
+#else  /* #ifdef __x86_64__ */
+
+	/* bndmk m32, bnd */
+
+	asm volatile("bndmk (%eax), %bnd0");
+	asm volatile("bndmk (0x12345678), %bnd0");
+	asm volatile("bndmk (%eax), %bnd3");
+	asm volatile("bndmk (%ecx,%eax,1), %bnd0");
+	asm volatile("bndmk 0x12345678(,%eax,1), %bnd0");
+	asm volatile("bndmk (%eax,%ecx,1), %bnd0");
+	asm volatile("bndmk (%eax,%ecx,8), %bnd0");
+	asm volatile("bndmk 0x12(%eax), %bnd0");
+	asm volatile("bndmk 0x12(%ebp), %bnd0");
+	asm volatile("bndmk 0x12(%ecx,%eax,1), %bnd0");
+	asm volatile("bndmk 0x12(%ebp,%eax,1), %bnd0");
+	asm volatile("bndmk 0x12(%eax,%ecx,1), %bnd0");
+	asm volatile("bndmk 0x12(%eax,%ecx,8), %bnd0");
+	asm volatile("bndmk 0x12345678(%eax), %bnd0");
+	asm volatile("bndmk 0x12345678(%ebp), %bnd0");
+	asm volatile("bndmk 0x12345678(%ecx,%eax,1), %bnd0");
+	asm volatile("bndmk 0x12345678(%ebp,%eax,1), %bnd0");
+	asm volatile("bndmk 0x12345678(%eax,%ecx,1), %bnd0");
+	asm volatile("bndmk 0x12345678(%eax,%ecx,8), %bnd0");
+
+	/* bndcl r/m32, bnd */
+
+	asm volatile("bndcl (%eax), %bnd0");
+	asm volatile("bndcl (0x12345678), %bnd0");
+	asm volatile("bndcl (%eax), %bnd3");
+	asm volatile("bndcl (%ecx,%eax,1), %bnd0");
+	asm volatile("bndcl 0x12345678(,%eax,1), %bnd0");
+	asm volatile("bndcl (%eax,%ecx,1), %bnd0");
+	asm volatile("bndcl (%eax,%ecx,8), %bnd0");
+	asm volatile("bndcl 0x12(%eax), %bnd0");
+	asm volatile("bndcl 0x12(%ebp), %bnd0");
+	asm volatile("bndcl 0x12(%ecx,%eax,1), %bnd0");
+	asm volatile("bndcl 0x12(%ebp,%eax,1), %bnd0");
+	asm volatile("bndcl 0x12(%eax,%ecx,1), %bnd0");
+	asm volatile("bndcl 0x12(%eax,%ecx,8), %bnd0");
+	asm volatile("bndcl 0x12345678(%eax), %bnd0");
+	asm volatile("bndcl 0x12345678(%ebp), %bnd0");
+	asm volatile("bndcl 0x12345678(%ecx,%eax,1), %bnd0");
+	asm volatile("bndcl 0x12345678(%ebp,%eax,1), %bnd0");
+	asm volatile("bndcl 0x12345678(%eax,%ecx,1), %bnd0");
+	asm volatile("bndcl 0x12345678(%eax,%ecx,8), %bnd0");
+	asm volatile("bndcl %eax, %bnd0");
+
+	/* bndcu r/m32, bnd */
+
+	asm volatile("bndcu (%eax), %bnd0");
+	asm volatile("bndcu (0x12345678), %bnd0");
+	asm volatile("bndcu (%eax), %bnd3");
+	asm volatile("bndcu (%ecx,%eax,1), %bnd0");
+	asm volatile("bndcu 0x12345678(,%eax,1), %bnd0");
+	asm volatile("bndcu (%eax,%ecx,1), %bnd0");
+	asm volatile("bndcu (%eax,%ecx,8), %bnd0");
+	asm volatile("bndcu 0x12(%eax), %bnd0");
+	asm volatile("bndcu 0x12(%ebp), %bnd0");
+	asm volatile("bndcu 0x12(%ecx,%eax,1), %bnd0");
+	asm volatile("bndcu 0x12(%ebp,%eax,1), %bnd0");
+	asm volatile("bndcu 0x12(%eax,%ecx,1), %bnd0");
+	asm volatile("bndcu 0x12(%eax,%ecx,8), %bnd0");
+	asm volatile("bndcu 0x12345678(%eax), %bnd0");
+	asm volatile("bndcu 0x12345678(%ebp), %bnd0");
+	asm volatile("bndcu 0x12345678(%ecx,%eax,1), %bnd0");
+	asm volatile("bndcu 0x12345678(%ebp,%eax,1), %bnd0");
+	asm volatile("bndcu 0x12345678(%eax,%ecx,1), %bnd0");
+	asm volatile("bndcu 0x12345678(%eax,%ecx,8), %bnd0");
+	asm volatile("bndcu %eax, %bnd0");
+
+	/* bndcn r/m32, bnd */
+
+	asm volatile("bndcn (%eax), %bnd0");
+	asm volatile("bndcn (0x12345678), %bnd0");
+	asm volatile("bndcn (%eax), %bnd3");
+	asm volatile("bndcn (%ecx,%eax,1), %bnd0");
+	asm volatile("bndcn 0x12345678(,%eax,1), %bnd0");
+	asm volatile("bndcn (%eax,%ecx,1), %bnd0");
+	asm volatile("bndcn (%eax,%ecx,8), %bnd0");
+	asm volatile("bndcn 0x12(%eax), %bnd0");
+	asm volatile("bndcn 0x12(%ebp), %bnd0");
+	asm volatile("bndcn 0x12(%ecx,%eax,1), %bnd0");
+	asm volatile("bndcn 0x12(%ebp,%eax,1), %bnd0");
+	asm volatile("bndcn 0x12(%eax,%ecx,1), %bnd0");
+	asm volatile("bndcn 0x12(%eax,%ecx,8), %bnd0");
+	asm volatile("bndcn 0x12345678(%eax), %bnd0");
+	asm volatile("bndcn 0x12345678(%ebp), %bnd0");
+	asm volatile("bndcn 0x12345678(%ecx,%eax,1), %bnd0");
+	asm volatile("bndcn 0x12345678(%ebp,%eax,1), %bnd0");
+	asm volatile("bndcn 0x12345678(%eax,%ecx,1), %bnd0");
+	asm volatile("bndcn 0x12345678(%eax,%ecx,8), %bnd0");
+	asm volatile("bndcn %eax, %bnd0");
+
+	/* bndmov m64, bnd */
+
+	asm volatile("bndmov (%eax), %bnd0");
+	asm volatile("bndmov (0x12345678), %bnd0");
+	asm volatile("bndmov (%eax), %bnd3");
+	asm volatile("bndmov (%ecx,%eax,1), %bnd0");
+	asm volatile("bndmov 0x12345678(,%eax,1), %bnd0");
+	asm volatile("bndmov (%eax,%ecx,1), %bnd0");
+	asm volatile("bndmov (%eax,%ecx,8), %bnd0");
+	asm volatile("bndmov 0x12(%eax), %bnd0");
+	asm volatile("bndmov 0x12(%ebp), %bnd0");
+	asm volatile("bndmov 0x12(%ecx,%eax,1), %bnd0");
+	asm volatile("bndmov 0x12(%ebp,%eax,1), %bnd0");
+	asm volatile("bndmov 0x12(%eax,%ecx,1), %bnd0");
+	asm volatile("bndmov 0x12(%eax,%ecx,8), %bnd0");
+	asm volatile("bndmov 0x12345678(%eax), %bnd0");
+	asm volatile("bndmov 0x12345678(%ebp), %bnd0");
+	asm volatile("bndmov 0x12345678(%ecx,%eax,1), %bnd0");
+	asm volatile("bndmov 0x12345678(%ebp,%eax,1), %bnd0");
+	asm volatile("bndmov 0x12345678(%eax,%ecx,1), %bnd0");
+	asm volatile("bndmov 0x12345678(%eax,%ecx,8), %bnd0");
+
+	/* bndmov bnd, m64 */
+
+	asm volatile("bndmov %bnd0, (%eax)");
+	asm volatile("bndmov %bnd0, (0x12345678)");
+	asm volatile("bndmov %bnd3, (%eax)");
+	asm volatile("bndmov %bnd0, (%ecx,%eax,1)");
+	asm volatile("bndmov %bnd0, 0x12345678(,%eax,1)");
+	asm volatile("bndmov %bnd0, (%eax,%ecx,1)");
+	asm volatile("bndmov %bnd0, (%eax,%ecx,8)");
+	asm volatile("bndmov %bnd0, 0x12(%eax)");
+	asm volatile("bndmov %bnd0, 0x12(%ebp)");
+	asm volatile("bndmov %bnd0, 0x12(%ecx,%eax,1)");
+	asm volatile("bndmov %bnd0, 0x12(%ebp,%eax,1)");
+	asm volatile("bndmov %bnd0, 0x12(%eax,%ecx,1)");
+	asm volatile("bndmov %bnd0, 0x12(%eax,%ecx,8)");
+	asm volatile("bndmov %bnd0, 0x12345678(%eax)");
+	asm volatile("bndmov %bnd0, 0x12345678(%ebp)");
+	asm volatile("bndmov %bnd0, 0x12345678(%ecx,%eax,1)");
+	asm volatile("bndmov %bnd0, 0x12345678(%ebp,%eax,1)");
+	asm volatile("bndmov %bnd0, 0x12345678(%eax,%ecx,1)");
+	asm volatile("bndmov %bnd0, 0x12345678(%eax,%ecx,8)");
+
+	/* bndmov bnd2, bnd1 */
+
+	asm volatile("bndmov %bnd0, %bnd1");
+	asm volatile("bndmov %bnd1, %bnd0");
+
+	/* bndldx mib, bnd */
+
+	asm volatile("bndldx (%eax), %bnd0");
+	asm volatile("bndldx (0x12345678), %bnd0");
+	asm volatile("bndldx (%eax), %bnd3");
+	asm volatile("bndldx (%ecx,%eax,1), %bnd0");
+	asm volatile("bndldx 0x12345678(,%eax,1), %bnd0");
+	asm volatile("bndldx (%eax,%ecx,1), %bnd0");
+	asm volatile("bndldx 0x12(%eax), %bnd0");
+	asm volatile("bndldx 0x12(%ebp), %bnd0");
+	asm volatile("bndldx 0x12(%ecx,%eax,1), %bnd0");
+	asm volatile("bndldx 0x12(%ebp,%eax,1), %bnd0");
+	asm volatile("bndldx 0x12(%eax,%ecx,1), %bnd0");
+	asm volatile("bndldx 0x12345678(%eax), %bnd0");
+	asm volatile("bndldx 0x12345678(%ebp), %bnd0");
+	asm volatile("bndldx 0x12345678(%ecx,%eax,1), %bnd0");
+	asm volatile("bndldx 0x12345678(%ebp,%eax,1), %bnd0");
+	asm volatile("bndldx 0x12345678(%eax,%ecx,1), %bnd0");
+
+	/* bndstx bnd, mib */
+
+	asm volatile("bndstx %bnd0, (%eax)");
+	asm volatile("bndstx %bnd0, (0x12345678)");
+	asm volatile("bndstx %bnd3, (%eax)");
+	asm volatile("bndstx %bnd0, (%ecx,%eax,1)");
+	asm volatile("bndstx %bnd0, 0x12345678(,%eax,1)");
+	asm volatile("bndstx %bnd0, (%eax,%ecx,1)");
+	asm volatile("bndstx %bnd0, 0x12(%eax)");
+	asm volatile("bndstx %bnd0, 0x12(%ebp)");
+	asm volatile("bndstx %bnd0, 0x12(%ecx,%eax,1)");
+	asm volatile("bndstx %bnd0, 0x12(%ebp,%eax,1)");
+	asm volatile("bndstx %bnd0, 0x12(%eax,%ecx,1)");
+	asm volatile("bndstx %bnd0, 0x12345678(%eax)");
+	asm volatile("bndstx %bnd0, 0x12345678(%ebp)");
+	asm volatile("bndstx %bnd0, 0x12345678(%ecx,%eax,1)");
+	asm volatile("bndstx %bnd0, 0x12345678(%ebp,%eax,1)");
+	asm volatile("bndstx %bnd0, 0x12345678(%eax,%ecx,1)");
+
+	/* bnd prefix on call, ret, jmp and all jcc */
+
+	asm volatile("bnd call label1");  /* Expecting: call unconditional 0xfffffffc */
+	asm volatile("bnd call *(%eax)"); /* Expecting: call indirect      0 */
+	asm volatile("bnd ret");          /* Expecting: ret  indirect      0 */
+	asm volatile("bnd jmp label1");   /* Expecting: jmp  unconditional 0xfffffffc */
+	asm volatile("bnd jmp label1");   /* Expecting: jmp  unconditional 0xfffffffc */
+	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
+	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0xfffffffc */
+
+#endif /* #ifndef __x86_64__ */
+
+	/* Following line is a marker for the awk script - do not change */
+	asm volatile("rdtsc"); /* Stop here */
+
+	return 0;
+}
diff --git a/tools/perf/tests/insn-x86.c b/tools/perf/tests/insn-x86.c
new file mode 100644
index 000000000000..0e126a099874
--- /dev/null
+++ b/tools/perf/tests/insn-x86.c
@@ -0,0 +1,180 @@
+#include <linux/types.h>
+
+#include "debug.h"
+#include "tests.h"
+
+#include "intel-pt-decoder/insn.h"
+#include "intel-pt-decoder/intel-pt-insn-decoder.h"
+
+struct test_data {
+	u8 data[MAX_INSN_SIZE];
+	int expected_length;
+	int expected_rel;
+	const char *expected_op_str;
+	const char *expected_branch_str;
+	const char *asm_rep;
+};
+
+struct test_data test_data_32[] = {
+#include "insn-x86-dat-32.c"
+	{{0}, 0, 0, NULL, NULL, NULL},
+};
+
+struct test_data test_data_64[] = {
+#include "insn-x86-dat-64.c"
+	{{0}, 0, 0, NULL, NULL, NULL},
+};
+
+static int get_op(const char *op_str)
+{
+	struct val_data {
+		const char *name;
+		int val;
+	} vals[] = {
+		{"other",   INTEL_PT_OP_OTHER},
+		{"call",    INTEL_PT_OP_CALL},
+		{"ret",     INTEL_PT_OP_RET},
+		{"jcc",     INTEL_PT_OP_JCC},
+		{"jmp",     INTEL_PT_OP_JMP},
+		{"loop",    INTEL_PT_OP_LOOP},
+		{"iret",    INTEL_PT_OP_IRET},
+		{"int",     INTEL_PT_OP_INT},
+		{"syscall", INTEL_PT_OP_SYSCALL},
+		{"sysret",  INTEL_PT_OP_SYSRET},
+		{NULL, 0},
+	};
+	struct val_data *val;
+
+	if (!op_str || !strlen(op_str))
+		return 0;
+
+	for (val = vals; val->name; val++) {
+		if (!strcmp(val->name, op_str))
+			return val->val;
+	}
+
+	pr_debug("Failed to get op\n");
+
+	return -1;
+}
+
+static int get_branch(const char *branch_str)
+{
+	struct val_data {
+		const char *name;
+		int val;
+	} vals[] = {
+		{"no_branch",     INTEL_PT_BR_NO_BRANCH},
+		{"indirect",      INTEL_PT_BR_INDIRECT},
+		{"conditional",   INTEL_PT_BR_CONDITIONAL},
+		{"unconditional", INTEL_PT_BR_UNCONDITIONAL},
+		{NULL, 0},
+	};
+	struct val_data *val;
+
+	if (!branch_str || !strlen(branch_str))
+		return 0;
+
+	for (val = vals; val->name; val++) {
+		if (!strcmp(val->name, branch_str))
+			return val->val;
+	}
+
+	pr_debug("Failed to get branch\n");
+
+	return -1;
+}
+
+static int test_data_item(struct test_data *dat, int x86_64)
+{
+	struct intel_pt_insn intel_pt_insn;
+	struct insn insn;
+	int op, branch;
+
+	insn_init(&insn, dat->data, MAX_INSN_SIZE, x86_64);
+	insn_get_length(&insn);
+
+	if (!insn_complete(&insn)) {
+		pr_debug("Failed to decode: %s\n", dat->asm_rep);
+		return -1;
+	}
+
+	if (insn.length != dat->expected_length) {
+		pr_debug("Failed to decode length (%d vs expected %d): %s\n",
+			 insn.length, dat->expected_length, dat->asm_rep);
+		return -1;
+	}
+
+	op = get_op(dat->expected_op_str);
+	branch = get_branch(dat->expected_branch_str);
+
+	if (intel_pt_get_insn(dat->data, MAX_INSN_SIZE, x86_64, &intel_pt_insn)) {
+		pr_debug("Intel PT failed to decode: %s\n", dat->asm_rep);
+		return -1;
+	}
+
+	if ((int)intel_pt_insn.op != op) {
+		pr_debug("Failed to decode 'op' value (%d vs expected %d): %s\n",
+			 intel_pt_insn.op, op, dat->asm_rep);
+		return -1;
+	}
+
+	if ((int)intel_pt_insn.branch != branch) {
+		pr_debug("Failed to decode 'branch' value (%d vs expected %d): %s\n",
+			 intel_pt_insn.branch, branch, dat->asm_rep);
+		return -1;
+	}
+
+	if (intel_pt_insn.rel != dat->expected_rel) {
+		pr_debug("Failed to decode 'rel' value (%#x vs expected %#x): %s\n",
+			 intel_pt_insn.rel, dat->expected_rel, dat->asm_rep);
+		return -1;
+	}
+
+	pr_debug("Decoded ok: %s\n", dat->asm_rep);
+
+	return 0;
+}
+
+static int test_data_set(struct test_data *dat_set, int x86_64)
+{
+	struct test_data *dat;
+	int ret = 0;
+
+	for (dat = dat_set; dat->expected_length; dat++) {
+		if (test_data_item(dat, x86_64))
+			ret = -1;
+	}
+
+	return ret;
+}
+
+/**
+ * test__insn_x86 - test x86 instruction decoder - new instructions.
+ *
+ * This function implements a test that decodes a selection of instructions and
+ * checks the results.  The Intel PT function that further categorizes
+ * instructions (i.e. intel_pt_get_insn()) is also checked.
+ *
+ * The instructions are originally in insn-x86-dat-src.c which has been
+ * processed by scripts gen-insn-x86-dat.sh and gen-insn-x86-dat.awk to produce
+ * insn-x86-dat-32.c and insn-x86-dat-64.c which are included into this program.
+ * i.e. to add new instructions to the test, edit insn-x86-dat-src.c, run the
+ * gen-insn-x86-dat.sh script, make perf, and then run the test.
+ *
+ * If the test passes %0 is returned, otherwise %-1 is returned.  Use the
+ * verbose (-v) option to see all the instructions and whether or not they
+ * decoded successfully.
+ */
+int test__insn_x86(void)
+{
+	int ret = 0;
+
+	if (test_data_set(test_data_32, 0))
+		ret = -1;
+
+	if (test_data_set(test_data_64, 1))
+		ret = -1;
+
+	return ret;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index bf113a247987..4e2c5458269a 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -63,6 +63,7 @@ int test__fdarray__add(void);
 int test__kmod_path__parse(void);
 int test__thread_map(void);
 int test__llvm(void);
+int test__insn_x86(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread
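
To see how the generated tables are consumed, here is a minimal sketch
(not part of the patch; the function name and buffer are made up) that
decodes a single MPX encoding with the same perf tools decoder calls
used by test_data_item() above:

	#include "intel-pt-decoder/insn.h"

	/* f3 0f 1a 00 decodes as bndcl (%eax),%bnd0 in 32-bit mode */
	static int decode_bndcl_example(void)
	{
		unsigned char buf[MAX_INSN_SIZE] = { 0xf3, 0x0f, 0x1a, 0x00 };
		struct insn insn;

		insn_init(&insn, buf, MAX_INSN_SIZE, 0); /* 0 = not x86_64 */
		insn_get_length(&insn);

		/* Expect a complete, 4-byte instruction */
		if (!insn_complete(&insn) || insn.length != 4)
			return -1;

		return 0;
	}

insn_init() takes the buffer, the buffer length and an x86_64 flag,
mirroring the calls in test_data_item().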

* [PATCH 2/4] x86/insn: perf tools: Pedantically tweak opcode map for MPX instructions
  2015-08-31 13:58 [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Adrian Hunter
  2015-08-31 13:58 ` [PATCH 1/4] perf tools: Add a test for decoding of " Adrian Hunter
@ 2015-08-31 13:58 ` Adrian Hunter
  2015-08-31 14:48   ` Arnaldo Carvalho de Melo
  2015-08-31 13:58 ` [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions Adrian Hunter
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Adrian Hunter @ 2015-08-31 13:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

The MPX instructions are presently not described in the SDM
opcode maps, and there are no encoding characters for bnd
registers, address method or operand type.  So the kernel
opcode map is using 'Gv' for bnd registers and 'Ev' for
everything else.  That is fine because the instruction
decoder does not use that information anyway, except as
an indication that there is a ModR/M byte.

Nevertheless, in some cases the 'Gv' and 'Ev' are the wrong
way around, BNDLDX and BNDSTX have 2 operands not 3, and it
wouldn't hurt to identify the mandatory prefixes.

This has no effect on the decoding of valid instructions,
but the addition of the mandatory prefixes will cause some
invalid instructions to error out that wouldn't have
previously.

Note that perf tools has a copy of the instruction decoder
and provides a test for new instructions which includes MPX
instructions, e.g.

	$ perf test list 2>&1 | grep "x86 ins"
	39: Test x86 instruction decoder - new instructions
	$ perf test 39
	39: Test x86 instruction decoder - new instructions          : Ok

Or to see the details:

	$ perf test -v 39

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 arch/x86/lib/x86-opcode-map.txt                     | 8 ++++++--
 tools/perf/util/intel-pt-decoder/x86-opcode-map.txt | 8 ++++++--
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index 816488c0b97e..a02a195d219c 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -353,8 +353,12 @@ AVXcode: 1
 17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1)
 18: Grp16 (1A)
 19:
-1a: BNDCL Ev,Gv | BNDCU Ev,Gv | BNDMOV Gv,Ev | BNDLDX Gv,Ev,Gv
-1b: BNDCN Ev,Gv | BNDMOV Ev,Gv | BNDMK Gv,Ev | BNDSTX Ev,GV,Gv
+# Intel SDM opcode map does not list MPX instructions. For now using Gv for
+# bnd registers and Ev for everything else is OK because the instruction
+# decoder does not use the information except as an indication that there is
+# a ModR/M byte.
+1a: BNDCL Gv,Ev (F3) | BNDCU Gv,Ev (F2) | BNDMOV Gv,Ev (66) | BNDLDX Gv,Ev
+1b: BNDCN Gv,Ev (F2) | BNDMOV Ev,Gv (66) | BNDMK Gv,Ev (F3) | BNDSTX Ev,Gv
 1c:
 1d:
 1e:
diff --git a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
index 816488c0b97e..a02a195d219c 100644
--- a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
+++ b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
@@ -353,8 +353,12 @@ AVXcode: 1
 17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1)
 18: Grp16 (1A)
 19:
-1a: BNDCL Ev,Gv | BNDCU Ev,Gv | BNDMOV Gv,Ev | BNDLDX Gv,Ev,Gv
-1b: BNDCN Ev,Gv | BNDMOV Ev,Gv | BNDMK Gv,Ev | BNDSTX Ev,GV,Gv
+# Intel SDM opcode map does not list MPX instructions. For now using Gv for
+# bnd registers and Ev for everything else is OK because the instruction
+# decoder does not use the information except as an indication that there is
+# a ModR/M byte.
+1a: BNDCL Gv,Ev (F3) | BNDCU Gv,Ev (F2) | BNDMOV Gv,Ev (66) | BNDLDX Gv,Ev
+1b: BNDCN Gv,Ev (F2) | BNDMOV Ev,Gv (66) | BNDMK Gv,Ev (F3) | BNDSTX Ev,Gv
 1c:
 1d:
 1e:
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions
  2015-08-31 13:58 [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Adrian Hunter
  2015-08-31 13:58 ` [PATCH 1/4] perf tools: Add a test for decoding of " Adrian Hunter
  2015-08-31 13:58 ` [PATCH 2/4] x86/insn: perf tools: Pedantically tweak opcode map for MPX instructions Adrian Hunter
@ 2015-08-31 13:58 ` Adrian Hunter
  2015-08-31 14:50   ` Arnaldo Carvalho de Melo
  2015-09-01  0:08   ` 平松雅巳 / HIRAMATU,MASAMI
  2015-08-31 13:58 ` [PATCH 4/4] x86/insn: perf tools: Add new memory instructions Adrian Hunter
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 27+ messages in thread
From: Adrian Hunter @ 2015-08-31 13:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Intel SHA Extensions are explained in the Intel Architecture
Instruction Set Extensions Programming Reference (Oct 2014).
There are 7 new instructions.  Add them to the opcode map
and to the perf tools new instructions test, e.g.

    $ tools/perf/perf test list 2>&1 | grep "x86 ins"
    39: Test x86 instruction decoder - new instructions
    $ tools/perf/perf test 39
    39: Test x86 instruction decoder - new instructions          : Ok

Or to see the details:

    $ tools/perf/perf test -v 39 2>&1 | grep sha
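
Each new instruction becomes one entry in the generated tables.  For
example, the sha1nexte (%eax),%xmm0 case presumably starts out in
insn-x86-dat-src.c as an inline asm statement of the form

	asm volatile("sha1nexte (%eax), %xmm0");

which the gen-insn-x86-dat.sh/.awk scripts turn into the struct test_data
initializer found in the insn-x86-dat-32.c hunk below: the opcode bytes,
the expected length (4), the expected branch displacement (rel, 0 here),
empty op and branch strings (SHA instructions are not branches), and the
objdump text used in the test messages:

	{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
	"0f 38 c8 00          \tsha1nexte (%eax),%xmm0",},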

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 arch/x86/lib/x86-opcode-map.txt                    |   7 +
 tools/perf/tests/insn-x86-dat-32.c                 | 294 ++++++++++++++++
 tools/perf/tests/insn-x86-dat-64.c                 | 364 ++++++++++++++++++++
 tools/perf/tests/insn-x86-dat-src.c                | 373 +++++++++++++++++++++
 .../perf/util/intel-pt-decoder/x86-opcode-map.txt  |   7 +
 5 files changed, 1045 insertions(+)

diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index a02a195d219c..25dad388b371 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -736,6 +736,12 @@ bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
 be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
 bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
 # 0x0f 0x38 0xc0-0xff
+c8: sha1nexte Vdq,Wdq
+c9: sha1msg1 Vdq,Wdq
+ca: sha1msg2 Vdq,Wdq
+cb: sha256rnds2 Vdq,Wdq
+cc: sha256msg1 Vdq,Wdq
+cd: sha256msg2 Vdq,Wdq
 db: VAESIMC Vdq,Wdq (66),(v1)
 dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
 dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
@@ -794,6 +800,7 @@ AVXcode: 3
 61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
 62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
 63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
+cc: sha1rnds4 Vdq,Wdq,Ib
 df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
 f0: RORX Gy,Ey,Ib (F2),(v)
 EndTable
diff --git a/tools/perf/tests/insn-x86-dat-32.c b/tools/perf/tests/insn-x86-dat-32.c
index 6a38a34a5a49..83f5078e74e1 100644
--- a/tools/perf/tests/insn-x86-dat-32.c
+++ b/tools/perf/tests/insn-x86-dat-32.c
@@ -322,3 +322,297 @@
 "f2 ff 21             \tbnd jmp *(%ecx)",},
 {{0xf2, 0x0f, 0x85, 0xfc, 0xff, 0xff, 0xff, }, 7, 0xfffffffc, "jcc", "conditional",
 "f2 0f 85 fc ff ff ff \tbnd jne 3de <main+0x3de>",},
+{{0x0f, 0x3a, 0xcc, 0xc1, 0x00, }, 5, 0, "", "",
+"0f 3a cc c1 00       \tsha1rnds4 $0x0,%xmm1,%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0xd7, 0x91, }, 5, 0, "", "",
+"0f 3a cc d7 91       \tsha1rnds4 $0x91,%xmm7,%xmm2",},
+{{0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 5, 0, "", "",
+"0f 3a cc 00 91       \tsha1rnds4 $0x91,(%eax),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
+"0f 3a cc 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678,%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x18, 0x91, }, 5, 0, "", "",
+"0f 3a cc 18 91       \tsha1rnds4 $0x91,(%eax),%xmm3",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0x01, 0x91, }, 6, 0, "", "",
+"0f 3a cc 04 01 91    \tsha1rnds4 $0x91,(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 04 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(,%eax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0x08, 0x91, }, 6, 0, "", "",
+"0f 3a cc 04 08 91    \tsha1rnds4 $0x91,(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0xc8, 0x91, }, 6, 0, "", "",
+"0f 3a cc 04 c8 91    \tsha1rnds4 $0x91,(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x40, 0x12, 0x91, }, 6, 0, "", "",
+"0f 3a cc 40 12 91    \tsha1rnds4 $0x91,0x12(%eax),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x45, 0x12, 0x91, }, 6, 0, "", "",
+"0f 3a cc 45 12 91    \tsha1rnds4 $0x91,0x12(%ebp),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x44, 0x01, 0x12, 0x91, }, 7, 0, "", "",
+"0f 3a cc 44 01 12 91 \tsha1rnds4 $0x91,0x12(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x44, 0x05, 0x12, 0x91, }, 7, 0, "", "",
+"0f 3a cc 44 05 12 91 \tsha1rnds4 $0x91,0x12(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x44, 0x08, 0x12, 0x91, }, 7, 0, "", "",
+"0f 3a cc 44 08 12 91 \tsha1rnds4 $0x91,0x12(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x44, 0xc8, 0x12, 0x91, }, 7, 0, "", "",
+"0f 3a cc 44 c8 12 91 \tsha1rnds4 $0x91,0x12(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
+"0f 3a cc 80 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
+"0f 3a cc 85 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ebp),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 84 01 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 84 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 84 08 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 84 c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0xc1, }, 4, 0, "", "",
+"0f 38 c8 c1          \tsha1nexte %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xc8, 0xd7, }, 4, 0, "", "",
+"0f 38 c8 d7          \tsha1nexte %xmm7,%xmm2",},
+{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
+"0f 38 c8 00          \tsha1nexte (%eax),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c8 05 78 56 34 12 \tsha1nexte 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x18, }, 4, 0, "", "",
+"0f 38 c8 18          \tsha1nexte (%eax),%xmm3",},
+{{0x0f, 0x38, 0xc8, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 c8 04 01       \tsha1nexte (%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 04 05 78 56 34 12 \tsha1nexte 0x12345678(,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 c8 04 08       \tsha1nexte (%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 c8 04 c8       \tsha1nexte (%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 c8 40 12       \tsha1nexte 0x12(%eax),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 c8 45 12       \tsha1nexte 0x12(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 c8 44 01 12    \tsha1nexte 0x12(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 c8 44 05 12    \tsha1nexte 0x12(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 c8 44 08 12    \tsha1nexte 0x12(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 c8 44 c8 12    \tsha1nexte 0x12(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c8 80 78 56 34 12 \tsha1nexte 0x12345678(%eax),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c8 85 78 56 34 12 \tsha1nexte 0x12345678(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 84 01 78 56 34 12 \tsha1nexte 0x12345678(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 84 05 78 56 34 12 \tsha1nexte 0x12345678(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 84 08 78 56 34 12 \tsha1nexte 0x12345678(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 84 c8 78 56 34 12 \tsha1nexte 0x12345678(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0xc1, }, 4, 0, "", "",
+"0f 38 c9 c1          \tsha1msg1 %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xc9, 0xd7, }, 4, 0, "", "",
+"0f 38 c9 d7          \tsha1msg1 %xmm7,%xmm2",},
+{{0x0f, 0x38, 0xc9, 0x00, }, 4, 0, "", "",
+"0f 38 c9 00          \tsha1msg1 (%eax),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c9 05 78 56 34 12 \tsha1msg1 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x18, }, 4, 0, "", "",
+"0f 38 c9 18          \tsha1msg1 (%eax),%xmm3",},
+{{0x0f, 0x38, 0xc9, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 c9 04 01       \tsha1msg1 (%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 04 05 78 56 34 12 \tsha1msg1 0x12345678(,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 c9 04 08       \tsha1msg1 (%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 c9 04 c8       \tsha1msg1 (%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 c9 40 12       \tsha1msg1 0x12(%eax),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 c9 45 12       \tsha1msg1 0x12(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 c9 44 01 12    \tsha1msg1 0x12(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 c9 44 05 12    \tsha1msg1 0x12(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 c9 44 08 12    \tsha1msg1 0x12(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 c9 44 c8 12    \tsha1msg1 0x12(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c9 80 78 56 34 12 \tsha1msg1 0x12345678(%eax),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c9 85 78 56 34 12 \tsha1msg1 0x12345678(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 84 01 78 56 34 12 \tsha1msg1 0x12345678(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 84 05 78 56 34 12 \tsha1msg1 0x12345678(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 84 08 78 56 34 12 \tsha1msg1 0x12345678(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 84 c8 78 56 34 12 \tsha1msg1 0x12345678(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xca, 0xc1, }, 4, 0, "", "",
+"0f 38 ca c1          \tsha1msg2 %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xca, 0xd7, }, 4, 0, "", "",
+"0f 38 ca d7          \tsha1msg2 %xmm7,%xmm2",},
+{{0x0f, 0x38, 0xca, 0x00, }, 4, 0, "", "",
+"0f 38 ca 00          \tsha1msg2 (%eax),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 ca 05 78 56 34 12 \tsha1msg2 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xca, 0x18, }, 4, 0, "", "",
+"0f 38 ca 18          \tsha1msg2 (%eax),%xmm3",},
+{{0x0f, 0x38, 0xca, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 ca 04 01       \tsha1msg2 (%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 04 05 78 56 34 12 \tsha1msg2 0x12345678(,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 ca 04 08       \tsha1msg2 (%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 ca 04 c8       \tsha1msg2 (%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 ca 40 12       \tsha1msg2 0x12(%eax),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 ca 45 12       \tsha1msg2 0x12(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 ca 44 01 12    \tsha1msg2 0x12(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 ca 44 05 12    \tsha1msg2 0x12(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 ca 44 08 12    \tsha1msg2 0x12(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 ca 44 c8 12    \tsha1msg2 0x12(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 ca 80 78 56 34 12 \tsha1msg2 0x12345678(%eax),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 ca 85 78 56 34 12 \tsha1msg2 0x12345678(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 84 01 78 56 34 12 \tsha1msg2 0x12345678(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 84 05 78 56 34 12 \tsha1msg2 0x12345678(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 84 08 78 56 34 12 \tsha1msg2 0x12345678(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 84 c8 78 56 34 12 \tsha1msg2 0x12345678(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xcb, 0xcc, }, 4, 0, "", "",
+"0f 38 cb cc          \tsha256rnds2 %xmm0,%xmm4,%xmm1",},
+{{0x0f, 0x38, 0xcb, 0xd7, }, 4, 0, "", "",
+"0f 38 cb d7          \tsha256rnds2 %xmm0,%xmm7,%xmm2",},
+{{0x0f, 0x38, 0xcb, 0x08, }, 4, 0, "", "",
+"0f 38 cb 08          \tsha256rnds2 %xmm0,(%eax),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x0d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cb 0d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678,%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x18, }, 4, 0, "", "",
+"0f 38 cb 18          \tsha256rnds2 %xmm0,(%eax),%xmm3",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0x01, }, 5, 0, "", "",
+"0f 38 cb 0c 01       \tsha256rnds2 %xmm0,(%ecx,%eax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 0c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(,%eax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0x08, }, 5, 0, "", "",
+"0f 38 cb 0c 08       \tsha256rnds2 %xmm0,(%eax,%ecx,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0xc8, }, 5, 0, "", "",
+"0f 38 cb 0c c8       \tsha256rnds2 %xmm0,(%eax,%ecx,8),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x48, 0x12, }, 5, 0, "", "",
+"0f 38 cb 48 12       \tsha256rnds2 %xmm0,0x12(%eax),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4d, 0x12, }, 5, 0, "", "",
+"0f 38 cb 4d 12       \tsha256rnds2 %xmm0,0x12(%ebp),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4c, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 cb 4c 01 12    \tsha256rnds2 %xmm0,0x12(%ecx,%eax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4c, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 cb 4c 05 12    \tsha256rnds2 %xmm0,0x12(%ebp,%eax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4c, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 cb 4c 08 12    \tsha256rnds2 %xmm0,0x12(%eax,%ecx,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4c, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 cb 4c c8 12    \tsha256rnds2 %xmm0,0x12(%eax,%ecx,8),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x88, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cb 88 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cb 8d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ebp),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8c, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 8c 01 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ecx,%eax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 8c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ebp,%eax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8c, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 8c 08 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax,%ecx,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 8c c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax,%ecx,8),%xmm1",},
+{{0x0f, 0x38, 0xcc, 0xc1, }, 4, 0, "", "",
+"0f 38 cc c1          \tsha256msg1 %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xcc, 0xd7, }, 4, 0, "", "",
+"0f 38 cc d7          \tsha256msg1 %xmm7,%xmm2",},
+{{0x0f, 0x38, 0xcc, 0x00, }, 4, 0, "", "",
+"0f 38 cc 00          \tsha256msg1 (%eax),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cc 05 78 56 34 12 \tsha256msg1 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x18, }, 4, 0, "", "",
+"0f 38 cc 18          \tsha256msg1 (%eax),%xmm3",},
+{{0x0f, 0x38, 0xcc, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 cc 04 01       \tsha256msg1 (%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 04 05 78 56 34 12 \tsha256msg1 0x12345678(,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 cc 04 08       \tsha256msg1 (%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 cc 04 c8       \tsha256msg1 (%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 cc 40 12       \tsha256msg1 0x12(%eax),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 cc 45 12       \tsha256msg1 0x12(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 cc 44 01 12    \tsha256msg1 0x12(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 cc 44 05 12    \tsha256msg1 0x12(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 cc 44 08 12    \tsha256msg1 0x12(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 cc 44 c8 12    \tsha256msg1 0x12(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cc 80 78 56 34 12 \tsha256msg1 0x12345678(%eax),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cc 85 78 56 34 12 \tsha256msg1 0x12345678(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 84 01 78 56 34 12 \tsha256msg1 0x12345678(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 84 05 78 56 34 12 \tsha256msg1 0x12345678(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 84 08 78 56 34 12 \tsha256msg1 0x12345678(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 84 c8 78 56 34 12 \tsha256msg1 0x12345678(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0xc1, }, 4, 0, "", "",
+"0f 38 cd c1          \tsha256msg2 %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xcd, 0xd7, }, 4, 0, "", "",
+"0f 38 cd d7          \tsha256msg2 %xmm7,%xmm2",},
+{{0x0f, 0x38, 0xcd, 0x00, }, 4, 0, "", "",
+"0f 38 cd 00          \tsha256msg2 (%eax),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cd 05 78 56 34 12 \tsha256msg2 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x18, }, 4, 0, "", "",
+"0f 38 cd 18          \tsha256msg2 (%eax),%xmm3",},
+{{0x0f, 0x38, 0xcd, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 cd 04 01       \tsha256msg2 (%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 04 05 78 56 34 12 \tsha256msg2 0x12345678(,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 cd 04 08       \tsha256msg2 (%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 cd 04 c8       \tsha256msg2 (%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 cd 40 12       \tsha256msg2 0x12(%eax),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 cd 45 12       \tsha256msg2 0x12(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 cd 44 01 12    \tsha256msg2 0x12(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 cd 44 05 12    \tsha256msg2 0x12(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 cd 44 08 12    \tsha256msg2 0x12(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 cd 44 c8 12    \tsha256msg2 0x12(%eax,%ecx,8),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cd 80 78 56 34 12 \tsha256msg2 0x12345678(%eax),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cd 85 78 56 34 12 \tsha256msg2 0x12345678(%ebp),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 84 01 78 56 34 12 \tsha256msg2 0x12345678(%ecx,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 84 05 78 56 34 12 \tsha256msg2 0x12345678(%ebp,%eax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,8),%xmm0",},
diff --git a/tools/perf/tests/insn-x86-dat-64.c b/tools/perf/tests/insn-x86-dat-64.c
index 01122421a776..13f008588590 100644
--- a/tools/perf/tests/insn-x86-dat-64.c
+++ b/tools/perf/tests/insn-x86-dat-64.c
@@ -338,3 +338,367 @@
 "67 f2 ff 21          \tbnd jmpq *(%ecx)",},
 {{0xf2, 0x0f, 0x85, 0x00, 0x00, 0x00, 0x00, }, 7, 0, "jcc", "conditional",
 "f2 0f 85 00 00 00 00 \tbnd jne 413 <main+0x413>",},
+{{0x0f, 0x3a, 0xcc, 0xc1, 0x00, }, 5, 0, "", "",
+"0f 3a cc c1 00       \tsha1rnds4 $0x0,%xmm1,%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0xd7, 0x91, }, 5, 0, "", "",
+"0f 3a cc d7 91       \tsha1rnds4 $0x91,%xmm7,%xmm2",},
+{{0x41, 0x0f, 0x3a, 0xcc, 0xc0, 0x91, }, 6, 0, "", "",
+"41 0f 3a cc c0 91    \tsha1rnds4 $0x91,%xmm8,%xmm0",},
+{{0x44, 0x0f, 0x3a, 0xcc, 0xc7, 0x91, }, 6, 0, "", "",
+"44 0f 3a cc c7 91    \tsha1rnds4 $0x91,%xmm7,%xmm8",},
+{{0x45, 0x0f, 0x3a, 0xcc, 0xc7, 0x91, }, 6, 0, "", "",
+"45 0f 3a cc c7 91    \tsha1rnds4 $0x91,%xmm15,%xmm8",},
+{{0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 5, 0, "", "",
+"0f 3a cc 00 91       \tsha1rnds4 $0x91,(%rax),%xmm0",},
+{{0x41, 0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 6, 0, "", "",
+"41 0f 3a cc 00 91    \tsha1rnds4 $0x91,(%r8),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 04 25 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678,%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x18, 0x91, }, 5, 0, "", "",
+"0f 3a cc 18 91       \tsha1rnds4 $0x91,(%rax),%xmm3",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0x01, 0x91, }, 6, 0, "", "",
+"0f 3a cc 04 01 91    \tsha1rnds4 $0x91,(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 04 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(,%rax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0x08, 0x91, }, 6, 0, "", "",
+"0f 3a cc 04 08 91    \tsha1rnds4 $0x91,(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x04, 0xc8, 0x91, }, 6, 0, "", "",
+"0f 3a cc 04 c8 91    \tsha1rnds4 $0x91,(%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x40, 0x12, 0x91, }, 6, 0, "", "",
+"0f 3a cc 40 12 91    \tsha1rnds4 $0x91,0x12(%rax),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x45, 0x12, 0x91, }, 6, 0, "", "",
+"0f 3a cc 45 12 91    \tsha1rnds4 $0x91,0x12(%rbp),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x44, 0x01, 0x12, 0x91, }, 7, 0, "", "",
+"0f 3a cc 44 01 12 91 \tsha1rnds4 $0x91,0x12(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x44, 0x05, 0x12, 0x91, }, 7, 0, "", "",
+"0f 3a cc 44 05 12 91 \tsha1rnds4 $0x91,0x12(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x44, 0x08, 0x12, 0x91, }, 7, 0, "", "",
+"0f 3a cc 44 08 12 91 \tsha1rnds4 $0x91,0x12(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x44, 0xc8, 0x12, 0x91, }, 7, 0, "", "",
+"0f 3a cc 44 c8 12 91 \tsha1rnds4 $0x91,0x12(%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
+"0f 3a cc 80 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
+"0f 3a cc 85 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rbp),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 84 01 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 84 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 84 08 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x3a, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
+"0f 3a cc 84 c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm0",},
+{{0x44, 0x0f, 0x3a, 0xcc, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 11, 0, "", "",
+"44 0f 3a cc bc c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm15",},
+{{0x0f, 0x38, 0xc8, 0xc1, }, 4, 0, "", "",
+"0f 38 c8 c1          \tsha1nexte %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xc8, 0xd7, }, 4, 0, "", "",
+"0f 38 c8 d7          \tsha1nexte %xmm7,%xmm2",},
+{{0x41, 0x0f, 0x38, 0xc8, 0xc0, }, 5, 0, "", "",
+"41 0f 38 c8 c0       \tsha1nexte %xmm8,%xmm0",},
+{{0x44, 0x0f, 0x38, 0xc8, 0xc7, }, 5, 0, "", "",
+"44 0f 38 c8 c7       \tsha1nexte %xmm7,%xmm8",},
+{{0x45, 0x0f, 0x38, 0xc8, 0xc7, }, 5, 0, "", "",
+"45 0f 38 c8 c7       \tsha1nexte %xmm15,%xmm8",},
+{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
+"0f 38 c8 00          \tsha1nexte (%rax),%xmm0",},
+{{0x41, 0x0f, 0x38, 0xc8, 0x00, }, 5, 0, "", "",
+"41 0f 38 c8 00       \tsha1nexte (%r8),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 04 25 78 56 34 12 \tsha1nexte 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x18, }, 4, 0, "", "",
+"0f 38 c8 18          \tsha1nexte (%rax),%xmm3",},
+{{0x0f, 0x38, 0xc8, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 c8 04 01       \tsha1nexte (%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 04 05 78 56 34 12 \tsha1nexte 0x12345678(,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 c8 04 08       \tsha1nexte (%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 c8 04 c8       \tsha1nexte (%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 c8 40 12       \tsha1nexte 0x12(%rax),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 c8 45 12       \tsha1nexte 0x12(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 c8 44 01 12    \tsha1nexte 0x12(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 c8 44 05 12    \tsha1nexte 0x12(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 c8 44 08 12    \tsha1nexte 0x12(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 c8 44 c8 12    \tsha1nexte 0x12(%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c8 80 78 56 34 12 \tsha1nexte 0x12345678(%rax),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c8 85 78 56 34 12 \tsha1nexte 0x12345678(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 84 01 78 56 34 12 \tsha1nexte 0x12345678(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 84 05 78 56 34 12 \tsha1nexte 0x12345678(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 84 08 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xc8, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c8 84 c8 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,8),%xmm0",},
+{{0x44, 0x0f, 0x38, 0xc8, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
+"44 0f 38 c8 bc c8 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,8),%xmm15",},
+{{0x0f, 0x38, 0xc9, 0xc1, }, 4, 0, "", "",
+"0f 38 c9 c1          \tsha1msg1 %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xc9, 0xd7, }, 4, 0, "", "",
+"0f 38 c9 d7          \tsha1msg1 %xmm7,%xmm2",},
+{{0x41, 0x0f, 0x38, 0xc9, 0xc0, }, 5, 0, "", "",
+"41 0f 38 c9 c0       \tsha1msg1 %xmm8,%xmm0",},
+{{0x44, 0x0f, 0x38, 0xc9, 0xc7, }, 5, 0, "", "",
+"44 0f 38 c9 c7       \tsha1msg1 %xmm7,%xmm8",},
+{{0x45, 0x0f, 0x38, 0xc9, 0xc7, }, 5, 0, "", "",
+"45 0f 38 c9 c7       \tsha1msg1 %xmm15,%xmm8",},
+{{0x0f, 0x38, 0xc9, 0x00, }, 4, 0, "", "",
+"0f 38 c9 00          \tsha1msg1 (%rax),%xmm0",},
+{{0x41, 0x0f, 0x38, 0xc9, 0x00, }, 5, 0, "", "",
+"41 0f 38 c9 00       \tsha1msg1 (%r8),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 04 25 78 56 34 12 \tsha1msg1 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x18, }, 4, 0, "", "",
+"0f 38 c9 18          \tsha1msg1 (%rax),%xmm3",},
+{{0x0f, 0x38, 0xc9, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 c9 04 01       \tsha1msg1 (%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 04 05 78 56 34 12 \tsha1msg1 0x12345678(,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 c9 04 08       \tsha1msg1 (%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 c9 04 c8       \tsha1msg1 (%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 c9 40 12       \tsha1msg1 0x12(%rax),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 c9 45 12       \tsha1msg1 0x12(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 c9 44 01 12    \tsha1msg1 0x12(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 c9 44 05 12    \tsha1msg1 0x12(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 c9 44 08 12    \tsha1msg1 0x12(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 c9 44 c8 12    \tsha1msg1 0x12(%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c9 80 78 56 34 12 \tsha1msg1 0x12345678(%rax),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 c9 85 78 56 34 12 \tsha1msg1 0x12345678(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 84 01 78 56 34 12 \tsha1msg1 0x12345678(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 84 05 78 56 34 12 \tsha1msg1 0x12345678(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 84 08 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xc9, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 c9 84 c8 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,8),%xmm0",},
+{{0x44, 0x0f, 0x38, 0xc9, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
+"44 0f 38 c9 bc c8 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,8),%xmm15",},
+{{0x0f, 0x38, 0xca, 0xc1, }, 4, 0, "", "",
+"0f 38 ca c1          \tsha1msg2 %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xca, 0xd7, }, 4, 0, "", "",
+"0f 38 ca d7          \tsha1msg2 %xmm7,%xmm2",},
+{{0x41, 0x0f, 0x38, 0xca, 0xc0, }, 5, 0, "", "",
+"41 0f 38 ca c0       \tsha1msg2 %xmm8,%xmm0",},
+{{0x44, 0x0f, 0x38, 0xca, 0xc7, }, 5, 0, "", "",
+"44 0f 38 ca c7       \tsha1msg2 %xmm7,%xmm8",},
+{{0x45, 0x0f, 0x38, 0xca, 0xc7, }, 5, 0, "", "",
+"45 0f 38 ca c7       \tsha1msg2 %xmm15,%xmm8",},
+{{0x0f, 0x38, 0xca, 0x00, }, 4, 0, "", "",
+"0f 38 ca 00          \tsha1msg2 (%rax),%xmm0",},
+{{0x41, 0x0f, 0x38, 0xca, 0x00, }, 5, 0, "", "",
+"41 0f 38 ca 00       \tsha1msg2 (%r8),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 04 25 78 56 34 12 \tsha1msg2 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xca, 0x18, }, 4, 0, "", "",
+"0f 38 ca 18          \tsha1msg2 (%rax),%xmm3",},
+{{0x0f, 0x38, 0xca, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 ca 04 01       \tsha1msg2 (%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 04 05 78 56 34 12 \tsha1msg2 0x12345678(,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 ca 04 08       \tsha1msg2 (%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 ca 04 c8       \tsha1msg2 (%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 ca 40 12       \tsha1msg2 0x12(%rax),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 ca 45 12       \tsha1msg2 0x12(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 ca 44 01 12    \tsha1msg2 0x12(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 ca 44 05 12    \tsha1msg2 0x12(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 ca 44 08 12    \tsha1msg2 0x12(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 ca 44 c8 12    \tsha1msg2 0x12(%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 ca 80 78 56 34 12 \tsha1msg2 0x12345678(%rax),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 ca 85 78 56 34 12 \tsha1msg2 0x12345678(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 84 01 78 56 34 12 \tsha1msg2 0x12345678(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 84 05 78 56 34 12 \tsha1msg2 0x12345678(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 84 08 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xca, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 ca 84 c8 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,8),%xmm0",},
+{{0x44, 0x0f, 0x38, 0xca, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
+"44 0f 38 ca bc c8 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,8),%xmm15",},
+{{0x0f, 0x38, 0xcb, 0xcc, }, 4, 0, "", "",
+"0f 38 cb cc          \tsha256rnds2 %xmm0,%xmm4,%xmm1",},
+{{0x0f, 0x38, 0xcb, 0xd7, }, 4, 0, "", "",
+"0f 38 cb d7          \tsha256rnds2 %xmm0,%xmm7,%xmm2",},
+{{0x41, 0x0f, 0x38, 0xcb, 0xc8, }, 5, 0, "", "",
+"41 0f 38 cb c8       \tsha256rnds2 %xmm0,%xmm8,%xmm1",},
+{{0x44, 0x0f, 0x38, 0xcb, 0xc7, }, 5, 0, "", "",
+"44 0f 38 cb c7       \tsha256rnds2 %xmm0,%xmm7,%xmm8",},
+{{0x45, 0x0f, 0x38, 0xcb, 0xc7, }, 5, 0, "", "",
+"45 0f 38 cb c7       \tsha256rnds2 %xmm0,%xmm15,%xmm8",},
+{{0x0f, 0x38, 0xcb, 0x08, }, 4, 0, "", "",
+"0f 38 cb 08          \tsha256rnds2 %xmm0,(%rax),%xmm1",},
+{{0x41, 0x0f, 0x38, 0xcb, 0x08, }, 5, 0, "", "",
+"41 0f 38 cb 08       \tsha256rnds2 %xmm0,(%r8),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 0c 25 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678,%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x18, }, 4, 0, "", "",
+"0f 38 cb 18          \tsha256rnds2 %xmm0,(%rax),%xmm3",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0x01, }, 5, 0, "", "",
+"0f 38 cb 0c 01       \tsha256rnds2 %xmm0,(%rcx,%rax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 0c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(,%rax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0x08, }, 5, 0, "", "",
+"0f 38 cb 0c 08       \tsha256rnds2 %xmm0,(%rax,%rcx,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x0c, 0xc8, }, 5, 0, "", "",
+"0f 38 cb 0c c8       \tsha256rnds2 %xmm0,(%rax,%rcx,8),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x48, 0x12, }, 5, 0, "", "",
+"0f 38 cb 48 12       \tsha256rnds2 %xmm0,0x12(%rax),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4d, 0x12, }, 5, 0, "", "",
+"0f 38 cb 4d 12       \tsha256rnds2 %xmm0,0x12(%rbp),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4c, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 cb 4c 01 12    \tsha256rnds2 %xmm0,0x12(%rcx,%rax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4c, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 cb 4c 05 12    \tsha256rnds2 %xmm0,0x12(%rbp,%rax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4c, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 cb 4c 08 12    \tsha256rnds2 %xmm0,0x12(%rax,%rcx,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x4c, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 cb 4c c8 12    \tsha256rnds2 %xmm0,0x12(%rax,%rcx,8),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x88, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cb 88 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cb 8d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rbp),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8c, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 8c 01 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rcx,%rax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 8c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rbp,%rax,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8c, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 8c 08 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,1),%xmm1",},
+{{0x0f, 0x38, 0xcb, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cb 8c c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm1",},
+{{0x44, 0x0f, 0x38, 0xcb, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
+"44 0f 38 cb bc c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm15",},
+{{0x0f, 0x38, 0xcc, 0xc1, }, 4, 0, "", "",
+"0f 38 cc c1          \tsha256msg1 %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xcc, 0xd7, }, 4, 0, "", "",
+"0f 38 cc d7          \tsha256msg1 %xmm7,%xmm2",},
+{{0x41, 0x0f, 0x38, 0xcc, 0xc0, }, 5, 0, "", "",
+"41 0f 38 cc c0       \tsha256msg1 %xmm8,%xmm0",},
+{{0x44, 0x0f, 0x38, 0xcc, 0xc7, }, 5, 0, "", "",
+"44 0f 38 cc c7       \tsha256msg1 %xmm7,%xmm8",},
+{{0x45, 0x0f, 0x38, 0xcc, 0xc7, }, 5, 0, "", "",
+"45 0f 38 cc c7       \tsha256msg1 %xmm15,%xmm8",},
+{{0x0f, 0x38, 0xcc, 0x00, }, 4, 0, "", "",
+"0f 38 cc 00          \tsha256msg1 (%rax),%xmm0",},
+{{0x41, 0x0f, 0x38, 0xcc, 0x00, }, 5, 0, "", "",
+"41 0f 38 cc 00       \tsha256msg1 (%r8),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 04 25 78 56 34 12 \tsha256msg1 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x18, }, 4, 0, "", "",
+"0f 38 cc 18          \tsha256msg1 (%rax),%xmm3",},
+{{0x0f, 0x38, 0xcc, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 cc 04 01       \tsha256msg1 (%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 04 05 78 56 34 12 \tsha256msg1 0x12345678(,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 cc 04 08       \tsha256msg1 (%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 cc 04 c8       \tsha256msg1 (%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 cc 40 12       \tsha256msg1 0x12(%rax),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 cc 45 12       \tsha256msg1 0x12(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 cc 44 01 12    \tsha256msg1 0x12(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 cc 44 05 12    \tsha256msg1 0x12(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 cc 44 08 12    \tsha256msg1 0x12(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 cc 44 c8 12    \tsha256msg1 0x12(%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cc 80 78 56 34 12 \tsha256msg1 0x12345678(%rax),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cc 85 78 56 34 12 \tsha256msg1 0x12345678(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 84 01 78 56 34 12 \tsha256msg1 0x12345678(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 84 05 78 56 34 12 \tsha256msg1 0x12345678(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 84 08 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cc 84 c8 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,8),%xmm0",},
+{{0x44, 0x0f, 0x38, 0xcc, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
+"44 0f 38 cc bc c8 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,8),%xmm15",},
+{{0x0f, 0x38, 0xcd, 0xc1, }, 4, 0, "", "",
+"0f 38 cd c1          \tsha256msg2 %xmm1,%xmm0",},
+{{0x0f, 0x38, 0xcd, 0xd7, }, 4, 0, "", "",
+"0f 38 cd d7          \tsha256msg2 %xmm7,%xmm2",},
+{{0x41, 0x0f, 0x38, 0xcd, 0xc0, }, 5, 0, "", "",
+"41 0f 38 cd c0       \tsha256msg2 %xmm8,%xmm0",},
+{{0x44, 0x0f, 0x38, 0xcd, 0xc7, }, 5, 0, "", "",
+"44 0f 38 cd c7       \tsha256msg2 %xmm7,%xmm8",},
+{{0x45, 0x0f, 0x38, 0xcd, 0xc7, }, 5, 0, "", "",
+"45 0f 38 cd c7       \tsha256msg2 %xmm15,%xmm8",},
+{{0x0f, 0x38, 0xcd, 0x00, }, 4, 0, "", "",
+"0f 38 cd 00          \tsha256msg2 (%rax),%xmm0",},
+{{0x41, 0x0f, 0x38, 0xcd, 0x00, }, 5, 0, "", "",
+"41 0f 38 cd 00       \tsha256msg2 (%r8),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 04 25 78 56 34 12 \tsha256msg2 0x12345678,%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x18, }, 4, 0, "", "",
+"0f 38 cd 18          \tsha256msg2 (%rax),%xmm3",},
+{{0x0f, 0x38, 0xcd, 0x04, 0x01, }, 5, 0, "", "",
+"0f 38 cd 04 01       \tsha256msg2 (%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 04 05 78 56 34 12 \tsha256msg2 0x12345678(,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x04, 0x08, }, 5, 0, "", "",
+"0f 38 cd 04 08       \tsha256msg2 (%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x04, 0xc8, }, 5, 0, "", "",
+"0f 38 cd 04 c8       \tsha256msg2 (%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x40, 0x12, }, 5, 0, "", "",
+"0f 38 cd 40 12       \tsha256msg2 0x12(%rax),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x45, 0x12, }, 5, 0, "", "",
+"0f 38 cd 45 12       \tsha256msg2 0x12(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x44, 0x01, 0x12, }, 6, 0, "", "",
+"0f 38 cd 44 01 12    \tsha256msg2 0x12(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x44, 0x05, 0x12, }, 6, 0, "", "",
+"0f 38 cd 44 05 12    \tsha256msg2 0x12(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x44, 0x08, 0x12, }, 6, 0, "", "",
+"0f 38 cd 44 08 12    \tsha256msg2 0x12(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
+"0f 38 cd 44 c8 12    \tsha256msg2 0x12(%rax,%rcx,8),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cd 80 78 56 34 12 \tsha256msg2 0x12345678(%rax),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"0f 38 cd 85 78 56 34 12 \tsha256msg2 0x12345678(%rbp),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 84 01 78 56 34 12 \tsha256msg2 0x12345678(%rcx,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 84 05 78 56 34 12 \tsha256msg2 0x12345678(%rbp,%rax,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,1),%xmm0",},
+{{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm0",},
+{{0x44, 0x0f, 0x38, 0xcd, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
+"44 0f 38 cd bc c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm15",},
diff --git a/tools/perf/tests/insn-x86-dat-src.c b/tools/perf/tests/insn-x86-dat-src.c
index b506830f33a8..7d06c9b22070 100644
--- a/tools/perf/tests/insn-x86-dat-src.c
+++ b/tools/perf/tests/insn-x86-dat-src.c
@@ -217,6 +217,210 @@ int main(void)
 	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
 	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0 */
 
+	/* sha1rnds4 imm8, xmm2/m128, xmm1 */
+
+	asm volatile("sha1rnds4 $0x0, %xmm1, %xmm0");
+	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm2");
+	asm volatile("sha1rnds4 $0x91, %xmm8, %xmm0");
+	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm8");
+	asm volatile("sha1rnds4 $0x91, %xmm15, %xmm8");
+	asm volatile("sha1rnds4 $0x91, (%rax), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (%r8), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (0x12345678), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (%rax), %xmm3");
+	asm volatile("sha1rnds4 $0x91, (%rcx,%rax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(,%rax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (%rax,%rcx,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (%rax,%rcx,8), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%rax), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%rbp), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%rcx,%rax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%rbp,%rax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%rax,%rcx,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%rax,%rcx,8), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%rbp), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%rcx,%rax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%rbp,%rax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,8), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,8), %xmm15");
+
+	/* sha1nexte xmm2/m128, xmm1 */
+
+	asm volatile("sha1nexte %xmm1, %xmm0");
+	asm volatile("sha1nexte %xmm7, %xmm2");
+	asm volatile("sha1nexte %xmm8, %xmm0");
+	asm volatile("sha1nexte %xmm7, %xmm8");
+	asm volatile("sha1nexte %xmm15, %xmm8");
+	asm volatile("sha1nexte (%rax), %xmm0");
+	asm volatile("sha1nexte (%r8), %xmm0");
+	asm volatile("sha1nexte (0x12345678), %xmm0");
+	asm volatile("sha1nexte (%rax), %xmm3");
+	asm volatile("sha1nexte (%rcx,%rax,1), %xmm0");
+	asm volatile("sha1nexte 0x12345678(,%rax,1), %xmm0");
+	asm volatile("sha1nexte (%rax,%rcx,1), %xmm0");
+	asm volatile("sha1nexte (%rax,%rcx,8), %xmm0");
+	asm volatile("sha1nexte 0x12(%rax), %xmm0");
+	asm volatile("sha1nexte 0x12(%rbp), %xmm0");
+	asm volatile("sha1nexte 0x12(%rcx,%rax,1), %xmm0");
+	asm volatile("sha1nexte 0x12(%rbp,%rax,1), %xmm0");
+	asm volatile("sha1nexte 0x12(%rax,%rcx,1), %xmm0");
+	asm volatile("sha1nexte 0x12(%rax,%rcx,8), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%rax), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%rbp), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%rcx,%rax,1), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%rbp,%rax,1), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%rax,%rcx,1), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%rax,%rcx,8), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%rax,%rcx,8), %xmm15");
+
+	/* sha1msg1 xmm2/m128, xmm1 */
+
+	asm volatile("sha1msg1 %xmm1, %xmm0");
+	asm volatile("sha1msg1 %xmm7, %xmm2");
+	asm volatile("sha1msg1 %xmm8, %xmm0");
+	asm volatile("sha1msg1 %xmm7, %xmm8");
+	asm volatile("sha1msg1 %xmm15, %xmm8");
+	asm volatile("sha1msg1 (%rax), %xmm0");
+	asm volatile("sha1msg1 (%r8), %xmm0");
+	asm volatile("sha1msg1 (0x12345678), %xmm0");
+	asm volatile("sha1msg1 (%rax), %xmm3");
+	asm volatile("sha1msg1 (%rcx,%rax,1), %xmm0");
+	asm volatile("sha1msg1 0x12345678(,%rax,1), %xmm0");
+	asm volatile("sha1msg1 (%rax,%rcx,1), %xmm0");
+	asm volatile("sha1msg1 (%rax,%rcx,8), %xmm0");
+	asm volatile("sha1msg1 0x12(%rax), %xmm0");
+	asm volatile("sha1msg1 0x12(%rbp), %xmm0");
+	asm volatile("sha1msg1 0x12(%rcx,%rax,1), %xmm0");
+	asm volatile("sha1msg1 0x12(%rbp,%rax,1), %xmm0");
+	asm volatile("sha1msg1 0x12(%rax,%rcx,1), %xmm0");
+	asm volatile("sha1msg1 0x12(%rax,%rcx,8), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%rax), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%rbp), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%rcx,%rax,1), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%rbp,%rax,1), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%rax,%rcx,1), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%rax,%rcx,8), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%rax,%rcx,8), %xmm15");
+
+	/* sha1msg2 xmm2/m128, xmm1 */
+
+	asm volatile("sha1msg2 %xmm1, %xmm0");
+	asm volatile("sha1msg2 %xmm7, %xmm2");
+	asm volatile("sha1msg2 %xmm8, %xmm0");
+	asm volatile("sha1msg2 %xmm7, %xmm8");
+	asm volatile("sha1msg2 %xmm15, %xmm8");
+	asm volatile("sha1msg2 (%rax), %xmm0");
+	asm volatile("sha1msg2 (%r8), %xmm0");
+	asm volatile("sha1msg2 (0x12345678), %xmm0");
+	asm volatile("sha1msg2 (%rax), %xmm3");
+	asm volatile("sha1msg2 (%rcx,%rax,1), %xmm0");
+	asm volatile("sha1msg2 0x12345678(,%rax,1), %xmm0");
+	asm volatile("sha1msg2 (%rax,%rcx,1), %xmm0");
+	asm volatile("sha1msg2 (%rax,%rcx,8), %xmm0");
+	asm volatile("sha1msg2 0x12(%rax), %xmm0");
+	asm volatile("sha1msg2 0x12(%rbp), %xmm0");
+	asm volatile("sha1msg2 0x12(%rcx,%rax,1), %xmm0");
+	asm volatile("sha1msg2 0x12(%rbp,%rax,1), %xmm0");
+	asm volatile("sha1msg2 0x12(%rax,%rcx,1), %xmm0");
+	asm volatile("sha1msg2 0x12(%rax,%rcx,8), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%rax), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%rbp), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%rcx,%rax,1), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%rbp,%rax,1), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%rax,%rcx,1), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%rax,%rcx,8), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%rax,%rcx,8), %xmm15");
+
+	/* sha256rnds2 <XMM0>, xmm2/m128, xmm1 */
+	/* Note sha256rnds2 has an implicit operand 'xmm0' */
+
+	asm volatile("sha256rnds2 %xmm4, %xmm1");
+	asm volatile("sha256rnds2 %xmm7, %xmm2");
+	asm volatile("sha256rnds2 %xmm8, %xmm1");
+	asm volatile("sha256rnds2 %xmm7, %xmm8");
+	asm volatile("sha256rnds2 %xmm15, %xmm8");
+	asm volatile("sha256rnds2 (%rax), %xmm1");
+	asm volatile("sha256rnds2 (%r8), %xmm1");
+	asm volatile("sha256rnds2 (0x12345678), %xmm1");
+	asm volatile("sha256rnds2 (%rax), %xmm3");
+	asm volatile("sha256rnds2 (%rcx,%rax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(,%rax,1), %xmm1");
+	asm volatile("sha256rnds2 (%rax,%rcx,1), %xmm1");
+	asm volatile("sha256rnds2 (%rax,%rcx,8), %xmm1");
+	asm volatile("sha256rnds2 0x12(%rax), %xmm1");
+	asm volatile("sha256rnds2 0x12(%rbp), %xmm1");
+	asm volatile("sha256rnds2 0x12(%rcx,%rax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12(%rbp,%rax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12(%rax,%rcx,1), %xmm1");
+	asm volatile("sha256rnds2 0x12(%rax,%rcx,8), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%rax), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%rbp), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%rcx,%rax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%rbp,%rax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,1), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,8), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,8), %xmm15");
+
+	/* sha256msg1 xmm2/m128, xmm1 */
+
+	asm volatile("sha256msg1 %xmm1, %xmm0");
+	asm volatile("sha256msg1 %xmm7, %xmm2");
+	asm volatile("sha256msg1 %xmm8, %xmm0");
+	asm volatile("sha256msg1 %xmm7, %xmm8");
+	asm volatile("sha256msg1 %xmm15, %xmm8");
+	asm volatile("sha256msg1 (%rax), %xmm0");
+	asm volatile("sha256msg1 (%r8), %xmm0");
+	asm volatile("sha256msg1 (0x12345678), %xmm0");
+	asm volatile("sha256msg1 (%rax), %xmm3");
+	asm volatile("sha256msg1 (%rcx,%rax,1), %xmm0");
+	asm volatile("sha256msg1 0x12345678(,%rax,1), %xmm0");
+	asm volatile("sha256msg1 (%rax,%rcx,1), %xmm0");
+	asm volatile("sha256msg1 (%rax,%rcx,8), %xmm0");
+	asm volatile("sha256msg1 0x12(%rax), %xmm0");
+	asm volatile("sha256msg1 0x12(%rbp), %xmm0");
+	asm volatile("sha256msg1 0x12(%rcx,%rax,1), %xmm0");
+	asm volatile("sha256msg1 0x12(%rbp,%rax,1), %xmm0");
+	asm volatile("sha256msg1 0x12(%rax,%rcx,1), %xmm0");
+	asm volatile("sha256msg1 0x12(%rax,%rcx,8), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%rax), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%rbp), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%rcx,%rax,1), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%rbp,%rax,1), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%rax,%rcx,1), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%rax,%rcx,8), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%rax,%rcx,8), %xmm15");
+
+	/* sha256msg2 xmm2/m128, xmm1 */
+
+	asm volatile("sha256msg2 %xmm1, %xmm0");
+	asm volatile("sha256msg2 %xmm7, %xmm2");
+	asm volatile("sha256msg2 %xmm8, %xmm0");
+	asm volatile("sha256msg2 %xmm7, %xmm8");
+	asm volatile("sha256msg2 %xmm15, %xmm8");
+	asm volatile("sha256msg2 (%rax), %xmm0");
+	asm volatile("sha256msg2 (%r8), %xmm0");
+	asm volatile("sha256msg2 (0x12345678), %xmm0");
+	asm volatile("sha256msg2 (%rax), %xmm3");
+	asm volatile("sha256msg2 (%rcx,%rax,1), %xmm0");
+	asm volatile("sha256msg2 0x12345678(,%rax,1), %xmm0");
+	asm volatile("sha256msg2 (%rax,%rcx,1), %xmm0");
+	asm volatile("sha256msg2 (%rax,%rcx,8), %xmm0");
+	asm volatile("sha256msg2 0x12(%rax), %xmm0");
+	asm volatile("sha256msg2 0x12(%rbp), %xmm0");
+	asm volatile("sha256msg2 0x12(%rcx,%rax,1), %xmm0");
+	asm volatile("sha256msg2 0x12(%rbp,%rax,1), %xmm0");
+	asm volatile("sha256msg2 0x12(%rax,%rcx,1), %xmm0");
+	asm volatile("sha256msg2 0x12(%rax,%rcx,8), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%rax), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%rbp), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%rcx,%rax,1), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%rbp,%rax,1), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%rax,%rcx,1), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm15");
+
 #else  /* #ifdef __x86_64__ */
 
 	/* bndmk m32, bnd */
@@ -407,6 +611,175 @@ int main(void)
 	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
 	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0xfffffffc */
 
+	/* sha1rnds4 imm8, xmm2/m128, xmm1 */
+
+	asm volatile("sha1rnds4 $0x0, %xmm1, %xmm0");
+	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm2");
+	asm volatile("sha1rnds4 $0x91, (%eax), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (0x12345678), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (%eax), %xmm3");
+	asm volatile("sha1rnds4 $0x91, (%ecx,%eax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(,%eax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (%eax,%ecx,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, (%eax,%ecx,8), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%eax), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%ebp), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%ecx,%eax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%ebp,%eax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%eax,%ecx,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12(%eax,%ecx,8), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%ebp), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%ecx,%eax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%ebp,%eax,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax,%ecx,1), %xmm0");
+	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax,%ecx,8), %xmm0");
+
+	/* sha1nexte xmm2/m128, xmm1 */
+
+	asm volatile("sha1nexte %xmm1, %xmm0");
+	asm volatile("sha1nexte %xmm7, %xmm2");
+	asm volatile("sha1nexte (%eax), %xmm0");
+	asm volatile("sha1nexte (0x12345678), %xmm0");
+	asm volatile("sha1nexte (%eax), %xmm3");
+	asm volatile("sha1nexte (%ecx,%eax,1), %xmm0");
+	asm volatile("sha1nexte 0x12345678(,%eax,1), %xmm0");
+	asm volatile("sha1nexte (%eax,%ecx,1), %xmm0");
+	asm volatile("sha1nexte (%eax,%ecx,8), %xmm0");
+	asm volatile("sha1nexte 0x12(%eax), %xmm0");
+	asm volatile("sha1nexte 0x12(%ebp), %xmm0");
+	asm volatile("sha1nexte 0x12(%ecx,%eax,1), %xmm0");
+	asm volatile("sha1nexte 0x12(%ebp,%eax,1), %xmm0");
+	asm volatile("sha1nexte 0x12(%eax,%ecx,1), %xmm0");
+	asm volatile("sha1nexte 0x12(%eax,%ecx,8), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%eax), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%ebp), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%ecx,%eax,1), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%ebp,%eax,1), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%eax,%ecx,1), %xmm0");
+	asm volatile("sha1nexte 0x12345678(%eax,%ecx,8), %xmm0");
+
+	/* sha1msg1 xmm2/m128, xmm1 */
+
+	asm volatile("sha1msg1 %xmm1, %xmm0");
+	asm volatile("sha1msg1 %xmm7, %xmm2");
+	asm volatile("sha1msg1 (%eax), %xmm0");
+	asm volatile("sha1msg1 (0x12345678), %xmm0");
+	asm volatile("sha1msg1 (%eax), %xmm3");
+	asm volatile("sha1msg1 (%ecx,%eax,1), %xmm0");
+	asm volatile("sha1msg1 0x12345678(,%eax,1), %xmm0");
+	asm volatile("sha1msg1 (%eax,%ecx,1), %xmm0");
+	asm volatile("sha1msg1 (%eax,%ecx,8), %xmm0");
+	asm volatile("sha1msg1 0x12(%eax), %xmm0");
+	asm volatile("sha1msg1 0x12(%ebp), %xmm0");
+	asm volatile("sha1msg1 0x12(%ecx,%eax,1), %xmm0");
+	asm volatile("sha1msg1 0x12(%ebp,%eax,1), %xmm0");
+	asm volatile("sha1msg1 0x12(%eax,%ecx,1), %xmm0");
+	asm volatile("sha1msg1 0x12(%eax,%ecx,8), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%eax), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%ebp), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%ecx,%eax,1), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%ebp,%eax,1), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%eax,%ecx,1), %xmm0");
+	asm volatile("sha1msg1 0x12345678(%eax,%ecx,8), %xmm0");
+
+	/* sha1msg2 xmm2/m128, xmm1 */
+
+	asm volatile("sha1msg2 %xmm1, %xmm0");
+	asm volatile("sha1msg2 %xmm7, %xmm2");
+	asm volatile("sha1msg2 (%eax), %xmm0");
+	asm volatile("sha1msg2 (0x12345678), %xmm0");
+	asm volatile("sha1msg2 (%eax), %xmm3");
+	asm volatile("sha1msg2 (%ecx,%eax,1), %xmm0");
+	asm volatile("sha1msg2 0x12345678(,%eax,1), %xmm0");
+	asm volatile("sha1msg2 (%eax,%ecx,1), %xmm0");
+	asm volatile("sha1msg2 (%eax,%ecx,8), %xmm0");
+	asm volatile("sha1msg2 0x12(%eax), %xmm0");
+	asm volatile("sha1msg2 0x12(%ebp), %xmm0");
+	asm volatile("sha1msg2 0x12(%ecx,%eax,1), %xmm0");
+	asm volatile("sha1msg2 0x12(%ebp,%eax,1), %xmm0");
+	asm volatile("sha1msg2 0x12(%eax,%ecx,1), %xmm0");
+	asm volatile("sha1msg2 0x12(%eax,%ecx,8), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%eax), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%ebp), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%ecx,%eax,1), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%ebp,%eax,1), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%eax,%ecx,1), %xmm0");
+	asm volatile("sha1msg2 0x12345678(%eax,%ecx,8), %xmm0");
+
+	/* sha256rnds2 <XMM0>, xmm2/m128, xmm1 */
+	/* Note sha256rnds2 has an implicit operand 'xmm0' */
+
+	asm volatile("sha256rnds2 %xmm4, %xmm1");
+	asm volatile("sha256rnds2 %xmm7, %xmm2");
+	asm volatile("sha256rnds2 (%eax), %xmm1");
+	asm volatile("sha256rnds2 (0x12345678), %xmm1");
+	asm volatile("sha256rnds2 (%eax), %xmm3");
+	asm volatile("sha256rnds2 (%ecx,%eax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(,%eax,1), %xmm1");
+	asm volatile("sha256rnds2 (%eax,%ecx,1), %xmm1");
+	asm volatile("sha256rnds2 (%eax,%ecx,8), %xmm1");
+	asm volatile("sha256rnds2 0x12(%eax), %xmm1");
+	asm volatile("sha256rnds2 0x12(%ebp), %xmm1");
+	asm volatile("sha256rnds2 0x12(%ecx,%eax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12(%ebp,%eax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12(%eax,%ecx,1), %xmm1");
+	asm volatile("sha256rnds2 0x12(%eax,%ecx,8), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%eax), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%ebp), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%ecx,%eax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%ebp,%eax,1), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%eax,%ecx,1), %xmm1");
+	asm volatile("sha256rnds2 0x12345678(%eax,%ecx,8), %xmm1");
+
+	/* sha256msg1 xmm2/m128, xmm1 */
+
+	asm volatile("sha256msg1 %xmm1, %xmm0");
+	asm volatile("sha256msg1 %xmm7, %xmm2");
+	asm volatile("sha256msg1 (%eax), %xmm0");
+	asm volatile("sha256msg1 (0x12345678), %xmm0");
+	asm volatile("sha256msg1 (%eax), %xmm3");
+	asm volatile("sha256msg1 (%ecx,%eax,1), %xmm0");
+	asm volatile("sha256msg1 0x12345678(,%eax,1), %xmm0");
+	asm volatile("sha256msg1 (%eax,%ecx,1), %xmm0");
+	asm volatile("sha256msg1 (%eax,%ecx,8), %xmm0");
+	asm volatile("sha256msg1 0x12(%eax), %xmm0");
+	asm volatile("sha256msg1 0x12(%ebp), %xmm0");
+	asm volatile("sha256msg1 0x12(%ecx,%eax,1), %xmm0");
+	asm volatile("sha256msg1 0x12(%ebp,%eax,1), %xmm0");
+	asm volatile("sha256msg1 0x12(%eax,%ecx,1), %xmm0");
+	asm volatile("sha256msg1 0x12(%eax,%ecx,8), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%eax), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%ebp), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%ecx,%eax,1), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%ebp,%eax,1), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%eax,%ecx,1), %xmm0");
+	asm volatile("sha256msg1 0x12345678(%eax,%ecx,8), %xmm0");
+
+	/* sha256msg2 xmm2/m128, xmm1 */
+
+	asm volatile("sha256msg2 %xmm1, %xmm0");
+	asm volatile("sha256msg2 %xmm7, %xmm2");
+	asm volatile("sha256msg2 (%eax), %xmm0");
+	asm volatile("sha256msg2 (0x12345678), %xmm0");
+	asm volatile("sha256msg2 (%eax), %xmm3");
+	asm volatile("sha256msg2 (%ecx,%eax,1), %xmm0");
+	asm volatile("sha256msg2 0x12345678(,%eax,1), %xmm0");
+	asm volatile("sha256msg2 (%eax,%ecx,1), %xmm0");
+	asm volatile("sha256msg2 (%eax,%ecx,8), %xmm0");
+	asm volatile("sha256msg2 0x12(%eax), %xmm0");
+	asm volatile("sha256msg2 0x12(%ebp), %xmm0");
+	asm volatile("sha256msg2 0x12(%ecx,%eax,1), %xmm0");
+	asm volatile("sha256msg2 0x12(%ebp,%eax,1), %xmm0");
+	asm volatile("sha256msg2 0x12(%eax,%ecx,1), %xmm0");
+	asm volatile("sha256msg2 0x12(%eax,%ecx,8), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%eax), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%ebp), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%ecx,%eax,1), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%ebp,%eax,1), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%eax,%ecx,1), %xmm0");
+	asm volatile("sha256msg2 0x12345678(%eax,%ecx,8), %xmm0");
+
 #endif /* #ifndef __x86_64__ */
 
 	/* Following line is a marker for the awk script - do not change */
diff --git a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
index a02a195d219c..25dad388b371 100644
--- a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
+++ b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
@@ -736,6 +736,12 @@ bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
 be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
 bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
 # 0x0f 0x38 0xc0-0xff
+c8: sha1nexte Vdq,Wdq
+c9: sha1msg1 Vdq,Wdq
+ca: sha1msg2 Vdq,Wdq
+cb: sha256rnds2 Vdq,Wdq
+cc: sha256msg1 Vdq,Wdq
+cd: sha256msg2 Vdq,Wdq
 db: VAESIMC Vdq,Wdq (66),(v1)
 dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
 dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
@@ -794,6 +800,7 @@ AVXcode: 3
 61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
 62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
 63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
+cc: sha1rnds4 Vdq,Wdq,Ib
 df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
 f0: RORX Gy,Ey,Ib (F2),(v)
 EndTable
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/4] x86/insn: perf tools: Add new memory instructions
  2015-08-31 13:58 [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Adrian Hunter
                   ` (2 preceding siblings ...)
  2015-08-31 13:58 ` [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions Adrian Hunter
@ 2015-08-31 13:58 ` Adrian Hunter
  2015-08-31 14:43 ` [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Arnaldo Carvalho de Melo
  2015-09-01  8:54 ` Ingo Molnar
  5 siblings, 0 replies; 27+ messages in thread
From: Adrian Hunter @ 2015-08-31 13:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Intel Architecture Instruction Set Extensions Programming
Reference (Oct 2014) describes 3 new memory instructions,
namely clflushopt, clwb and pcommit.  Add them to the opcode
map and the perf tools new instructions test, e.g.

    $ tools/perf/perf test list 2>&1 | grep "x86 ins"
    39: Test x86 instruction decoder - new instructions
    $ tools/perf/perf test 39
    39: Test x86 instruction decoder - new instructions          : Ok

Or to see the details:

    $ tools/perf/perf test -v 39
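
As a rough illustration of what these instructions are for (not part of
the patch itself): the sketch below flushes a buffer with clflushopt and
orders the flushes with sfence.  The function and macro names are made
up for the example, the cache line size is assumed to be 64 bytes, and a
GCC-style toolchain whose assembler already knows the clflushopt
mnemonic is assumed.

#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed line size for this sketch */

/* Write back (and invalidate) every cache line covering buf[0..len). */
static inline void flush_range_clflushopt(const void *buf, size_t len)
{
	uintptr_t p = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE - 1);
	uintptr_t end = (uintptr_t)buf + len;

	if (!len)
		return;

	for (; p < end; p += CACHE_LINE)
		asm volatile("clflushopt %0" : "+m" (*(volatile char *)p));

	/* clflushopt is only weakly ordered with respect to other flushes
	 * and stores; fence before relying on the lines being written back.
	 */
	asm volatile("sfence" ::: "memory");
}

clwb would be used the same way when the line should stay resident in
the cache after the write-back.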

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 arch/x86/lib/x86-opcode-map.txt                    |  4 +-
 tools/perf/tests/insn-x86-dat-32.c                 | 22 +++++++++++
 tools/perf/tests/insn-x86-dat-64.c                 | 34 ++++++++++++++++
 tools/perf/tests/insn-x86-dat-src.c                | 46 ++++++++++++++++++++++
 .../perf/util/intel-pt-decoder/x86-opcode-map.txt  |  4 +-
 5 files changed, 106 insertions(+), 4 deletions(-)
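
The insn-x86-dat-32.c and insn-x86-dat-64.c additions below are
generated data: each entry pairs an instruction's raw bytes with the
objdump line they were produced from, so the test can compare the
decoder's result against it.  Roughly, an entry fills in a record of
the following shape (a sketch with illustrative field names only; the
real definition is in tools/perf/tests/insn-x86.c):

struct insn_test_rec {			/* illustrative name, not the real one */
	unsigned char bytes[16];	/* raw instruction bytes */
	int len;			/* expected decoded length */
	int rel;			/* expected relative jump offset, else 0 */
	const char *op;			/* expected op class ("jmp", "jcc", ...) or "" */
	const char *branch;		/* expected branch type ("indirect", ...) or "" */
	const char *objdump_line;	/* the disassembly the bytes came from */
};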

diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index 25dad388b371..f4f0451a301e 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -943,8 +943,8 @@ GrpTable: Grp15
 3: vstmxcsr Md (v1) | WRGSBASE Ry (F3),(11B)
 4: XSAVE
 5: XRSTOR | lfence (11B)
-6: XSAVEOPT | mfence (11B)
-7: clflush | sfence (11B)
+6: XSAVEOPT | clwb (66) | mfence (11B)
+7: clflush | clflushopt (66) | sfence (11B) | pcommit (66),(11B)
 EndTable
 
 GrpTable: Grp16
diff --git a/tools/perf/tests/insn-x86-dat-32.c b/tools/perf/tests/insn-x86-dat-32.c
index 83f5078e74e1..4b09b7e130a0 100644
--- a/tools/perf/tests/insn-x86-dat-32.c
+++ b/tools/perf/tests/insn-x86-dat-32.c
@@ -616,3 +616,25 @@
 "0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,1),%xmm0",},
 {{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
 "0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,8),%xmm0",},
+{{0x66, 0x0f, 0xae, 0x38, }, 4, 0, "", "",
+"66 0f ae 38          \tclflushopt (%eax)",},
+{{0x66, 0x0f, 0xae, 0x3d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f ae 3d 78 56 34 12 \tclflushopt 0x12345678",},
+{{0x66, 0x0f, 0xae, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f ae bc c8 78 56 34 12 \tclflushopt 0x12345678(%eax,%ecx,8)",},
+{{0x0f, 0xae, 0x38, }, 3, 0, "", "",
+"0f ae 38             \tclflush (%eax)",},
+{{0x0f, 0xae, 0xf8, }, 3, 0, "", "",
+"0f ae f8             \tsfence ",},
+{{0x66, 0x0f, 0xae, 0x30, }, 4, 0, "", "",
+"66 0f ae 30          \tclwb   (%eax)",},
+{{0x66, 0x0f, 0xae, 0x35, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
+"66 0f ae 35 78 56 34 12 \tclwb   0x12345678",},
+{{0x66, 0x0f, 0xae, 0xb4, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f ae b4 c8 78 56 34 12 \tclwb   0x12345678(%eax,%ecx,8)",},
+{{0x0f, 0xae, 0x30, }, 3, 0, "", "",
+"0f ae 30             \txsaveopt (%eax)",},
+{{0x0f, 0xae, 0xf0, }, 3, 0, "", "",
+"0f ae f0             \tmfence ",},
+{{0x66, 0x0f, 0xae, 0xf8, }, 4, 0, "", "",
+"66 0f ae f8          \tpcommit ",},
diff --git a/tools/perf/tests/insn-x86-dat-64.c b/tools/perf/tests/insn-x86-dat-64.c
index 13f008588590..5da235a4414f 100644
--- a/tools/perf/tests/insn-x86-dat-64.c
+++ b/tools/perf/tests/insn-x86-dat-64.c
@@ -702,3 +702,37 @@
 "0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm0",},
 {{0x44, 0x0f, 0x38, 0xcd, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
 "44 0f 38 cd bc c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm15",},
+{{0x66, 0x0f, 0xae, 0x38, }, 4, 0, "", "",
+"66 0f ae 38          \tclflushopt (%rax)",},
+{{0x66, 0x41, 0x0f, 0xae, 0x38, }, 5, 0, "", "",
+"66 41 0f ae 38       \tclflushopt (%r8)",},
+{{0x66, 0x0f, 0xae, 0x3c, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f ae 3c 25 78 56 34 12 \tclflushopt 0x12345678",},
+{{0x66, 0x0f, 0xae, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f ae bc c8 78 56 34 12 \tclflushopt 0x12345678(%rax,%rcx,8)",},
+{{0x66, 0x41, 0x0f, 0xae, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
+"66 41 0f ae bc c8 78 56 34 12 \tclflushopt 0x12345678(%r8,%rcx,8)",},
+{{0x0f, 0xae, 0x38, }, 3, 0, "", "",
+"0f ae 38             \tclflush (%rax)",},
+{{0x41, 0x0f, 0xae, 0x38, }, 4, 0, "", "",
+"41 0f ae 38          \tclflush (%r8)",},
+{{0x0f, 0xae, 0xf8, }, 3, 0, "", "",
+"0f ae f8             \tsfence ",},
+{{0x66, 0x0f, 0xae, 0x30, }, 4, 0, "", "",
+"66 0f ae 30          \tclwb   (%rax)",},
+{{0x66, 0x41, 0x0f, 0xae, 0x30, }, 5, 0, "", "",
+"66 41 0f ae 30       \tclwb   (%r8)",},
+{{0x66, 0x0f, 0xae, 0x34, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f ae 34 25 78 56 34 12 \tclwb   0x12345678",},
+{{0x66, 0x0f, 0xae, 0xb4, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
+"66 0f ae b4 c8 78 56 34 12 \tclwb   0x12345678(%rax,%rcx,8)",},
+{{0x66, 0x41, 0x0f, 0xae, 0xb4, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
+"66 41 0f ae b4 c8 78 56 34 12 \tclwb   0x12345678(%r8,%rcx,8)",},
+{{0x0f, 0xae, 0x30, }, 3, 0, "", "",
+"0f ae 30             \txsaveopt (%rax)",},
+{{0x41, 0x0f, 0xae, 0x30, }, 4, 0, "", "",
+"41 0f ae 30          \txsaveopt (%r8)",},
+{{0x0f, 0xae, 0xf0, }, 3, 0, "", "",
+"0f ae f0             \tmfence ",},
+{{0x66, 0x0f, 0xae, 0xf8, }, 4, 0, "", "",
+"66 0f ae f8          \tpcommit ",},
diff --git a/tools/perf/tests/insn-x86-dat-src.c b/tools/perf/tests/insn-x86-dat-src.c
index 7d06c9b22070..482637f44245 100644
--- a/tools/perf/tests/insn-x86-dat-src.c
+++ b/tools/perf/tests/insn-x86-dat-src.c
@@ -421,6 +421,30 @@ int main(void)
 	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm0");
 	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm15");
 
+	/* clflushopt m8 */
+
+	asm volatile("clflushopt (%rax)");
+	asm volatile("clflushopt (%r8)");
+	asm volatile("clflushopt (0x12345678)");
+	asm volatile("clflushopt 0x12345678(%rax,%rcx,8)");
+	asm volatile("clflushopt 0x12345678(%r8,%rcx,8)");
+	/* Also check instructions in the same group encoding as clflushopt */
+	asm volatile("clflush (%rax)");
+	asm volatile("clflush (%r8)");
+	asm volatile("sfence");
+
+	/* clwb m8 */
+
+	asm volatile("clwb (%rax)");
+	asm volatile("clwb (%r8)");
+	asm volatile("clwb (0x12345678)");
+	asm volatile("clwb 0x12345678(%rax,%rcx,8)");
+	asm volatile("clwb 0x12345678(%r8,%rcx,8)");
+	/* Also check instructions in the same group encoding as clwb */
+	asm volatile("xsaveopt (%rax)");
+	asm volatile("xsaveopt (%r8)");
+	asm volatile("mfence");
+
 #else  /* #ifdef __x86_64__ */
 
 	/* bndmk m32, bnd */
@@ -780,8 +804,30 @@ int main(void)
 	asm volatile("sha256msg2 0x12345678(%eax,%ecx,1), %xmm0");
 	asm volatile("sha256msg2 0x12345678(%eax,%ecx,8), %xmm0");
 
+	/* clflushopt m8 */
+
+	asm volatile("clflushopt (%eax)");
+	asm volatile("clflushopt (0x12345678)");
+	asm volatile("clflushopt 0x12345678(%eax,%ecx,8)");
+	/* Also check instructions in the same group encoding as clflushopt */
+	asm volatile("clflush (%eax)");
+	asm volatile("sfence");
+
+	/* clwb m8 */
+
+	asm volatile("clwb (%eax)");
+	asm volatile("clwb (0x12345678)");
+	asm volatile("clwb 0x12345678(%eax,%ecx,8)");
+	/* Also check instructions in the same group encoding as clwb */
+	asm volatile("xsaveopt (%eax)");
+	asm volatile("mfence");
+
 #endif /* #ifndef __x86_64__ */
 
+	/* pcommit */
+
+	asm volatile("pcommit");
+
 	/* Following line is a marker for the awk script - do not change */
 	asm volatile("rdtsc"); /* Stop here */
 
diff --git a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
index 25dad388b371..f4f0451a301e 100644
--- a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
+++ b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
@@ -943,8 +943,8 @@ GrpTable: Grp15
 3: vstmxcsr Md (v1) | WRGSBASE Ry (F3),(11B)
 4: XSAVE
 5: XRSTOR | lfence (11B)
-6: XSAVEOPT | mfence (11B)
-7: clflush | sfence (11B)
+6: XSAVEOPT | clwb (66) | mfence (11B)
+7: clflush | clflushopt (66) | sfence (11B) | pcommit (66),(11B)
 EndTable
 
 GrpTable: Grp16
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-08-31 13:58 [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Adrian Hunter
                   ` (3 preceding siblings ...)
  2015-08-31 13:58 ` [PATCH 4/4] x86/insn: perf tools: Add new memory instructions Adrian Hunter
@ 2015-08-31 14:43 ` Arnaldo Carvalho de Melo
  2015-09-01  8:54 ` Ingo Molnar
  5 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 14:43 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Em Mon, Aug 31, 2015 at 04:58:38PM +0300, Adrian Hunter escreveu:
> Hi
> 
> perf tools has a copy of the x86 instruction decoder for decoding
> Intel PT.  This patch set adds a perf tools test to use it to
> test new instructions.  Subsequent patches add a few new x86
> instructions, or very slightly modify them in the case of MPX.
> Those changes affect both perf tools and x86/insn.
> 
> I suggest Arnaldo takes all these patches as they mainly affect
> perf tools, at least in terms of lines-of-code.

I'll process them; anyone thinking this shouldn't be the case, holler.

- Arnaldo
 
> 
> Adrian Hunter (4):
>       perf tools: Add a test for decoding of new x86 instructions
>       x86/insn: perf tools: Pedantically tweak opcode map for MPX instructions
>       x86/insn: perf tools: Add new SHA instructions
>       x86/insn: perf tools: Add new memory instructions
> 
>  arch/x86/lib/x86-opcode-map.txt                    |  19 +-
>  tools/perf/tests/Build                             |   3 +
>  tools/perf/tests/builtin-test.c                    |   8 +
>  tools/perf/tests/gen-insn-x86-dat.awk              |  75 ++
>  tools/perf/tests/gen-insn-x86-dat.sh               |  43 ++
>  tools/perf/tests/insn-x86-dat-32.c                 | 640 ++++++++++++++++
>  tools/perf/tests/insn-x86-dat-64.c                 | 738 ++++++++++++++++++
>  tools/perf/tests/insn-x86-dat-src.c                | 835 +++++++++++++++++++++
>  tools/perf/tests/insn-x86.c                        | 180 +++++
>  tools/perf/tests/tests.h                           |   1 +
>  .../perf/util/intel-pt-decoder/x86-opcode-map.txt  |  19 +-
>  11 files changed, 2553 insertions(+), 8 deletions(-)
>  create mode 100644 tools/perf/tests/gen-insn-x86-dat.awk
>  create mode 100755 tools/perf/tests/gen-insn-x86-dat.sh
>  create mode 100644 tools/perf/tests/insn-x86-dat-32.c
>  create mode 100644 tools/perf/tests/insn-x86-dat-64.c
>  create mode 100644 tools/perf/tests/insn-x86-dat-src.c
>  create mode 100644 tools/perf/tests/insn-x86.c
> 
> 
> Regards
> Adrian

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/4] x86/insn: perf tools: Pedantically tweak opcode map for MPX instructions
  2015-08-31 13:58 ` [PATCH 2/4] x86/insn: perf tools: Pedantically tweak opcode map for MPX instructions Adrian Hunter
@ 2015-08-31 14:48   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 14:48 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Em Mon, Aug 31, 2015 at 04:58:40PM +0300, Adrian Hunter escreveu:
> The MPX instructions are presently not described in the SDM
> opcode maps, and there are not encoding characters for bnd
> registers, address method or operand type.  So the kernel
> opcode map is using 'Gv' for bnd registers and 'Ev' for
> everything else.  That is fine because the instruction
> decoder does not use that information anyway, except as
> an indication that there is a ModR/M byte.
> 
> Nevertheless, in some cases the 'Gv' and 'Ev' are the wrong
> way around, BNDLDX and BNDSTX have 2 operands not 3, and it
> wouldn't hurt to identify the mandatory prefixes.
> 
> This has no effect on the decoding of valid instructions,
> but the addition of the mandatory prefixes will cause some
> invalid instructions to error out that wouldn't have
> previously.
> 
> Note that perf tools has a copy of the instruction decoder
> and provides a test for new instructions which includes MPX
> instructions e.g.
> 
> 	$ perf test list 2>&1 | grep "x86 ins"
> 	39: Test x86 instruction decoder - new instructions
> 	$ perf test 39
> 	39: Test x86 instruction decoder - new instructions          : Ok
> 
> Or to see the details:
> 
> 	$ perf test -v 39

this is a handy shortcut, but I think it is sometimes worth showing that
one can also select the test by its description, e.g. if I forgot the
number or it was reordered, doing it like:

[root@zoo linux]# perf test syscall
 2: detect openat syscall event                              : Ok
 3: detect openat syscall event on all cpus                  : Ok
14: Generate and check syscalls:sys_enter_openat event fields: Ok
[root@zoo linux]#

Also works, which for your case would be 'perf test x86'.  Oops, that
would also catch:

[root@zoo linux]# perf test x86
 6: x86 rdpmc test                                           : Ok
[root@zoo linux]#

:-)

- Arnaldo
 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  arch/x86/lib/x86-opcode-map.txt                     | 8 ++++++--
>  tools/perf/util/intel-pt-decoder/x86-opcode-map.txt | 8 ++++++--
>  2 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
> index 816488c0b97e..a02a195d219c 100644
> --- a/arch/x86/lib/x86-opcode-map.txt
> +++ b/arch/x86/lib/x86-opcode-map.txt
> @@ -353,8 +353,12 @@ AVXcode: 1
>  17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1)
>  18: Grp16 (1A)
>  19:
> -1a: BNDCL Ev,Gv | BNDCU Ev,Gv | BNDMOV Gv,Ev | BNDLDX Gv,Ev,Gv
> -1b: BNDCN Ev,Gv | BNDMOV Ev,Gv | BNDMK Gv,Ev | BNDSTX Ev,GV,Gv
> +# Intel SDM opcode map does not list MPX instructions. For now using Gv for
> +# bnd registers and Ev for everything else is OK because the instruction
> +# decoder does not use the information except as an indication that there is
> +# a ModR/M byte.
> +1a: BNDCL Gv,Ev (F3) | BNDCU Gv,Ev (F2) | BNDMOV Gv,Ev (66) | BNDLDX Gv,Ev
> +1b: BNDCN Gv,Ev (F2) | BNDMOV Ev,Gv (66) | BNDMK Gv,Ev (F3) | BNDSTX Ev,Gv
>  1c:
>  1d:
>  1e:
> diff --git a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> index 816488c0b97e..a02a195d219c 100644
> --- a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> +++ b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> @@ -353,8 +353,12 @@ AVXcode: 1
>  17: vmovhps Mq,Vq (v1) | vmovhpd Mq,Vq (66),(v1)
>  18: Grp16 (1A)
>  19:
> -1a: BNDCL Ev,Gv | BNDCU Ev,Gv | BNDMOV Gv,Ev | BNDLDX Gv,Ev,Gv
> -1b: BNDCN Ev,Gv | BNDMOV Ev,Gv | BNDMK Gv,Ev | BNDSTX Ev,GV,Gv
> +# Intel SDM opcode map does not list MPX instructions. For now using Gv for
> +# bnd registers and Ev for everything else is OK because the instruction
> +# decoder does not use the information except as an indication that there is
> +# a ModR/M byte.
> +1a: BNDCL Gv,Ev (F3) | BNDCU Gv,Ev (F2) | BNDMOV Gv,Ev (66) | BNDLDX Gv,Ev
> +1b: BNDCN Gv,Ev (F2) | BNDMOV Ev,Gv (66) | BNDMK Gv,Ev (F3) | BNDSTX Ev,Gv
>  1c:
>  1d:
>  1e:
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions
  2015-08-31 13:58 ` [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions Adrian Hunter
@ 2015-08-31 14:50   ` Arnaldo Carvalho de Melo
  2015-08-31 18:58     ` Adrian Hunter
  2015-09-01  0:08   ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-31 14:50 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Em Mon, Aug 31, 2015 at 04:58:41PM +0300, Adrian Hunter escreveu:
> Intel SHA Extensions are explained in the Intel Architecture
> Instruction Set Extensions Programming Reference (Oct 2014).
> There are 7 new instructions.  Add them to the opcode map
> and the perf tools new instructions test, e.g.
> 
>     $ tools/perf/perf test list 2>&1 | grep "x86 ins"

I.e., one could short-circuit the 'perf test list' step and use:

	perf test "x86 ins" straight away:

[root@zoo linux]# perf test "syscall event"
 2: detect openat syscall event                              : Ok
 3: detect openat syscall event on all cpus                  : Ok
[root@zoo linux]#

>     39: Test x86 instruction decoder - new instructions
>     $ tools/perf/perf test 39
>     39: Test x86 instruction decoder - new instructions          : Ok
> 
> Or to see the details:
> 
>     $ tools/perf/perf test -v 39 2>&1 | grep sha
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  arch/x86/lib/x86-opcode-map.txt                    |   7 +
>  tools/perf/tests/insn-x86-dat-32.c                 | 294 ++++++++++++++++
>  tools/perf/tests/insn-x86-dat-64.c                 | 364 ++++++++++++++++++++
>  tools/perf/tests/insn-x86-dat-src.c                | 373 +++++++++++++++++++++
>  .../perf/util/intel-pt-decoder/x86-opcode-map.txt  |   7 +
>  5 files changed, 1045 insertions(+)
> 
> diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
> index a02a195d219c..25dad388b371 100644
> --- a/arch/x86/lib/x86-opcode-map.txt
> +++ b/arch/x86/lib/x86-opcode-map.txt
> @@ -736,6 +736,12 @@ bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
>  be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
>  bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
>  # 0x0f 0x38 0xc0-0xff
> +c8: sha1nexte Vdq,Wdq
> +c9: sha1msg1 Vdq,Wdq
> +ca: sha1msg2 Vdq,Wdq
> +cb: sha256rnds2 Vdq,Wdq
> +cc: sha256msg1 Vdq,Wdq
> +cd: sha256msg2 Vdq,Wdq
>  db: VAESIMC Vdq,Wdq (66),(v1)
>  dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
>  dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
> @@ -794,6 +800,7 @@ AVXcode: 3
>  61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
>  62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
>  63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
> +cc: sha1rnds4 Vdq,Wdq,Ib
>  df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
>  f0: RORX Gy,Ey,Ib (F2),(v)
>  EndTable
> diff --git a/tools/perf/tests/insn-x86-dat-32.c b/tools/perf/tests/insn-x86-dat-32.c
> index 6a38a34a5a49..83f5078e74e1 100644
> --- a/tools/perf/tests/insn-x86-dat-32.c
> +++ b/tools/perf/tests/insn-x86-dat-32.c
> @@ -322,3 +322,297 @@
>  "f2 ff 21             \tbnd jmp *(%ecx)",},
>  {{0xf2, 0x0f, 0x85, 0xfc, 0xff, 0xff, 0xff, }, 7, 0xfffffffc, "jcc", "conditional",
>  "f2 0f 85 fc ff ff ff \tbnd jne 3de <main+0x3de>",},
> +{{0x0f, 0x3a, 0xcc, 0xc1, 0x00, }, 5, 0, "", "",
> +"0f 3a cc c1 00       \tsha1rnds4 $0x0,%xmm1,%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0xd7, 0x91, }, 5, 0, "", "",
> +"0f 3a cc d7 91       \tsha1rnds4 $0x91,%xmm7,%xmm2",},
> +{{0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 5, 0, "", "",
> +"0f 3a cc 00 91       \tsha1rnds4 $0x91,(%eax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678,%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x18, 0x91, }, 5, 0, "", "",
> +"0f 3a cc 18 91       \tsha1rnds4 $0x91,(%eax),%xmm3",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x01, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 01 91    \tsha1rnds4 $0x91,(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 04 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x08, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 08 91    \tsha1rnds4 $0x91,(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0xc8, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 c8 91    \tsha1rnds4 $0x91,(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x40, 0x12, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 40 12 91    \tsha1rnds4 $0x91,0x12(%eax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x45, 0x12, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 45 12 91    \tsha1rnds4 $0x91,0x12(%ebp),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x01, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 01 12 91 \tsha1rnds4 $0x91,0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x05, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 05 12 91 \tsha1rnds4 $0x91,0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x08, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 08 12 91 \tsha1rnds4 $0x91,0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0xc8, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 c8 12 91 \tsha1rnds4 $0x91,0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 80 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 85 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 01 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 08 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0xc1, }, 4, 0, "", "",
> +"0f 38 c8 c1          \tsha1nexte %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0xd7, }, 4, 0, "", "",
> +"0f 38 c8 d7          \tsha1nexte %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
> +"0f 38 c8 00          \tsha1nexte (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 05 78 56 34 12 \tsha1nexte 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x18, }, 4, 0, "", "",
> +"0f 38 c8 18          \tsha1nexte (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 c8 04 01       \tsha1nexte (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 04 05 78 56 34 12 \tsha1nexte 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 c8 04 08       \tsha1nexte (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 c8 04 c8       \tsha1nexte (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 c8 40 12       \tsha1nexte 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 c8 45 12       \tsha1nexte 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 01 12    \tsha1nexte 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 05 12    \tsha1nexte 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 08 12    \tsha1nexte 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 c8 12    \tsha1nexte 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 80 78 56 34 12 \tsha1nexte 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 85 78 56 34 12 \tsha1nexte 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 01 78 56 34 12 \tsha1nexte 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 05 78 56 34 12 \tsha1nexte 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 08 78 56 34 12 \tsha1nexte 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 c8 78 56 34 12 \tsha1nexte 0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0xc1, }, 4, 0, "", "",
> +"0f 38 c9 c1          \tsha1msg1 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0xd7, }, 4, 0, "", "",
> +"0f 38 c9 d7          \tsha1msg1 %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xc9, 0x00, }, 4, 0, "", "",
> +"0f 38 c9 00          \tsha1msg1 (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 05 78 56 34 12 \tsha1msg1 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x18, }, 4, 0, "", "",
> +"0f 38 c9 18          \tsha1msg1 (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 c9 04 01       \tsha1msg1 (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 04 05 78 56 34 12 \tsha1msg1 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 c9 04 08       \tsha1msg1 (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 c9 04 c8       \tsha1msg1 (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 c9 40 12       \tsha1msg1 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 c9 45 12       \tsha1msg1 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 01 12    \tsha1msg1 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 05 12    \tsha1msg1 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 08 12    \tsha1msg1 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 c8 12    \tsha1msg1 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 80 78 56 34 12 \tsha1msg1 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 85 78 56 34 12 \tsha1msg1 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 01 78 56 34 12 \tsha1msg1 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 05 78 56 34 12 \tsha1msg1 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 08 78 56 34 12 \tsha1msg1 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 c8 78 56 34 12 \tsha1msg1 0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0xc1, }, 4, 0, "", "",
> +"0f 38 ca c1          \tsha1msg2 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xca, 0xd7, }, 4, 0, "", "",
> +"0f 38 ca d7          \tsha1msg2 %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xca, 0x00, }, 4, 0, "", "",
> +"0f 38 ca 00          \tsha1msg2 (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 05 78 56 34 12 \tsha1msg2 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x18, }, 4, 0, "", "",
> +"0f 38 ca 18          \tsha1msg2 (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 ca 04 01       \tsha1msg2 (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 04 05 78 56 34 12 \tsha1msg2 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 ca 04 08       \tsha1msg2 (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 ca 04 c8       \tsha1msg2 (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 ca 40 12       \tsha1msg2 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 ca 45 12       \tsha1msg2 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 01 12    \tsha1msg2 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 05 12    \tsha1msg2 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 08 12    \tsha1msg2 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 c8 12    \tsha1msg2 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 80 78 56 34 12 \tsha1msg2 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 85 78 56 34 12 \tsha1msg2 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 01 78 56 34 12 \tsha1msg2 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 05 78 56 34 12 \tsha1msg2 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 08 78 56 34 12 \tsha1msg2 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 c8 78 56 34 12 \tsha1msg2 0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcb, 0xcc, }, 4, 0, "", "",
> +"0f 38 cb cc          \tsha256rnds2 %xmm0,%xmm4,%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0xd7, }, 4, 0, "", "",
> +"0f 38 cb d7          \tsha256rnds2 %xmm0,%xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xcb, 0x08, }, 4, 0, "", "",
> +"0f 38 cb 08          \tsha256rnds2 %xmm0,(%eax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 0d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678,%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x18, }, 4, 0, "", "",
> +"0f 38 cb 18          \tsha256rnds2 %xmm0,(%eax),%xmm3",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x01, }, 5, 0, "", "",
> +"0f 38 cb 0c 01       \tsha256rnds2 %xmm0,(%ecx,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 0c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x08, }, 5, 0, "", "",
> +"0f 38 cb 0c 08       \tsha256rnds2 %xmm0,(%eax,%ecx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0xc8, }, 5, 0, "", "",
> +"0f 38 cb 0c c8       \tsha256rnds2 %xmm0,(%eax,%ecx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x48, 0x12, }, 5, 0, "", "",
> +"0f 38 cb 48 12       \tsha256rnds2 %xmm0,0x12(%eax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4d, 0x12, }, 5, 0, "", "",
> +"0f 38 cb 4d 12       \tsha256rnds2 %xmm0,0x12(%ebp),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 01 12    \tsha256rnds2 %xmm0,0x12(%ecx,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 05 12    \tsha256rnds2 %xmm0,0x12(%ebp,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 08 12    \tsha256rnds2 %xmm0,0x12(%eax,%ecx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c c8 12    \tsha256rnds2 %xmm0,0x12(%eax,%ecx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x88, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 88 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 8d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ebp),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 01 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ecx,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ebp,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 08 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax,%ecx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax,%ecx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcc, 0xc1, }, 4, 0, "", "",
> +"0f 38 cc c1          \tsha256msg1 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0xd7, }, 4, 0, "", "",
> +"0f 38 cc d7          \tsha256msg1 %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xcc, 0x00, }, 4, 0, "", "",
> +"0f 38 cc 00          \tsha256msg1 (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 05 78 56 34 12 \tsha256msg1 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x18, }, 4, 0, "", "",
> +"0f 38 cc 18          \tsha256msg1 (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 cc 04 01       \tsha256msg1 (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 04 05 78 56 34 12 \tsha256msg1 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 cc 04 08       \tsha256msg1 (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 cc 04 c8       \tsha256msg1 (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 cc 40 12       \tsha256msg1 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 cc 45 12       \tsha256msg1 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 01 12    \tsha256msg1 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 05 12    \tsha256msg1 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 08 12    \tsha256msg1 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 c8 12    \tsha256msg1 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 80 78 56 34 12 \tsha256msg1 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 85 78 56 34 12 \tsha256msg1 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 01 78 56 34 12 \tsha256msg1 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 05 78 56 34 12 \tsha256msg1 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 08 78 56 34 12 \tsha256msg1 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 c8 78 56 34 12 \tsha256msg1 0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0xc1, }, 4, 0, "", "",
> +"0f 38 cd c1          \tsha256msg2 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0xd7, }, 4, 0, "", "",
> +"0f 38 cd d7          \tsha256msg2 %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xcd, 0x00, }, 4, 0, "", "",
> +"0f 38 cd 00          \tsha256msg2 (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 05 78 56 34 12 \tsha256msg2 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x18, }, 4, 0, "", "",
> +"0f 38 cd 18          \tsha256msg2 (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 cd 04 01       \tsha256msg2 (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 04 05 78 56 34 12 \tsha256msg2 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 cd 04 08       \tsha256msg2 (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 cd 04 c8       \tsha256msg2 (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 cd 40 12       \tsha256msg2 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 cd 45 12       \tsha256msg2 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 01 12    \tsha256msg2 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 05 12    \tsha256msg2 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 08 12    \tsha256msg2 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 c8 12    \tsha256msg2 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 80 78 56 34 12 \tsha256msg2 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 85 78 56 34 12 \tsha256msg2 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 01 78 56 34 12 \tsha256msg2 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 05 78 56 34 12 \tsha256msg2 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,8),%xmm0",},
> diff --git a/tools/perf/tests/insn-x86-dat-64.c b/tools/perf/tests/insn-x86-dat-64.c
> index 01122421a776..13f008588590 100644
> --- a/tools/perf/tests/insn-x86-dat-64.c
> +++ b/tools/perf/tests/insn-x86-dat-64.c
> @@ -338,3 +338,367 @@
>  "67 f2 ff 21          \tbnd jmpq *(%ecx)",},
>  {{0xf2, 0x0f, 0x85, 0x00, 0x00, 0x00, 0x00, }, 7, 0, "jcc", "conditional",
>  "f2 0f 85 00 00 00 00 \tbnd jne 413 <main+0x413>",},
> +{{0x0f, 0x3a, 0xcc, 0xc1, 0x00, }, 5, 0, "", "",
> +"0f 3a cc c1 00       \tsha1rnds4 $0x0,%xmm1,%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0xd7, 0x91, }, 5, 0, "", "",
> +"0f 3a cc d7 91       \tsha1rnds4 $0x91,%xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x3a, 0xcc, 0xc0, 0x91, }, 6, 0, "", "",
> +"41 0f 3a cc c0 91    \tsha1rnds4 $0x91,%xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x3a, 0xcc, 0xc7, 0x91, }, 6, 0, "", "",
> +"44 0f 3a cc c7 91    \tsha1rnds4 $0x91,%xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x3a, 0xcc, 0xc7, 0x91, }, 6, 0, "", "",
> +"45 0f 3a cc c7 91    \tsha1rnds4 $0x91,%xmm15,%xmm8",},
> +{{0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 5, 0, "", "",
> +"0f 3a cc 00 91       \tsha1rnds4 $0x91,(%rax),%xmm0",},
> +{{0x41, 0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 6, 0, "", "",
> +"41 0f 3a cc 00 91    \tsha1rnds4 $0x91,(%r8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 04 25 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678,%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x18, 0x91, }, 5, 0, "", "",
> +"0f 3a cc 18 91       \tsha1rnds4 $0x91,(%rax),%xmm3",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x01, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 01 91    \tsha1rnds4 $0x91,(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 04 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x08, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 08 91    \tsha1rnds4 $0x91,(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0xc8, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 c8 91    \tsha1rnds4 $0x91,(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x40, 0x12, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 40 12 91    \tsha1rnds4 $0x91,0x12(%rax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x45, 0x12, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 45 12 91    \tsha1rnds4 $0x91,0x12(%rbp),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x01, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 01 12 91 \tsha1rnds4 $0x91,0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x05, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 05 12 91 \tsha1rnds4 $0x91,0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x08, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 08 12 91 \tsha1rnds4 $0x91,0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0xc8, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 c8 12 91 \tsha1rnds4 $0x91,0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 80 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 85 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 01 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 08 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x3a, 0xcc, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 11, 0, "", "",
> +"44 0f 3a cc bc c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xc8, 0xc1, }, 4, 0, "", "",
> +"0f 38 c8 c1          \tsha1nexte %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0xd7, }, 4, 0, "", "",
> +"0f 38 c8 d7          \tsha1nexte %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xc8, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 c8 c0       \tsha1nexte %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xc8, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 c8 c7       \tsha1nexte %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xc8, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 c8 c7       \tsha1nexte %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
> +"0f 38 c8 00          \tsha1nexte (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xc8, 0x00, }, 5, 0, "", "",
> +"41 0f 38 c8 00       \tsha1nexte (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 04 25 78 56 34 12 \tsha1nexte 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x18, }, 4, 0, "", "",
> +"0f 38 c8 18          \tsha1nexte (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 c8 04 01       \tsha1nexte (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 04 05 78 56 34 12 \tsha1nexte 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 c8 04 08       \tsha1nexte (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 c8 04 c8       \tsha1nexte (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 c8 40 12       \tsha1nexte 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 c8 45 12       \tsha1nexte 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 01 12    \tsha1nexte 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 05 12    \tsha1nexte 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 08 12    \tsha1nexte 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 c8 12    \tsha1nexte 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 80 78 56 34 12 \tsha1nexte 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 85 78 56 34 12 \tsha1nexte 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 01 78 56 34 12 \tsha1nexte 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 05 78 56 34 12 \tsha1nexte 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 08 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 c8 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xc8, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 c8 bc c8 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xc9, 0xc1, }, 4, 0, "", "",
> +"0f 38 c9 c1          \tsha1msg1 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0xd7, }, 4, 0, "", "",
> +"0f 38 c9 d7          \tsha1msg1 %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xc9, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 c9 c0       \tsha1msg1 %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xc9, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 c9 c7       \tsha1msg1 %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xc9, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 c9 c7       \tsha1msg1 %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xc9, 0x00, }, 4, 0, "", "",
> +"0f 38 c9 00          \tsha1msg1 (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xc9, 0x00, }, 5, 0, "", "",
> +"41 0f 38 c9 00       \tsha1msg1 (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 04 25 78 56 34 12 \tsha1msg1 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x18, }, 4, 0, "", "",
> +"0f 38 c9 18          \tsha1msg1 (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 c9 04 01       \tsha1msg1 (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 04 05 78 56 34 12 \tsha1msg1 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 c9 04 08       \tsha1msg1 (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 c9 04 c8       \tsha1msg1 (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 c9 40 12       \tsha1msg1 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 c9 45 12       \tsha1msg1 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 01 12    \tsha1msg1 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 05 12    \tsha1msg1 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 08 12    \tsha1msg1 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 c8 12    \tsha1msg1 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 80 78 56 34 12 \tsha1msg1 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 85 78 56 34 12 \tsha1msg1 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 01 78 56 34 12 \tsha1msg1 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 05 78 56 34 12 \tsha1msg1 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 08 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 c8 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xc9, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 c9 bc c8 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xca, 0xc1, }, 4, 0, "", "",
> +"0f 38 ca c1          \tsha1msg2 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xca, 0xd7, }, 4, 0, "", "",
> +"0f 38 ca d7          \tsha1msg2 %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xca, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 ca c0       \tsha1msg2 %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xca, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 ca c7       \tsha1msg2 %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xca, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 ca c7       \tsha1msg2 %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xca, 0x00, }, 4, 0, "", "",
> +"0f 38 ca 00          \tsha1msg2 (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xca, 0x00, }, 5, 0, "", "",
> +"41 0f 38 ca 00       \tsha1msg2 (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 04 25 78 56 34 12 \tsha1msg2 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x18, }, 4, 0, "", "",
> +"0f 38 ca 18          \tsha1msg2 (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 ca 04 01       \tsha1msg2 (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 04 05 78 56 34 12 \tsha1msg2 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 ca 04 08       \tsha1msg2 (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 ca 04 c8       \tsha1msg2 (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 ca 40 12       \tsha1msg2 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 ca 45 12       \tsha1msg2 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 01 12    \tsha1msg2 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 05 12    \tsha1msg2 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 08 12    \tsha1msg2 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 c8 12    \tsha1msg2 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 80 78 56 34 12 \tsha1msg2 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 85 78 56 34 12 \tsha1msg2 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 01 78 56 34 12 \tsha1msg2 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 05 78 56 34 12 \tsha1msg2 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 08 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 c8 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xca, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 ca bc c8 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xcb, 0xcc, }, 4, 0, "", "",
> +"0f 38 cb cc          \tsha256rnds2 %xmm0,%xmm4,%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0xd7, }, 4, 0, "", "",
> +"0f 38 cb d7          \tsha256rnds2 %xmm0,%xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xcb, 0xc8, }, 5, 0, "", "",
> +"41 0f 38 cb c8       \tsha256rnds2 %xmm0,%xmm8,%xmm1",},
> +{{0x44, 0x0f, 0x38, 0xcb, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 cb c7       \tsha256rnds2 %xmm0,%xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xcb, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 cb c7       \tsha256rnds2 %xmm0,%xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xcb, 0x08, }, 4, 0, "", "",
> +"0f 38 cb 08          \tsha256rnds2 %xmm0,(%rax),%xmm1",},
> +{{0x41, 0x0f, 0x38, 0xcb, 0x08, }, 5, 0, "", "",
> +"41 0f 38 cb 08       \tsha256rnds2 %xmm0,(%r8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 0c 25 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678,%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x18, }, 4, 0, "", "",
> +"0f 38 cb 18          \tsha256rnds2 %xmm0,(%rax),%xmm3",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x01, }, 5, 0, "", "",
> +"0f 38 cb 0c 01       \tsha256rnds2 %xmm0,(%rcx,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 0c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x08, }, 5, 0, "", "",
> +"0f 38 cb 0c 08       \tsha256rnds2 %xmm0,(%rax,%rcx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0xc8, }, 5, 0, "", "",
> +"0f 38 cb 0c c8       \tsha256rnds2 %xmm0,(%rax,%rcx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x48, 0x12, }, 5, 0, "", "",
> +"0f 38 cb 48 12       \tsha256rnds2 %xmm0,0x12(%rax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4d, 0x12, }, 5, 0, "", "",
> +"0f 38 cb 4d 12       \tsha256rnds2 %xmm0,0x12(%rbp),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 01 12    \tsha256rnds2 %xmm0,0x12(%rcx,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 05 12    \tsha256rnds2 %xmm0,0x12(%rbp,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 08 12    \tsha256rnds2 %xmm0,0x12(%rax,%rcx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c c8 12    \tsha256rnds2 %xmm0,0x12(%rax,%rcx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x88, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 88 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 8d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rbp),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 01 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rcx,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rbp,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 08 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm1",},
> +{{0x44, 0x0f, 0x38, 0xcb, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 cb bc c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xcc, 0xc1, }, 4, 0, "", "",
> +"0f 38 cc c1          \tsha256msg1 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0xd7, }, 4, 0, "", "",
> +"0f 38 cc d7          \tsha256msg1 %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xcc, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 cc c0       \tsha256msg1 %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xcc, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 cc c7       \tsha256msg1 %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xcc, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 cc c7       \tsha256msg1 %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xcc, 0x00, }, 4, 0, "", "",
> +"0f 38 cc 00          \tsha256msg1 (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xcc, 0x00, }, 5, 0, "", "",
> +"41 0f 38 cc 00       \tsha256msg1 (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 04 25 78 56 34 12 \tsha256msg1 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x18, }, 4, 0, "", "",
> +"0f 38 cc 18          \tsha256msg1 (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 cc 04 01       \tsha256msg1 (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 04 05 78 56 34 12 \tsha256msg1 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 cc 04 08       \tsha256msg1 (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 cc 04 c8       \tsha256msg1 (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 cc 40 12       \tsha256msg1 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 cc 45 12       \tsha256msg1 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 01 12    \tsha256msg1 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 05 12    \tsha256msg1 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 08 12    \tsha256msg1 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 c8 12    \tsha256msg1 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 80 78 56 34 12 \tsha256msg1 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 85 78 56 34 12 \tsha256msg1 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 01 78 56 34 12 \tsha256msg1 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 05 78 56 34 12 \tsha256msg1 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 08 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 c8 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xcc, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 cc bc c8 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xcd, 0xc1, }, 4, 0, "", "",
> +"0f 38 cd c1          \tsha256msg2 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0xd7, }, 4, 0, "", "",
> +"0f 38 cd d7          \tsha256msg2 %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xcd, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 cd c0       \tsha256msg2 %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xcd, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 cd c7       \tsha256msg2 %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xcd, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 cd c7       \tsha256msg2 %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xcd, 0x00, }, 4, 0, "", "",
> +"0f 38 cd 00          \tsha256msg2 (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xcd, 0x00, }, 5, 0, "", "",
> +"41 0f 38 cd 00       \tsha256msg2 (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 04 25 78 56 34 12 \tsha256msg2 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x18, }, 4, 0, "", "",
> +"0f 38 cd 18          \tsha256msg2 (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 cd 04 01       \tsha256msg2 (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 04 05 78 56 34 12 \tsha256msg2 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 cd 04 08       \tsha256msg2 (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 cd 04 c8       \tsha256msg2 (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 cd 40 12       \tsha256msg2 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 cd 45 12       \tsha256msg2 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 01 12    \tsha256msg2 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 05 12    \tsha256msg2 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 08 12    \tsha256msg2 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 c8 12    \tsha256msg2 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 80 78 56 34 12 \tsha256msg2 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 85 78 56 34 12 \tsha256msg2 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 01 78 56 34 12 \tsha256msg2 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 05 78 56 34 12 \tsha256msg2 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xcd, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 cd bc c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm15",},
> diff --git a/tools/perf/tests/insn-x86-dat-src.c b/tools/perf/tests/insn-x86-dat-src.c
> index b506830f33a8..7d06c9b22070 100644
> --- a/tools/perf/tests/insn-x86-dat-src.c
> +++ b/tools/perf/tests/insn-x86-dat-src.c
> @@ -217,6 +217,210 @@ int main(void)
>  	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
>  	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0 */
>  
> +	/* sha1rnds4 imm8, xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1rnds4 $0x0, %xmm1, %xmm0");
> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm2");
> +	asm volatile("sha1rnds4 $0x91, %xmm8, %xmm0");
> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm8");
> +	asm volatile("sha1rnds4 $0x91, %xmm15, %xmm8");
> +	asm volatile("sha1rnds4 $0x91, (%rax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%r8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (0x12345678), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%rax), %xmm3");
> +	asm volatile("sha1rnds4 $0x91, (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rbp), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha1nexte xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1nexte %xmm1, %xmm0");
> +	asm volatile("sha1nexte %xmm7, %xmm2");
> +	asm volatile("sha1nexte %xmm8, %xmm0");
> +	asm volatile("sha1nexte %xmm7, %xmm8");
> +	asm volatile("sha1nexte %xmm15, %xmm8");
> +	asm volatile("sha1nexte (%rax), %xmm0");
> +	asm volatile("sha1nexte (%r8), %xmm0");
> +	asm volatile("sha1nexte (0x12345678), %xmm0");
> +	asm volatile("sha1nexte (%rax), %xmm3");
> +	asm volatile("sha1nexte (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha1nexte (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1nexte (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rax), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rbp), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rax), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha1msg1 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1msg1 %xmm1, %xmm0");
> +	asm volatile("sha1msg1 %xmm7, %xmm2");
> +	asm volatile("sha1msg1 %xmm8, %xmm0");
> +	asm volatile("sha1msg1 %xmm7, %xmm8");
> +	asm volatile("sha1msg1 %xmm15, %xmm8");
> +	asm volatile("sha1msg1 (%rax), %xmm0");
> +	asm volatile("sha1msg1 (%r8), %xmm0");
> +	asm volatile("sha1msg1 (0x12345678), %xmm0");
> +	asm volatile("sha1msg1 (%rax), %xmm3");
> +	asm volatile("sha1msg1 (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg1 (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rax), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rbp), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rax), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha1msg2 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1msg2 %xmm1, %xmm0");
> +	asm volatile("sha1msg2 %xmm7, %xmm2");
> +	asm volatile("sha1msg2 %xmm8, %xmm0");
> +	asm volatile("sha1msg2 %xmm7, %xmm8");
> +	asm volatile("sha1msg2 %xmm15, %xmm8");
> +	asm volatile("sha1msg2 (%rax), %xmm0");
> +	asm volatile("sha1msg2 (%r8), %xmm0");
> +	asm volatile("sha1msg2 (0x12345678), %xmm0");
> +	asm volatile("sha1msg2 (%rax), %xmm3");
> +	asm volatile("sha1msg2 (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg2 (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rax), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rbp), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rax), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha256rnds2 <XMM0>, xmm2/m128, xmm1 */
> +	/* Note sha256rnds2 has an implicit operand 'xmm0' */
> +
> +	asm volatile("sha256rnds2 %xmm4, %xmm1");
> +	asm volatile("sha256rnds2 %xmm7, %xmm2");
> +	asm volatile("sha256rnds2 %xmm8, %xmm1");
> +	asm volatile("sha256rnds2 %xmm7, %xmm8");
> +	asm volatile("sha256rnds2 %xmm15, %xmm8");
> +	asm volatile("sha256rnds2 (%rax), %xmm1");
> +	asm volatile("sha256rnds2 (%r8), %xmm1");
> +	asm volatile("sha256rnds2 (0x12345678), %xmm1");
> +	asm volatile("sha256rnds2 (%rax), %xmm3");
> +	asm volatile("sha256rnds2 (%rcx,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 (%rax,%rcx,1), %xmm1");
> +	asm volatile("sha256rnds2 (%rax,%rcx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rax), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rbp), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rcx,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rbp,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rax,%rcx,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rax,%rcx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rax), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rbp), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rcx,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rbp,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha256msg1 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha256msg1 %xmm1, %xmm0");
> +	asm volatile("sha256msg1 %xmm7, %xmm2");
> +	asm volatile("sha256msg1 %xmm8, %xmm0");
> +	asm volatile("sha256msg1 %xmm7, %xmm8");
> +	asm volatile("sha256msg1 %xmm15, %xmm8");
> +	asm volatile("sha256msg1 (%rax), %xmm0");
> +	asm volatile("sha256msg1 (%r8), %xmm0");
> +	asm volatile("sha256msg1 (0x12345678), %xmm0");
> +	asm volatile("sha256msg1 (%rax), %xmm3");
> +	asm volatile("sha256msg1 (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg1 (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rax), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rbp), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rax), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha256msg2 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha256msg2 %xmm1, %xmm0");
> +	asm volatile("sha256msg2 %xmm7, %xmm2");
> +	asm volatile("sha256msg2 %xmm8, %xmm0");
> +	asm volatile("sha256msg2 %xmm7, %xmm8");
> +	asm volatile("sha256msg2 %xmm15, %xmm8");
> +	asm volatile("sha256msg2 (%rax), %xmm0");
> +	asm volatile("sha256msg2 (%r8), %xmm0");
> +	asm volatile("sha256msg2 (0x12345678), %xmm0");
> +	asm volatile("sha256msg2 (%rax), %xmm3");
> +	asm volatile("sha256msg2 (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg2 (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rax), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rbp), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rax), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm15");
> +
>  #else  /* #ifdef __x86_64__ */
>  
>  	/* bndmk m32, bnd */
> @@ -407,6 +611,175 @@ int main(void)
>  	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
>  	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0xfffffffc */
>  
> +	/* sha1rnds4 imm8, xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1rnds4 $0x0, %xmm1, %xmm0");
> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm2");
> +	asm volatile("sha1rnds4 $0x91, (%eax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (0x12345678), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%eax), %xmm3");
> +	asm volatile("sha1rnds4 $0x91, (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%ebp), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha1nexte xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1nexte %xmm1, %xmm0");
> +	asm volatile("sha1nexte %xmm7, %xmm2");
> +	asm volatile("sha1nexte (%eax), %xmm0");
> +	asm volatile("sha1nexte (0x12345678), %xmm0");
> +	asm volatile("sha1nexte (%eax), %xmm3");
> +	asm volatile("sha1nexte (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha1nexte (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1nexte (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12(%eax), %xmm0");
> +	asm volatile("sha1nexte 0x12(%ebp), %xmm0");
> +	asm volatile("sha1nexte 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%eax), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha1msg1 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1msg1 %xmm1, %xmm0");
> +	asm volatile("sha1msg1 %xmm7, %xmm2");
> +	asm volatile("sha1msg1 (%eax), %xmm0");
> +	asm volatile("sha1msg1 (0x12345678), %xmm0");
> +	asm volatile("sha1msg1 (%eax), %xmm3");
> +	asm volatile("sha1msg1 (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg1 (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12(%eax), %xmm0");
> +	asm volatile("sha1msg1 0x12(%ebp), %xmm0");
> +	asm volatile("sha1msg1 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%eax), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha1msg2 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1msg2 %xmm1, %xmm0");
> +	asm volatile("sha1msg2 %xmm7, %xmm2");
> +	asm volatile("sha1msg2 (%eax), %xmm0");
> +	asm volatile("sha1msg2 (0x12345678), %xmm0");
> +	asm volatile("sha1msg2 (%eax), %xmm3");
> +	asm volatile("sha1msg2 (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg2 (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12(%eax), %xmm0");
> +	asm volatile("sha1msg2 0x12(%ebp), %xmm0");
> +	asm volatile("sha1msg2 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%eax), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha256rnds2 <XMM0>, xmm2/m128, xmm1 */
> +	/* Note sha256rnds2 has an implicit operand 'xmm0' */
> +
> +	asm volatile("sha256rnds2 %xmm4, %xmm1");
> +	asm volatile("sha256rnds2 %xmm7, %xmm2");
> +	asm volatile("sha256rnds2 (%eax), %xmm1");
> +	asm volatile("sha256rnds2 (0x12345678), %xmm1");
> +	asm volatile("sha256rnds2 (%eax), %xmm3");
> +	asm volatile("sha256rnds2 (%ecx,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 (%eax,%ecx,1), %xmm1");
> +	asm volatile("sha256rnds2 (%eax,%ecx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%eax), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%ebp), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%ecx,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%ebp,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%eax,%ecx,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%eax,%ecx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%eax), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%ebp), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%ecx,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%ebp,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%eax,%ecx,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%eax,%ecx,8), %xmm1");
> +
> +	/* sha256msg1 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha256msg1 %xmm1, %xmm0");
> +	asm volatile("sha256msg1 %xmm7, %xmm2");
> +	asm volatile("sha256msg1 (%eax), %xmm0");
> +	asm volatile("sha256msg1 (0x12345678), %xmm0");
> +	asm volatile("sha256msg1 (%eax), %xmm3");
> +	asm volatile("sha256msg1 (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg1 (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12(%eax), %xmm0");
> +	asm volatile("sha256msg1 0x12(%ebp), %xmm0");
> +	asm volatile("sha256msg1 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%eax), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha256msg2 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha256msg2 %xmm1, %xmm0");
> +	asm volatile("sha256msg2 %xmm7, %xmm2");
> +	asm volatile("sha256msg2 (%eax), %xmm0");
> +	asm volatile("sha256msg2 (0x12345678), %xmm0");
> +	asm volatile("sha256msg2 (%eax), %xmm3");
> +	asm volatile("sha256msg2 (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg2 (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12(%eax), %xmm0");
> +	asm volatile("sha256msg2 0x12(%ebp), %xmm0");
> +	asm volatile("sha256msg2 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%eax), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%eax,%ecx,8), %xmm0");
> +
>  #endif /* #ifndef __x86_64__ */
>  
>  	/* Following line is a marker for the awk script - do not change */
> diff --git a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> index a02a195d219c..25dad388b371 100644
> --- a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> +++ b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> @@ -736,6 +736,12 @@ bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
>  be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
>  bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
>  # 0x0f 0x38 0xc0-0xff
> +c8: sha1nexte Vdq,Wdq
> +c9: sha1msg1 Vdq,Wdq
> +ca: sha1msg2 Vdq,Wdq
> +cb: sha256rnds2 Vdq,Wdq
> +cc: sha256msg1 Vdq,Wdq
> +cd: sha256msg2 Vdq,Wdq
>  db: VAESIMC Vdq,Wdq (66),(v1)
>  dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
>  dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
> @@ -794,6 +800,7 @@ AVXcode: 3
>  61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
>  62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
>  63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
> +cc: sha1rnds4 Vdq,Wdq,Ib
>  df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
>  f0: RORX Gy,Ey,Ib (F2),(v)
>  EndTable
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions
  2015-08-31 14:50   ` Arnaldo Carvalho de Melo
@ 2015-08-31 18:58     ` Adrian Hunter
  0 siblings, 0 replies; 27+ messages in thread
From: Adrian Hunter @ 2015-08-31 18:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Jiri Olsa, Andy Lutomirski, Masami Hiramatsu,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

On 31/08/2015 5:50 p.m., Arnaldo Carvalho de Melo wrote:
> Em Mon, Aug 31, 2015 at 04:58:41PM +0300, Adrian Hunter escreveu:
>> Intel SHA Extensions are explained in the Intel Architecture
>> Instruction Set Extensions Programming Reference (Oct 2014).
>> There are 7 new instructions.  Add them to the opcode map
>> and the perf tools new instructions test, e.g.:
>>
>>      $ tools/perf/perf test list 2>&1 | grep "x86 ins"
>
> I.e., one could short-circuit the 'perf test list' step and use:
>
> 	perf test "x86 ins" straight away:
>
> [root@zoo linux]# perf test "syscall event"
>   2: detect openat syscall event                              : Ok
>   3: detect openat syscall event on all cpus                  : Ok
> [root@zoo linux]#

Cool, I'll update the commit messages in V2.
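
Something like this, as a rough sketch (assuming the test keeps the same
number and description in V2, so the substring match and -v behave the
same way as in your example and in the commit message above):

	$ tools/perf/perf test "x86 ins"
	39: Test x86 instruction decoder - new instructions          : Ok

	$ tools/perf/perf test -v "x86 ins" 2>&1 | grep sha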

>
>>      39: Test x86 instruction decoder - new instructions
>>      $ tools/perf/perf test 39
>>      39: Test x86 instruction decoder - new instructions          : Ok
>>
>> Or to see the details:
>>
>>      $ tools/perf/perf test -v 39 2>&1 | grep sha
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>   arch/x86/lib/x86-opcode-map.txt                    |   7 +
>>   tools/perf/tests/insn-x86-dat-32.c                 | 294 ++++++++++++++++
>>   tools/perf/tests/insn-x86-dat-64.c                 | 364 ++++++++++++++++++++
>>   tools/perf/tests/insn-x86-dat-src.c                | 373 +++++++++++++++++++++
>>   .../perf/util/intel-pt-decoder/x86-opcode-map.txt  |   7 +
>>   5 files changed, 1045 insertions(+)
>>
>> diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
>> index a02a195d219c..25dad388b371 100644
>> --- a/arch/x86/lib/x86-opcode-map.txt
>> +++ b/arch/x86/lib/x86-opcode-map.txt
>> @@ -736,6 +736,12 @@ bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
>>   be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
>>   bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
>>   # 0x0f 0x38 0xc0-0xff
>> +c8: sha1nexte Vdq,Wdq
>> +c9: sha1msg1 Vdq,Wdq
>> +ca: sha1msg2 Vdq,Wdq
>> +cb: sha256rnds2 Vdq,Wdq
>> +cc: sha256msg1 Vdq,Wdq
>> +cd: sha256msg2 Vdq,Wdq
>>   db: VAESIMC Vdq,Wdq (66),(v1)
>>   dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
>>   dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
>> @@ -794,6 +800,7 @@ AVXcode: 3
>>   61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
>>   62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
>>   63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
>> +cc: sha1rnds4 Vdq,Wdq,Ib
>>   df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
>>   f0: RORX Gy,Ey,Ib (F2),(v)
>>   EndTable
>> diff --git a/tools/perf/tests/insn-x86-dat-32.c b/tools/perf/tests/insn-x86-dat-32.c
>> index 6a38a34a5a49..83f5078e74e1 100644
>> --- a/tools/perf/tests/insn-x86-dat-32.c
>> +++ b/tools/perf/tests/insn-x86-dat-32.c
>> @@ -322,3 +322,297 @@
>>   "f2 ff 21             \tbnd jmp *(%ecx)",},
>>   {{0xf2, 0x0f, 0x85, 0xfc, 0xff, 0xff, 0xff, }, 7, 0xfffffffc, "jcc", "conditional",
>>   "f2 0f 85 fc ff ff ff \tbnd jne 3de <main+0x3de>",},
>> +{{0x0f, 0x3a, 0xcc, 0xc1, 0x00, }, 5, 0, "", "",
>> +"0f 3a cc c1 00       \tsha1rnds4 $0x0,%xmm1,%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0xd7, 0x91, }, 5, 0, "", "",
>> +"0f 3a cc d7 91       \tsha1rnds4 $0x91,%xmm7,%xmm2",},
>> +{{0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 5, 0, "", "",
>> +"0f 3a cc 00 91       \tsha1rnds4 $0x91,(%eax),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
>> +"0f 3a cc 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678,%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x18, 0x91, }, 5, 0, "", "",
>> +"0f 3a cc 18 91       \tsha1rnds4 $0x91,(%eax),%xmm3",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0x01, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 04 01 91    \tsha1rnds4 $0x91,(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 04 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(,%eax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0x08, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 04 08 91    \tsha1rnds4 $0x91,(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0xc8, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 04 c8 91    \tsha1rnds4 $0x91,(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x40, 0x12, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 40 12 91    \tsha1rnds4 $0x91,0x12(%eax),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x45, 0x12, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 45 12 91    \tsha1rnds4 $0x91,0x12(%ebp),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x44, 0x01, 0x12, 0x91, }, 7, 0, "", "",
>> +"0f 3a cc 44 01 12 91 \tsha1rnds4 $0x91,0x12(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x44, 0x05, 0x12, 0x91, }, 7, 0, "", "",
>> +"0f 3a cc 44 05 12 91 \tsha1rnds4 $0x91,0x12(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x44, 0x08, 0x12, 0x91, }, 7, 0, "", "",
>> +"0f 3a cc 44 08 12 91 \tsha1rnds4 $0x91,0x12(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x44, 0xc8, 0x12, 0x91, }, 7, 0, "", "",
>> +"0f 3a cc 44 c8 12 91 \tsha1rnds4 $0x91,0x12(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
>> +"0f 3a cc 80 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
>> +"0f 3a cc 85 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ebp),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 84 01 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 84 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 84 08 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 84 c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0xc1, }, 4, 0, "", "",
>> +"0f 38 c8 c1          \tsha1nexte %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0xd7, }, 4, 0, "", "",
>> +"0f 38 c8 d7          \tsha1nexte %xmm7,%xmm2",},
>> +{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
>> +"0f 38 c8 00          \tsha1nexte (%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c8 05 78 56 34 12 \tsha1nexte 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x18, }, 4, 0, "", "",
>> +"0f 38 c8 18          \tsha1nexte (%eax),%xmm3",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 c8 04 01       \tsha1nexte (%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 04 05 78 56 34 12 \tsha1nexte 0x12345678(,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 c8 04 08       \tsha1nexte (%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 c8 04 c8       \tsha1nexte (%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 c8 40 12       \tsha1nexte 0x12(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 c8 45 12       \tsha1nexte 0x12(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 c8 44 01 12    \tsha1nexte 0x12(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 c8 44 05 12    \tsha1nexte 0x12(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 c8 44 08 12    \tsha1nexte 0x12(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 c8 44 c8 12    \tsha1nexte 0x12(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c8 80 78 56 34 12 \tsha1nexte 0x12345678(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c8 85 78 56 34 12 \tsha1nexte 0x12345678(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 84 01 78 56 34 12 \tsha1nexte 0x12345678(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 84 05 78 56 34 12 \tsha1nexte 0x12345678(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 84 08 78 56 34 12 \tsha1nexte 0x12345678(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 84 c8 78 56 34 12 \tsha1nexte 0x12345678(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0xc1, }, 4, 0, "", "",
>> +"0f 38 c9 c1          \tsha1msg1 %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0xd7, }, 4, 0, "", "",
>> +"0f 38 c9 d7          \tsha1msg1 %xmm7,%xmm2",},
>> +{{0x0f, 0x38, 0xc9, 0x00, }, 4, 0, "", "",
>> +"0f 38 c9 00          \tsha1msg1 (%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c9 05 78 56 34 12 \tsha1msg1 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x18, }, 4, 0, "", "",
>> +"0f 38 c9 18          \tsha1msg1 (%eax),%xmm3",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 c9 04 01       \tsha1msg1 (%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 04 05 78 56 34 12 \tsha1msg1 0x12345678(,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 c9 04 08       \tsha1msg1 (%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 c9 04 c8       \tsha1msg1 (%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 c9 40 12       \tsha1msg1 0x12(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 c9 45 12       \tsha1msg1 0x12(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 c9 44 01 12    \tsha1msg1 0x12(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 c9 44 05 12    \tsha1msg1 0x12(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 c9 44 08 12    \tsha1msg1 0x12(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 c9 44 c8 12    \tsha1msg1 0x12(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c9 80 78 56 34 12 \tsha1msg1 0x12345678(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c9 85 78 56 34 12 \tsha1msg1 0x12345678(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 84 01 78 56 34 12 \tsha1msg1 0x12345678(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 84 05 78 56 34 12 \tsha1msg1 0x12345678(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 84 08 78 56 34 12 \tsha1msg1 0x12345678(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 84 c8 78 56 34 12 \tsha1msg1 0x12345678(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0xc1, }, 4, 0, "", "",
>> +"0f 38 ca c1          \tsha1msg2 %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0xd7, }, 4, 0, "", "",
>> +"0f 38 ca d7          \tsha1msg2 %xmm7,%xmm2",},
>> +{{0x0f, 0x38, 0xca, 0x00, }, 4, 0, "", "",
>> +"0f 38 ca 00          \tsha1msg2 (%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 ca 05 78 56 34 12 \tsha1msg2 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x18, }, 4, 0, "", "",
>> +"0f 38 ca 18          \tsha1msg2 (%eax),%xmm3",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 ca 04 01       \tsha1msg2 (%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 04 05 78 56 34 12 \tsha1msg2 0x12345678(,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 ca 04 08       \tsha1msg2 (%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 ca 04 c8       \tsha1msg2 (%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 ca 40 12       \tsha1msg2 0x12(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 ca 45 12       \tsha1msg2 0x12(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 ca 44 01 12    \tsha1msg2 0x12(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 ca 44 05 12    \tsha1msg2 0x12(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 ca 44 08 12    \tsha1msg2 0x12(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 ca 44 c8 12    \tsha1msg2 0x12(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 ca 80 78 56 34 12 \tsha1msg2 0x12345678(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 ca 85 78 56 34 12 \tsha1msg2 0x12345678(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 84 01 78 56 34 12 \tsha1msg2 0x12345678(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 84 05 78 56 34 12 \tsha1msg2 0x12345678(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 84 08 78 56 34 12 \tsha1msg2 0x12345678(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 84 c8 78 56 34 12 \tsha1msg2 0x12345678(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcb, 0xcc, }, 4, 0, "", "",
>> +"0f 38 cb cc          \tsha256rnds2 %xmm0,%xmm4,%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0xd7, }, 4, 0, "", "",
>> +"0f 38 cb d7          \tsha256rnds2 %xmm0,%xmm7,%xmm2",},
>> +{{0x0f, 0x38, 0xcb, 0x08, }, 4, 0, "", "",
>> +"0f 38 cb 08          \tsha256rnds2 %xmm0,(%eax),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x0d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cb 0d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678,%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x18, }, 4, 0, "", "",
>> +"0f 38 cb 18          \tsha256rnds2 %xmm0,(%eax),%xmm3",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0x01, }, 5, 0, "", "",
>> +"0f 38 cb 0c 01       \tsha256rnds2 %xmm0,(%ecx,%eax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 0c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(,%eax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0x08, }, 5, 0, "", "",
>> +"0f 38 cb 0c 08       \tsha256rnds2 %xmm0,(%eax,%ecx,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0xc8, }, 5, 0, "", "",
>> +"0f 38 cb 0c c8       \tsha256rnds2 %xmm0,(%eax,%ecx,8),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x48, 0x12, }, 5, 0, "", "",
>> +"0f 38 cb 48 12       \tsha256rnds2 %xmm0,0x12(%eax),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4d, 0x12, }, 5, 0, "", "",
>> +"0f 38 cb 4d 12       \tsha256rnds2 %xmm0,0x12(%ebp),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4c, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 cb 4c 01 12    \tsha256rnds2 %xmm0,0x12(%ecx,%eax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4c, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 cb 4c 05 12    \tsha256rnds2 %xmm0,0x12(%ebp,%eax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4c, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 cb 4c 08 12    \tsha256rnds2 %xmm0,0x12(%eax,%ecx,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4c, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 cb 4c c8 12    \tsha256rnds2 %xmm0,0x12(%eax,%ecx,8),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x88, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cb 88 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cb 8d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ebp),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8c, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 8c 01 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ecx,%eax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 8c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ebp,%eax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8c, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 8c 08 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax,%ecx,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 8c c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax,%ecx,8),%xmm1",},
>> +{{0x0f, 0x38, 0xcc, 0xc1, }, 4, 0, "", "",
>> +"0f 38 cc c1          \tsha256msg1 %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0xd7, }, 4, 0, "", "",
>> +"0f 38 cc d7          \tsha256msg1 %xmm7,%xmm2",},
>> +{{0x0f, 0x38, 0xcc, 0x00, }, 4, 0, "", "",
>> +"0f 38 cc 00          \tsha256msg1 (%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cc 05 78 56 34 12 \tsha256msg1 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x18, }, 4, 0, "", "",
>> +"0f 38 cc 18          \tsha256msg1 (%eax),%xmm3",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 cc 04 01       \tsha256msg1 (%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 04 05 78 56 34 12 \tsha256msg1 0x12345678(,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 cc 04 08       \tsha256msg1 (%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 cc 04 c8       \tsha256msg1 (%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 cc 40 12       \tsha256msg1 0x12(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 cc 45 12       \tsha256msg1 0x12(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 cc 44 01 12    \tsha256msg1 0x12(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 cc 44 05 12    \tsha256msg1 0x12(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 cc 44 08 12    \tsha256msg1 0x12(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 cc 44 c8 12    \tsha256msg1 0x12(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cc 80 78 56 34 12 \tsha256msg1 0x12345678(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cc 85 78 56 34 12 \tsha256msg1 0x12345678(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 84 01 78 56 34 12 \tsha256msg1 0x12345678(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 84 05 78 56 34 12 \tsha256msg1 0x12345678(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 84 08 78 56 34 12 \tsha256msg1 0x12345678(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 84 c8 78 56 34 12 \tsha256msg1 0x12345678(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0xc1, }, 4, 0, "", "",
>> +"0f 38 cd c1          \tsha256msg2 %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0xd7, }, 4, 0, "", "",
>> +"0f 38 cd d7          \tsha256msg2 %xmm7,%xmm2",},
>> +{{0x0f, 0x38, 0xcd, 0x00, }, 4, 0, "", "",
>> +"0f 38 cd 00          \tsha256msg2 (%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cd 05 78 56 34 12 \tsha256msg2 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x18, }, 4, 0, "", "",
>> +"0f 38 cd 18          \tsha256msg2 (%eax),%xmm3",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 cd 04 01       \tsha256msg2 (%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 04 05 78 56 34 12 \tsha256msg2 0x12345678(,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 cd 04 08       \tsha256msg2 (%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 cd 04 c8       \tsha256msg2 (%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 cd 40 12       \tsha256msg2 0x12(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 cd 45 12       \tsha256msg2 0x12(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 cd 44 01 12    \tsha256msg2 0x12(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 cd 44 05 12    \tsha256msg2 0x12(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 cd 44 08 12    \tsha256msg2 0x12(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 cd 44 c8 12    \tsha256msg2 0x12(%eax,%ecx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cd 80 78 56 34 12 \tsha256msg2 0x12345678(%eax),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cd 85 78 56 34 12 \tsha256msg2 0x12345678(%ebp),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 84 01 78 56 34 12 \tsha256msg2 0x12345678(%ecx,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 84 05 78 56 34 12 \tsha256msg2 0x12345678(%ebp,%eax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,8),%xmm0",},
>> diff --git a/tools/perf/tests/insn-x86-dat-64.c b/tools/perf/tests/insn-x86-dat-64.c
>> index 01122421a776..13f008588590 100644
>> --- a/tools/perf/tests/insn-x86-dat-64.c
>> +++ b/tools/perf/tests/insn-x86-dat-64.c
>> @@ -338,3 +338,367 @@
>>   "67 f2 ff 21          \tbnd jmpq *(%ecx)",},
>>   {{0xf2, 0x0f, 0x85, 0x00, 0x00, 0x00, 0x00, }, 7, 0, "jcc", "conditional",
>>   "f2 0f 85 00 00 00 00 \tbnd jne 413 <main+0x413>",},
>> +{{0x0f, 0x3a, 0xcc, 0xc1, 0x00, }, 5, 0, "", "",
>> +"0f 3a cc c1 00       \tsha1rnds4 $0x0,%xmm1,%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0xd7, 0x91, }, 5, 0, "", "",
>> +"0f 3a cc d7 91       \tsha1rnds4 $0x91,%xmm7,%xmm2",},
>> +{{0x41, 0x0f, 0x3a, 0xcc, 0xc0, 0x91, }, 6, 0, "", "",
>> +"41 0f 3a cc c0 91    \tsha1rnds4 $0x91,%xmm8,%xmm0",},
>> +{{0x44, 0x0f, 0x3a, 0xcc, 0xc7, 0x91, }, 6, 0, "", "",
>> +"44 0f 3a cc c7 91    \tsha1rnds4 $0x91,%xmm7,%xmm8",},
>> +{{0x45, 0x0f, 0x3a, 0xcc, 0xc7, 0x91, }, 6, 0, "", "",
>> +"45 0f 3a cc c7 91    \tsha1rnds4 $0x91,%xmm15,%xmm8",},
>> +{{0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 5, 0, "", "",
>> +"0f 3a cc 00 91       \tsha1rnds4 $0x91,(%rax),%xmm0",},
>> +{{0x41, 0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 6, 0, "", "",
>> +"41 0f 3a cc 00 91    \tsha1rnds4 $0x91,(%r8),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 04 25 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678,%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x18, 0x91, }, 5, 0, "", "",
>> +"0f 3a cc 18 91       \tsha1rnds4 $0x91,(%rax),%xmm3",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0x01, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 04 01 91    \tsha1rnds4 $0x91,(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 04 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(,%rax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0x08, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 04 08 91    \tsha1rnds4 $0x91,(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x04, 0xc8, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 04 c8 91    \tsha1rnds4 $0x91,(%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x40, 0x12, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 40 12 91    \tsha1rnds4 $0x91,0x12(%rax),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x45, 0x12, 0x91, }, 6, 0, "", "",
>> +"0f 3a cc 45 12 91    \tsha1rnds4 $0x91,0x12(%rbp),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x44, 0x01, 0x12, 0x91, }, 7, 0, "", "",
>> +"0f 3a cc 44 01 12 91 \tsha1rnds4 $0x91,0x12(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x44, 0x05, 0x12, 0x91, }, 7, 0, "", "",
>> +"0f 3a cc 44 05 12 91 \tsha1rnds4 $0x91,0x12(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x44, 0x08, 0x12, 0x91, }, 7, 0, "", "",
>> +"0f 3a cc 44 08 12 91 \tsha1rnds4 $0x91,0x12(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x44, 0xc8, 0x12, 0x91, }, 7, 0, "", "",
>> +"0f 3a cc 44 c8 12 91 \tsha1rnds4 $0x91,0x12(%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
>> +"0f 3a cc 80 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
>> +"0f 3a cc 85 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rbp),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 84 01 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 84 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 84 08 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x3a, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
>> +"0f 3a cc 84 c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm0",},
>> +{{0x44, 0x0f, 0x3a, 0xcc, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 11, 0, "", "",
>> +"44 0f 3a cc bc c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm15",},
>> +{{0x0f, 0x38, 0xc8, 0xc1, }, 4, 0, "", "",
>> +"0f 38 c8 c1          \tsha1nexte %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0xd7, }, 4, 0, "", "",
>> +"0f 38 c8 d7          \tsha1nexte %xmm7,%xmm2",},
>> +{{0x41, 0x0f, 0x38, 0xc8, 0xc0, }, 5, 0, "", "",
>> +"41 0f 38 c8 c0       \tsha1nexte %xmm8,%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xc8, 0xc7, }, 5, 0, "", "",
>> +"44 0f 38 c8 c7       \tsha1nexte %xmm7,%xmm8",},
>> +{{0x45, 0x0f, 0x38, 0xc8, 0xc7, }, 5, 0, "", "",
>> +"45 0f 38 c8 c7       \tsha1nexte %xmm15,%xmm8",},
>> +{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
>> +"0f 38 c8 00          \tsha1nexte (%rax),%xmm0",},
>> +{{0x41, 0x0f, 0x38, 0xc8, 0x00, }, 5, 0, "", "",
>> +"41 0f 38 c8 00       \tsha1nexte (%r8),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 04 25 78 56 34 12 \tsha1nexte 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x18, }, 4, 0, "", "",
>> +"0f 38 c8 18          \tsha1nexte (%rax),%xmm3",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 c8 04 01       \tsha1nexte (%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 04 05 78 56 34 12 \tsha1nexte 0x12345678(,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 c8 04 08       \tsha1nexte (%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 c8 04 c8       \tsha1nexte (%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 c8 40 12       \tsha1nexte 0x12(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 c8 45 12       \tsha1nexte 0x12(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 c8 44 01 12    \tsha1nexte 0x12(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 c8 44 05 12    \tsha1nexte 0x12(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 c8 44 08 12    \tsha1nexte 0x12(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 c8 44 c8 12    \tsha1nexte 0x12(%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c8 80 78 56 34 12 \tsha1nexte 0x12345678(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c8 85 78 56 34 12 \tsha1nexte 0x12345678(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 84 01 78 56 34 12 \tsha1nexte 0x12345678(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 84 05 78 56 34 12 \tsha1nexte 0x12345678(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 84 08 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc8, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c8 84 c8 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,8),%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xc8, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
>> +"44 0f 38 c8 bc c8 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,8),%xmm15",},
>> +{{0x0f, 0x38, 0xc9, 0xc1, }, 4, 0, "", "",
>> +"0f 38 c9 c1          \tsha1msg1 %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0xd7, }, 4, 0, "", "",
>> +"0f 38 c9 d7          \tsha1msg1 %xmm7,%xmm2",},
>> +{{0x41, 0x0f, 0x38, 0xc9, 0xc0, }, 5, 0, "", "",
>> +"41 0f 38 c9 c0       \tsha1msg1 %xmm8,%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xc9, 0xc7, }, 5, 0, "", "",
>> +"44 0f 38 c9 c7       \tsha1msg1 %xmm7,%xmm8",},
>> +{{0x45, 0x0f, 0x38, 0xc9, 0xc7, }, 5, 0, "", "",
>> +"45 0f 38 c9 c7       \tsha1msg1 %xmm15,%xmm8",},
>> +{{0x0f, 0x38, 0xc9, 0x00, }, 4, 0, "", "",
>> +"0f 38 c9 00          \tsha1msg1 (%rax),%xmm0",},
>> +{{0x41, 0x0f, 0x38, 0xc9, 0x00, }, 5, 0, "", "",
>> +"41 0f 38 c9 00       \tsha1msg1 (%r8),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 04 25 78 56 34 12 \tsha1msg1 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x18, }, 4, 0, "", "",
>> +"0f 38 c9 18          \tsha1msg1 (%rax),%xmm3",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 c9 04 01       \tsha1msg1 (%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 04 05 78 56 34 12 \tsha1msg1 0x12345678(,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 c9 04 08       \tsha1msg1 (%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 c9 04 c8       \tsha1msg1 (%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 c9 40 12       \tsha1msg1 0x12(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 c9 45 12       \tsha1msg1 0x12(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 c9 44 01 12    \tsha1msg1 0x12(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 c9 44 05 12    \tsha1msg1 0x12(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 c9 44 08 12    \tsha1msg1 0x12(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 c9 44 c8 12    \tsha1msg1 0x12(%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c9 80 78 56 34 12 \tsha1msg1 0x12345678(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 c9 85 78 56 34 12 \tsha1msg1 0x12345678(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 84 01 78 56 34 12 \tsha1msg1 0x12345678(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 84 05 78 56 34 12 \tsha1msg1 0x12345678(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 84 08 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xc9, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 c9 84 c8 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,8),%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xc9, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
>> +"44 0f 38 c9 bc c8 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,8),%xmm15",},
>> +{{0x0f, 0x38, 0xca, 0xc1, }, 4, 0, "", "",
>> +"0f 38 ca c1          \tsha1msg2 %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0xd7, }, 4, 0, "", "",
>> +"0f 38 ca d7          \tsha1msg2 %xmm7,%xmm2",},
>> +{{0x41, 0x0f, 0x38, 0xca, 0xc0, }, 5, 0, "", "",
>> +"41 0f 38 ca c0       \tsha1msg2 %xmm8,%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xca, 0xc7, }, 5, 0, "", "",
>> +"44 0f 38 ca c7       \tsha1msg2 %xmm7,%xmm8",},
>> +{{0x45, 0x0f, 0x38, 0xca, 0xc7, }, 5, 0, "", "",
>> +"45 0f 38 ca c7       \tsha1msg2 %xmm15,%xmm8",},
>> +{{0x0f, 0x38, 0xca, 0x00, }, 4, 0, "", "",
>> +"0f 38 ca 00          \tsha1msg2 (%rax),%xmm0",},
>> +{{0x41, 0x0f, 0x38, 0xca, 0x00, }, 5, 0, "", "",
>> +"41 0f 38 ca 00       \tsha1msg2 (%r8),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 04 25 78 56 34 12 \tsha1msg2 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x18, }, 4, 0, "", "",
>> +"0f 38 ca 18          \tsha1msg2 (%rax),%xmm3",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 ca 04 01       \tsha1msg2 (%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 04 05 78 56 34 12 \tsha1msg2 0x12345678(,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 ca 04 08       \tsha1msg2 (%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 ca 04 c8       \tsha1msg2 (%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 ca 40 12       \tsha1msg2 0x12(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 ca 45 12       \tsha1msg2 0x12(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 ca 44 01 12    \tsha1msg2 0x12(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 ca 44 05 12    \tsha1msg2 0x12(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 ca 44 08 12    \tsha1msg2 0x12(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 ca 44 c8 12    \tsha1msg2 0x12(%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 ca 80 78 56 34 12 \tsha1msg2 0x12345678(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 ca 85 78 56 34 12 \tsha1msg2 0x12345678(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 84 01 78 56 34 12 \tsha1msg2 0x12345678(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 84 05 78 56 34 12 \tsha1msg2 0x12345678(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 84 08 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xca, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 ca 84 c8 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,8),%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xca, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
>> +"44 0f 38 ca bc c8 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,8),%xmm15",},
>> +{{0x0f, 0x38, 0xcb, 0xcc, }, 4, 0, "", "",
>> +"0f 38 cb cc          \tsha256rnds2 %xmm0,%xmm4,%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0xd7, }, 4, 0, "", "",
>> +"0f 38 cb d7          \tsha256rnds2 %xmm0,%xmm7,%xmm2",},
>> +{{0x41, 0x0f, 0x38, 0xcb, 0xc8, }, 5, 0, "", "",
>> +"41 0f 38 cb c8       \tsha256rnds2 %xmm0,%xmm8,%xmm1",},
>> +{{0x44, 0x0f, 0x38, 0xcb, 0xc7, }, 5, 0, "", "",
>> +"44 0f 38 cb c7       \tsha256rnds2 %xmm0,%xmm7,%xmm8",},
>> +{{0x45, 0x0f, 0x38, 0xcb, 0xc7, }, 5, 0, "", "",
>> +"45 0f 38 cb c7       \tsha256rnds2 %xmm0,%xmm15,%xmm8",},
>> +{{0x0f, 0x38, 0xcb, 0x08, }, 4, 0, "", "",
>> +"0f 38 cb 08          \tsha256rnds2 %xmm0,(%rax),%xmm1",},
>> +{{0x41, 0x0f, 0x38, 0xcb, 0x08, }, 5, 0, "", "",
>> +"41 0f 38 cb 08       \tsha256rnds2 %xmm0,(%r8),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 0c 25 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678,%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x18, }, 4, 0, "", "",
>> +"0f 38 cb 18          \tsha256rnds2 %xmm0,(%rax),%xmm3",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0x01, }, 5, 0, "", "",
>> +"0f 38 cb 0c 01       \tsha256rnds2 %xmm0,(%rcx,%rax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 0c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(,%rax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0x08, }, 5, 0, "", "",
>> +"0f 38 cb 0c 08       \tsha256rnds2 %xmm0,(%rax,%rcx,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x0c, 0xc8, }, 5, 0, "", "",
>> +"0f 38 cb 0c c8       \tsha256rnds2 %xmm0,(%rax,%rcx,8),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x48, 0x12, }, 5, 0, "", "",
>> +"0f 38 cb 48 12       \tsha256rnds2 %xmm0,0x12(%rax),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4d, 0x12, }, 5, 0, "", "",
>> +"0f 38 cb 4d 12       \tsha256rnds2 %xmm0,0x12(%rbp),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4c, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 cb 4c 01 12    \tsha256rnds2 %xmm0,0x12(%rcx,%rax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4c, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 cb 4c 05 12    \tsha256rnds2 %xmm0,0x12(%rbp,%rax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4c, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 cb 4c 08 12    \tsha256rnds2 %xmm0,0x12(%rax,%rcx,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x4c, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 cb 4c c8 12    \tsha256rnds2 %xmm0,0x12(%rax,%rcx,8),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x88, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cb 88 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cb 8d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rbp),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8c, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 8c 01 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rcx,%rax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 8c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rbp,%rax,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8c, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 8c 08 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,1),%xmm1",},
>> +{{0x0f, 0x38, 0xcb, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cb 8c c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm1",},
>> +{{0x44, 0x0f, 0x38, 0xcb, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
>> +"44 0f 38 cb bc c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm15",},
>> +{{0x0f, 0x38, 0xcc, 0xc1, }, 4, 0, "", "",
>> +"0f 38 cc c1          \tsha256msg1 %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0xd7, }, 4, 0, "", "",
>> +"0f 38 cc d7          \tsha256msg1 %xmm7,%xmm2",},
>> +{{0x41, 0x0f, 0x38, 0xcc, 0xc0, }, 5, 0, "", "",
>> +"41 0f 38 cc c0       \tsha256msg1 %xmm8,%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xcc, 0xc7, }, 5, 0, "", "",
>> +"44 0f 38 cc c7       \tsha256msg1 %xmm7,%xmm8",},
>> +{{0x45, 0x0f, 0x38, 0xcc, 0xc7, }, 5, 0, "", "",
>> +"45 0f 38 cc c7       \tsha256msg1 %xmm15,%xmm8",},
>> +{{0x0f, 0x38, 0xcc, 0x00, }, 4, 0, "", "",
>> +"0f 38 cc 00          \tsha256msg1 (%rax),%xmm0",},
>> +{{0x41, 0x0f, 0x38, 0xcc, 0x00, }, 5, 0, "", "",
>> +"41 0f 38 cc 00       \tsha256msg1 (%r8),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 04 25 78 56 34 12 \tsha256msg1 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x18, }, 4, 0, "", "",
>> +"0f 38 cc 18          \tsha256msg1 (%rax),%xmm3",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 cc 04 01       \tsha256msg1 (%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 04 05 78 56 34 12 \tsha256msg1 0x12345678(,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 cc 04 08       \tsha256msg1 (%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 cc 04 c8       \tsha256msg1 (%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 cc 40 12       \tsha256msg1 0x12(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 cc 45 12       \tsha256msg1 0x12(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 cc 44 01 12    \tsha256msg1 0x12(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 cc 44 05 12    \tsha256msg1 0x12(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 cc 44 08 12    \tsha256msg1 0x12(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 cc 44 c8 12    \tsha256msg1 0x12(%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cc 80 78 56 34 12 \tsha256msg1 0x12345678(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cc 85 78 56 34 12 \tsha256msg1 0x12345678(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 84 01 78 56 34 12 \tsha256msg1 0x12345678(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 84 05 78 56 34 12 \tsha256msg1 0x12345678(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 84 08 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cc 84 c8 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,8),%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xcc, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
>> +"44 0f 38 cc bc c8 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,8),%xmm15",},
>> +{{0x0f, 0x38, 0xcd, 0xc1, }, 4, 0, "", "",
>> +"0f 38 cd c1          \tsha256msg2 %xmm1,%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0xd7, }, 4, 0, "", "",
>> +"0f 38 cd d7          \tsha256msg2 %xmm7,%xmm2",},
>> +{{0x41, 0x0f, 0x38, 0xcd, 0xc0, }, 5, 0, "", "",
>> +"41 0f 38 cd c0       \tsha256msg2 %xmm8,%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xcd, 0xc7, }, 5, 0, "", "",
>> +"44 0f 38 cd c7       \tsha256msg2 %xmm7,%xmm8",},
>> +{{0x45, 0x0f, 0x38, 0xcd, 0xc7, }, 5, 0, "", "",
>> +"45 0f 38 cd c7       \tsha256msg2 %xmm15,%xmm8",},
>> +{{0x0f, 0x38, 0xcd, 0x00, }, 4, 0, "", "",
>> +"0f 38 cd 00          \tsha256msg2 (%rax),%xmm0",},
>> +{{0x41, 0x0f, 0x38, 0xcd, 0x00, }, 5, 0, "", "",
>> +"41 0f 38 cd 00       \tsha256msg2 (%r8),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 04 25 78 56 34 12 \tsha256msg2 0x12345678,%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x18, }, 4, 0, "", "",
>> +"0f 38 cd 18          \tsha256msg2 (%rax),%xmm3",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0x01, }, 5, 0, "", "",
>> +"0f 38 cd 04 01       \tsha256msg2 (%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 04 05 78 56 34 12 \tsha256msg2 0x12345678(,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0x08, }, 5, 0, "", "",
>> +"0f 38 cd 04 08       \tsha256msg2 (%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x04, 0xc8, }, 5, 0, "", "",
>> +"0f 38 cd 04 c8       \tsha256msg2 (%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x40, 0x12, }, 5, 0, "", "",
>> +"0f 38 cd 40 12       \tsha256msg2 0x12(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x45, 0x12, }, 5, 0, "", "",
>> +"0f 38 cd 45 12       \tsha256msg2 0x12(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x44, 0x01, 0x12, }, 6, 0, "", "",
>> +"0f 38 cd 44 01 12    \tsha256msg2 0x12(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x44, 0x05, 0x12, }, 6, 0, "", "",
>> +"0f 38 cd 44 05 12    \tsha256msg2 0x12(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x44, 0x08, 0x12, }, 6, 0, "", "",
>> +"0f 38 cd 44 08 12    \tsha256msg2 0x12(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
>> +"0f 38 cd 44 c8 12    \tsha256msg2 0x12(%rax,%rcx,8),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cd 80 78 56 34 12 \tsha256msg2 0x12345678(%rax),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
>> +"0f 38 cd 85 78 56 34 12 \tsha256msg2 0x12345678(%rbp),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 84 01 78 56 34 12 \tsha256msg2 0x12345678(%rcx,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 84 05 78 56 34 12 \tsha256msg2 0x12345678(%rbp,%rax,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,1),%xmm0",},
>> +{{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
>> +"0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm0",},
>> +{{0x44, 0x0f, 0x38, 0xcd, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
>> +"44 0f 38 cd bc c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm15",},
>> diff --git a/tools/perf/tests/insn-x86-dat-src.c b/tools/perf/tests/insn-x86-dat-src.c
>> index b506830f33a8..7d06c9b22070 100644
>> --- a/tools/perf/tests/insn-x86-dat-src.c
>> +++ b/tools/perf/tests/insn-x86-dat-src.c
>> @@ -217,6 +217,210 @@ int main(void)
>>   	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
>>   	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0 */
>>
>> +	/* sha1rnds4 imm8, xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha1rnds4 $0x0, %xmm1, %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm2");
>> +	asm volatile("sha1rnds4 $0x91, %xmm8, %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm8");
>> +	asm volatile("sha1rnds4 $0x91, %xmm15, %xmm8");
>> +	asm volatile("sha1rnds4 $0x91, (%rax), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (%r8), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (0x12345678), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (%rax), %xmm3");
>> +	asm volatile("sha1rnds4 $0x91, (%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(,%rax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%rbp), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rbp), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,8), %xmm15");
>> +
>> +	/* sha1nexte xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha1nexte %xmm1, %xmm0");
>> +	asm volatile("sha1nexte %xmm7, %xmm2");
>> +	asm volatile("sha1nexte %xmm8, %xmm0");
>> +	asm volatile("sha1nexte %xmm7, %xmm8");
>> +	asm volatile("sha1nexte %xmm15, %xmm8");
>> +	asm volatile("sha1nexte (%rax), %xmm0");
>> +	asm volatile("sha1nexte (%r8), %xmm0");
>> +	asm volatile("sha1nexte (0x12345678), %xmm0");
>> +	asm volatile("sha1nexte (%rax), %xmm3");
>> +	asm volatile("sha1nexte (%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(,%rax,1), %xmm0");
>> +	asm volatile("sha1nexte (%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1nexte (%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%rax), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%rbp), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%rax), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%rbp), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,8), %xmm15");
>> +
>> +	/* sha1msg1 xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha1msg1 %xmm1, %xmm0");
>> +	asm volatile("sha1msg1 %xmm7, %xmm2");
>> +	asm volatile("sha1msg1 %xmm8, %xmm0");
>> +	asm volatile("sha1msg1 %xmm7, %xmm8");
>> +	asm volatile("sha1msg1 %xmm15, %xmm8");
>> +	asm volatile("sha1msg1 (%rax), %xmm0");
>> +	asm volatile("sha1msg1 (%r8), %xmm0");
>> +	asm volatile("sha1msg1 (0x12345678), %xmm0");
>> +	asm volatile("sha1msg1 (%rax), %xmm3");
>> +	asm volatile("sha1msg1 (%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(,%rax,1), %xmm0");
>> +	asm volatile("sha1msg1 (%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1msg1 (%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%rax), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%rbp), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%rax), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%rbp), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,8), %xmm15");
>> +
>> +	/* sha1msg2 xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha1msg2 %xmm1, %xmm0");
>> +	asm volatile("sha1msg2 %xmm7, %xmm2");
>> +	asm volatile("sha1msg2 %xmm8, %xmm0");
>> +	asm volatile("sha1msg2 %xmm7, %xmm8");
>> +	asm volatile("sha1msg2 %xmm15, %xmm8");
>> +	asm volatile("sha1msg2 (%rax), %xmm0");
>> +	asm volatile("sha1msg2 (%r8), %xmm0");
>> +	asm volatile("sha1msg2 (0x12345678), %xmm0");
>> +	asm volatile("sha1msg2 (%rax), %xmm3");
>> +	asm volatile("sha1msg2 (%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(,%rax,1), %xmm0");
>> +	asm volatile("sha1msg2 (%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1msg2 (%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%rax), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%rbp), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%rax), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%rbp), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,8), %xmm15");
>> +
>> +	/* sha256rnds2 <XMM0>, xmm2/m128, xmm1 */
>> +	/* Note sha256rnds2 has an implicit operand 'xmm0' */
>> +
>> +	asm volatile("sha256rnds2 %xmm4, %xmm1");
>> +	asm volatile("sha256rnds2 %xmm7, %xmm2");
>> +	asm volatile("sha256rnds2 %xmm8, %xmm1");
>> +	asm volatile("sha256rnds2 %xmm7, %xmm8");
>> +	asm volatile("sha256rnds2 %xmm15, %xmm8");
>> +	asm volatile("sha256rnds2 (%rax), %xmm1");
>> +	asm volatile("sha256rnds2 (%r8), %xmm1");
>> +	asm volatile("sha256rnds2 (0x12345678), %xmm1");
>> +	asm volatile("sha256rnds2 (%rax), %xmm3");
>> +	asm volatile("sha256rnds2 (%rcx,%rax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(,%rax,1), %xmm1");
>> +	asm volatile("sha256rnds2 (%rax,%rcx,1), %xmm1");
>> +	asm volatile("sha256rnds2 (%rax,%rcx,8), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%rax), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%rbp), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%rcx,%rax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%rbp,%rax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%rax,%rcx,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%rax,%rcx,8), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%rax), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%rbp), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%rcx,%rax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%rbp,%rax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,8), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,8), %xmm15");
>> +
>> +	/* sha256msg1 xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha256msg1 %xmm1, %xmm0");
>> +	asm volatile("sha256msg1 %xmm7, %xmm2");
>> +	asm volatile("sha256msg1 %xmm8, %xmm0");
>> +	asm volatile("sha256msg1 %xmm7, %xmm8");
>> +	asm volatile("sha256msg1 %xmm15, %xmm8");
>> +	asm volatile("sha256msg1 (%rax), %xmm0");
>> +	asm volatile("sha256msg1 (%r8), %xmm0");
>> +	asm volatile("sha256msg1 (0x12345678), %xmm0");
>> +	asm volatile("sha256msg1 (%rax), %xmm3");
>> +	asm volatile("sha256msg1 (%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(,%rax,1), %xmm0");
>> +	asm volatile("sha256msg1 (%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha256msg1 (%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%rax), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%rbp), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%rax), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%rbp), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,8), %xmm15");
>> +
>> +	/* sha256msg2 xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha256msg2 %xmm1, %xmm0");
>> +	asm volatile("sha256msg2 %xmm7, %xmm2");
>> +	asm volatile("sha256msg2 %xmm8, %xmm0");
>> +	asm volatile("sha256msg2 %xmm7, %xmm8");
>> +	asm volatile("sha256msg2 %xmm15, %xmm8");
>> +	asm volatile("sha256msg2 (%rax), %xmm0");
>> +	asm volatile("sha256msg2 (%r8), %xmm0");
>> +	asm volatile("sha256msg2 (0x12345678), %xmm0");
>> +	asm volatile("sha256msg2 (%rax), %xmm3");
>> +	asm volatile("sha256msg2 (%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(,%rax,1), %xmm0");
>> +	asm volatile("sha256msg2 (%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha256msg2 (%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%rax), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%rbp), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%rax), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%rbp), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%rcx,%rax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%rbp,%rax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm15");
>> +
>>   #else  /* #ifdef __x86_64__ */
>>
>>   	/* bndmk m32, bnd */
>> @@ -407,6 +611,175 @@ int main(void)
>>   	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
>>   	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0xfffffffc */
>>
>> +	/* sha1rnds4 imm8, xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha1rnds4 $0x0, %xmm1, %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm2");
>> +	asm volatile("sha1rnds4 $0x91, (%eax), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (0x12345678), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (%eax), %xmm3");
>> +	asm volatile("sha1rnds4 $0x91, (%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(,%eax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, (%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%ebp), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ebp), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax,%ecx,8), %xmm0");
>> +
>> +	/* sha1nexte xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha1nexte %xmm1, %xmm0");
>> +	asm volatile("sha1nexte %xmm7, %xmm2");
>> +	asm volatile("sha1nexte (%eax), %xmm0");
>> +	asm volatile("sha1nexte (0x12345678), %xmm0");
>> +	asm volatile("sha1nexte (%eax), %xmm3");
>> +	asm volatile("sha1nexte (%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(,%eax,1), %xmm0");
>> +	asm volatile("sha1nexte (%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1nexte (%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%eax), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%ebp), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12(%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%eax), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%ebp), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1nexte 0x12345678(%eax,%ecx,8), %xmm0");
>> +
>> +	/* sha1msg1 xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha1msg1 %xmm1, %xmm0");
>> +	asm volatile("sha1msg1 %xmm7, %xmm2");
>> +	asm volatile("sha1msg1 (%eax), %xmm0");
>> +	asm volatile("sha1msg1 (0x12345678), %xmm0");
>> +	asm volatile("sha1msg1 (%eax), %xmm3");
>> +	asm volatile("sha1msg1 (%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(,%eax,1), %xmm0");
>> +	asm volatile("sha1msg1 (%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1msg1 (%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%eax), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%ebp), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12(%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%eax), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%ebp), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1msg1 0x12345678(%eax,%ecx,8), %xmm0");
>> +
>> +	/* sha1msg2 xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha1msg2 %xmm1, %xmm0");
>> +	asm volatile("sha1msg2 %xmm7, %xmm2");
>> +	asm volatile("sha1msg2 (%eax), %xmm0");
>> +	asm volatile("sha1msg2 (0x12345678), %xmm0");
>> +	asm volatile("sha1msg2 (%eax), %xmm3");
>> +	asm volatile("sha1msg2 (%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(,%eax,1), %xmm0");
>> +	asm volatile("sha1msg2 (%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1msg2 (%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%eax), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%ebp), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12(%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%eax), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%ebp), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha1msg2 0x12345678(%eax,%ecx,8), %xmm0");
>> +
>> +	/* sha256rnds2 <XMM0>, xmm2/m128, xmm1 */
>> +	/* Note sha256rnds2 has an implicit operand 'xmm0' */
>> +
>> +	asm volatile("sha256rnds2 %xmm4, %xmm1");
>> +	asm volatile("sha256rnds2 %xmm7, %xmm2");
>> +	asm volatile("sha256rnds2 (%eax), %xmm1");
>> +	asm volatile("sha256rnds2 (0x12345678), %xmm1");
>> +	asm volatile("sha256rnds2 (%eax), %xmm3");
>> +	asm volatile("sha256rnds2 (%ecx,%eax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(,%eax,1), %xmm1");
>> +	asm volatile("sha256rnds2 (%eax,%ecx,1), %xmm1");
>> +	asm volatile("sha256rnds2 (%eax,%ecx,8), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%eax), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%ebp), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%ecx,%eax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%ebp,%eax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%eax,%ecx,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12(%eax,%ecx,8), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%eax), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%ebp), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%ecx,%eax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%ebp,%eax,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%eax,%ecx,1), %xmm1");
>> +	asm volatile("sha256rnds2 0x12345678(%eax,%ecx,8), %xmm1");
>> +
>> +	/* sha256msg1 xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha256msg1 %xmm1, %xmm0");
>> +	asm volatile("sha256msg1 %xmm7, %xmm2");
>> +	asm volatile("sha256msg1 (%eax), %xmm0");
>> +	asm volatile("sha256msg1 (0x12345678), %xmm0");
>> +	asm volatile("sha256msg1 (%eax), %xmm3");
>> +	asm volatile("sha256msg1 (%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(,%eax,1), %xmm0");
>> +	asm volatile("sha256msg1 (%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha256msg1 (%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%eax), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%ebp), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12(%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%eax), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%ebp), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha256msg1 0x12345678(%eax,%ecx,8), %xmm0");
>> +
>> +	/* sha256msg2 xmm2/m128, xmm1 */
>> +
>> +	asm volatile("sha256msg2 %xmm1, %xmm0");
>> +	asm volatile("sha256msg2 %xmm7, %xmm2");
>> +	asm volatile("sha256msg2 (%eax), %xmm0");
>> +	asm volatile("sha256msg2 (0x12345678), %xmm0");
>> +	asm volatile("sha256msg2 (%eax), %xmm3");
>> +	asm volatile("sha256msg2 (%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(,%eax,1), %xmm0");
>> +	asm volatile("sha256msg2 (%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha256msg2 (%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%eax), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%ebp), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12(%eax,%ecx,8), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%eax), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%ebp), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%ecx,%eax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%ebp,%eax,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%eax,%ecx,1), %xmm0");
>> +	asm volatile("sha256msg2 0x12345678(%eax,%ecx,8), %xmm0");
>> +
>>   #endif /* #ifndef __x86_64__ */
>>
>>   	/* Following line is a marker for the awk script - do not change */
>> diff --git a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
>> index a02a195d219c..25dad388b371 100644
>> --- a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
>> +++ b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
>> @@ -736,6 +736,12 @@ bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
>>   be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
>>   bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
>>   # 0x0f 0x38 0xc0-0xff
>> +c8: sha1nexte Vdq,Wdq
>> +c9: sha1msg1 Vdq,Wdq
>> +ca: sha1msg2 Vdq,Wdq
>> +cb: sha256rnds2 Vdq,Wdq
>> +cc: sha256msg1 Vdq,Wdq
>> +cd: sha256msg2 Vdq,Wdq
>>   db: VAESIMC Vdq,Wdq (66),(v1)
>>   dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
>>   dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
>> @@ -794,6 +800,7 @@ AVXcode: 3
>>   61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
>>   62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
>>   63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
>> +cc: sha1rnds4 Vdq,Wdq,Ib
>>   df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
>>   f0: RORX Gy,Ey,Ib (F2),(v)
>>   EndTable
>> --
>> 1.9.1

^ permalink raw reply	[flat|nested] 27+ messages in thread
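
For context on how the generated .dat tables in the patch above are used: each entry pairs an instruction's raw bytes with its expected length and objdump disassembly, and the new test feeds those bytes back through the shared x86 instruction decoder, checking (among other things) that the decoder reports the same length. A minimal sketch of that check follows; the struct layout and helper name are invented here for illustration, not taken from tools/perf/tests/insn-x86.c:

	/*
	 * Illustrative sketch only: the struct layout and helper below are
	 * assumptions made for explanation, not the code from insn-x86.c.
	 */
	#include <asm/insn.h>	/* insn_init(), insn_get_length(), MAX_INSN_SIZE */

	struct sketch_insn_dat {
		unsigned char data[MAX_INSN_SIZE];	/* expected machine code   */
		int expected_length;			/* length recorded in .dat */
	};

	static int sketch_check_one(const struct sketch_insn_dat *dat, int x86_64)
	{
		struct insn insn;

		/* Run the raw bytes through the shared x86 instruction decoder */
		insn_init(&insn, dat->data, MAX_INSN_SIZE, x86_64);
		insn_get_length(&insn);

		/* The decoder must consume exactly the bytes the assembler emitted */
		return insn.length == dat->expected_length ? 0 : -1;
	}
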

* RE: [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions
  2015-08-31 13:58 ` [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions Adrian Hunter
  2015-08-31 14:50   ` Arnaldo Carvalho de Melo
@ 2015-09-01  0:08   ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 0 replies; 27+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-01  0:08 UTC (permalink / raw)
  To: 'Adrian Hunter', Arnaldo Carvalho de Melo
  Cc: linux-kernel@vger.kernel.org, Jiri Olsa, Andy Lutomirski,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

> From: Adrian Hunter [mailto:adrian.hunter@intel.com]
> 
> Intel SHA Extensions are explained in the Intel Architecture
> Instruction Set Extensions Programming Reference (Oct 2014).
> There are 7 new instructions.  Add them to the opcode map
> and the perf tools new instructions test. e.g.
> 
>     $ tools/perf/perf test list 2>&1 | grep "x86 ins"
>     39: Test x86 instruction decoder - new instructions
>     $ tools/perf/perf test 39
>     39: Test x86 instruction decoder - new instructions          : Ok
> 
> Or to see the details:
> 
>     $ tools/perf/perf test -v 39 2>&1 | grep sha

OK, looks fine to me :)

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thanks!
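
For readers less familiar with the opcode map syntax in these hunks, the notation follows the conventions documented at the top of x86-opcode-map.txt; the annotated lines below are an explanatory gloss, not lines from the patch:

	# Entries live under the table for their escape prefix, e.g.
	# "Table: 3-byte opcode 1 (0x0f 0x38)", so "c8:" means opcode 0f 38 c8.
	c8: sha1nexte Vdq,Wdq     # Vdq: XMM register selected by ModRM.reg
	                          # Wdq: XMM register or 128-bit memory via ModRM.rm
	# sha1rnds4 sits in the 0x0f 0x3a table and takes an immediate byte (Ib):
	cc: sha1rnds4 Vdq,Wdq,Ib
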

> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  arch/x86/lib/x86-opcode-map.txt                    |   7 +
>  tools/perf/tests/insn-x86-dat-32.c                 | 294 ++++++++++++++++
>  tools/perf/tests/insn-x86-dat-64.c                 | 364 ++++++++++++++++++++
>  tools/perf/tests/insn-x86-dat-src.c                | 373 +++++++++++++++++++++
>  .../perf/util/intel-pt-decoder/x86-opcode-map.txt  |   7 +
>  5 files changed, 1045 insertions(+)
> 
> diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
> index a02a195d219c..25dad388b371 100644
> --- a/arch/x86/lib/x86-opcode-map.txt
> +++ b/arch/x86/lib/x86-opcode-map.txt
> @@ -736,6 +736,12 @@ bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
>  be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
>  bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
>  # 0x0f 0x38 0xc0-0xff
> +c8: sha1nexte Vdq,Wdq
> +c9: sha1msg1 Vdq,Wdq
> +ca: sha1msg2 Vdq,Wdq
> +cb: sha256rnds2 Vdq,Wdq
> +cc: sha256msg1 Vdq,Wdq
> +cd: sha256msg2 Vdq,Wdq
>  db: VAESIMC Vdq,Wdq (66),(v1)
>  dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
>  dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
> @@ -794,6 +800,7 @@ AVXcode: 3
>  61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
>  62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
>  63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
> +cc: sha1rnds4 Vdq,Wdq,Ib
>  df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
>  f0: RORX Gy,Ey,Ib (F2),(v)
>  EndTable
> diff --git a/tools/perf/tests/insn-x86-dat-32.c b/tools/perf/tests/insn-x86-dat-32.c
> index 6a38a34a5a49..83f5078e74e1 100644
> --- a/tools/perf/tests/insn-x86-dat-32.c
> +++ b/tools/perf/tests/insn-x86-dat-32.c
> @@ -322,3 +322,297 @@
>  "f2 ff 21             \tbnd jmp *(%ecx)",},
>  {{0xf2, 0x0f, 0x85, 0xfc, 0xff, 0xff, 0xff, }, 7, 0xfffffffc, "jcc", "conditional",
>  "f2 0f 85 fc ff ff ff \tbnd jne 3de <main+0x3de>",},
> +{{0x0f, 0x3a, 0xcc, 0xc1, 0x00, }, 5, 0, "", "",
> +"0f 3a cc c1 00       \tsha1rnds4 $0x0,%xmm1,%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0xd7, 0x91, }, 5, 0, "", "",
> +"0f 3a cc d7 91       \tsha1rnds4 $0x91,%xmm7,%xmm2",},
> +{{0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 5, 0, "", "",
> +"0f 3a cc 00 91       \tsha1rnds4 $0x91,(%eax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678,%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x18, 0x91, }, 5, 0, "", "",
> +"0f 3a cc 18 91       \tsha1rnds4 $0x91,(%eax),%xmm3",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x01, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 01 91    \tsha1rnds4 $0x91,(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 04 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x08, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 08 91    \tsha1rnds4 $0x91,(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0xc8, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 c8 91    \tsha1rnds4 $0x91,(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x40, 0x12, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 40 12 91    \tsha1rnds4 $0x91,0x12(%eax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x45, 0x12, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 45 12 91    \tsha1rnds4 $0x91,0x12(%ebp),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x01, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 01 12 91 \tsha1rnds4 $0x91,0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x05, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 05 12 91 \tsha1rnds4 $0x91,0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x08, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 08 12 91 \tsha1rnds4 $0x91,0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0xc8, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 c8 12 91 \tsha1rnds4 $0x91,0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 80 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 85 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 01 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 08 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0xc1, }, 4, 0, "", "",
> +"0f 38 c8 c1          \tsha1nexte %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0xd7, }, 4, 0, "", "",
> +"0f 38 c8 d7          \tsha1nexte %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
> +"0f 38 c8 00          \tsha1nexte (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 05 78 56 34 12 \tsha1nexte 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x18, }, 4, 0, "", "",
> +"0f 38 c8 18          \tsha1nexte (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 c8 04 01       \tsha1nexte (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 04 05 78 56 34 12 \tsha1nexte 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 c8 04 08       \tsha1nexte (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 c8 04 c8       \tsha1nexte (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 c8 40 12       \tsha1nexte 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 c8 45 12       \tsha1nexte 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 01 12    \tsha1nexte 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 05 12    \tsha1nexte 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 08 12    \tsha1nexte 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 c8 12    \tsha1nexte 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 80 78 56 34 12 \tsha1nexte 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 85 78 56 34 12 \tsha1nexte 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 01 78 56 34 12 \tsha1nexte 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 05 78 56 34 12 \tsha1nexte 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 08 78 56 34 12 \tsha1nexte 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 c8 78 56 34 12 \tsha1nexte 0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0xc1, }, 4, 0, "", "",
> +"0f 38 c9 c1          \tsha1msg1 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0xd7, }, 4, 0, "", "",
> +"0f 38 c9 d7          \tsha1msg1 %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xc9, 0x00, }, 4, 0, "", "",
> +"0f 38 c9 00          \tsha1msg1 (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 05 78 56 34 12 \tsha1msg1 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x18, }, 4, 0, "", "",
> +"0f 38 c9 18          \tsha1msg1 (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 c9 04 01       \tsha1msg1 (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 04 05 78 56 34 12 \tsha1msg1 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 c9 04 08       \tsha1msg1 (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 c9 04 c8       \tsha1msg1 (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 c9 40 12       \tsha1msg1 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 c9 45 12       \tsha1msg1 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 01 12    \tsha1msg1 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 05 12    \tsha1msg1 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 08 12    \tsha1msg1 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 c8 12    \tsha1msg1 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 80 78 56 34 12 \tsha1msg1 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 85 78 56 34 12 \tsha1msg1 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 01 78 56 34 12 \tsha1msg1 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 05 78 56 34 12 \tsha1msg1 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 08 78 56 34 12 \tsha1msg1 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 c8 78 56 34 12 \tsha1msg1 0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0xc1, }, 4, 0, "", "",
> +"0f 38 ca c1          \tsha1msg2 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xca, 0xd7, }, 4, 0, "", "",
> +"0f 38 ca d7          \tsha1msg2 %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xca, 0x00, }, 4, 0, "", "",
> +"0f 38 ca 00          \tsha1msg2 (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 05 78 56 34 12 \tsha1msg2 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x18, }, 4, 0, "", "",
> +"0f 38 ca 18          \tsha1msg2 (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 ca 04 01       \tsha1msg2 (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 04 05 78 56 34 12 \tsha1msg2 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 ca 04 08       \tsha1msg2 (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 ca 04 c8       \tsha1msg2 (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 ca 40 12       \tsha1msg2 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 ca 45 12       \tsha1msg2 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 01 12    \tsha1msg2 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 05 12    \tsha1msg2 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 08 12    \tsha1msg2 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 c8 12    \tsha1msg2 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 80 78 56 34 12 \tsha1msg2 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 85 78 56 34 12 \tsha1msg2 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 01 78 56 34 12 \tsha1msg2 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 05 78 56 34 12 \tsha1msg2 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 08 78 56 34 12 \tsha1msg2 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 c8 78 56 34 12 \tsha1msg2 0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcb, 0xcc, }, 4, 0, "", "",
> +"0f 38 cb cc          \tsha256rnds2 %xmm0,%xmm4,%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0xd7, }, 4, 0, "", "",
> +"0f 38 cb d7          \tsha256rnds2 %xmm0,%xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xcb, 0x08, }, 4, 0, "", "",
> +"0f 38 cb 08          \tsha256rnds2 %xmm0,(%eax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 0d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678,%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x18, }, 4, 0, "", "",
> +"0f 38 cb 18          \tsha256rnds2 %xmm0,(%eax),%xmm3",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x01, }, 5, 0, "", "",
> +"0f 38 cb 0c 01       \tsha256rnds2 %xmm0,(%ecx,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 0c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x08, }, 5, 0, "", "",
> +"0f 38 cb 0c 08       \tsha256rnds2 %xmm0,(%eax,%ecx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0xc8, }, 5, 0, "", "",
> +"0f 38 cb 0c c8       \tsha256rnds2 %xmm0,(%eax,%ecx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x48, 0x12, }, 5, 0, "", "",
> +"0f 38 cb 48 12       \tsha256rnds2 %xmm0,0x12(%eax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4d, 0x12, }, 5, 0, "", "",
> +"0f 38 cb 4d 12       \tsha256rnds2 %xmm0,0x12(%ebp),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 01 12    \tsha256rnds2 %xmm0,0x12(%ecx,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 05 12    \tsha256rnds2 %xmm0,0x12(%ebp,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 08 12    \tsha256rnds2 %xmm0,0x12(%eax,%ecx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c c8 12    \tsha256rnds2 %xmm0,0x12(%eax,%ecx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x88, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 88 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 8d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ebp),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 01 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ecx,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%ebp,%eax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 08 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax,%ecx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%eax,%ecx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcc, 0xc1, }, 4, 0, "", "",
> +"0f 38 cc c1          \tsha256msg1 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0xd7, }, 4, 0, "", "",
> +"0f 38 cc d7          \tsha256msg1 %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xcc, 0x00, }, 4, 0, "", "",
> +"0f 38 cc 00          \tsha256msg1 (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 05 78 56 34 12 \tsha256msg1 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x18, }, 4, 0, "", "",
> +"0f 38 cc 18          \tsha256msg1 (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 cc 04 01       \tsha256msg1 (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 04 05 78 56 34 12 \tsha256msg1 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 cc 04 08       \tsha256msg1 (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 cc 04 c8       \tsha256msg1 (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 cc 40 12       \tsha256msg1 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 cc 45 12       \tsha256msg1 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 01 12    \tsha256msg1 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 05 12    \tsha256msg1 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 08 12    \tsha256msg1 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 c8 12    \tsha256msg1 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 80 78 56 34 12 \tsha256msg1 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 85 78 56 34 12 \tsha256msg1 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 01 78 56 34 12 \tsha256msg1 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 05 78 56 34 12 \tsha256msg1 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 08 78 56 34 12 \tsha256msg1 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 c8 78 56 34 12 \tsha256msg1 0x12345678(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0xc1, }, 4, 0, "", "",
> +"0f 38 cd c1          \tsha256msg2 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0xd7, }, 4, 0, "", "",
> +"0f 38 cd d7          \tsha256msg2 %xmm7,%xmm2",},
> +{{0x0f, 0x38, 0xcd, 0x00, }, 4, 0, "", "",
> +"0f 38 cd 00          \tsha256msg2 (%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 05 78 56 34 12 \tsha256msg2 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x18, }, 4, 0, "", "",
> +"0f 38 cd 18          \tsha256msg2 (%eax),%xmm3",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 cd 04 01       \tsha256msg2 (%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 04 05 78 56 34 12 \tsha256msg2 0x12345678(,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 cd 04 08       \tsha256msg2 (%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 cd 04 c8       \tsha256msg2 (%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 cd 40 12       \tsha256msg2 0x12(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 cd 45 12       \tsha256msg2 0x12(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 01 12    \tsha256msg2 0x12(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 05 12    \tsha256msg2 0x12(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 08 12    \tsha256msg2 0x12(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 c8 12    \tsha256msg2 0x12(%eax,%ecx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 80 78 56 34 12 \tsha256msg2 0x12345678(%eax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 85 78 56 34 12 \tsha256msg2 0x12345678(%ebp),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 01 78 56 34 12 \tsha256msg2 0x12345678(%ecx,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 05 78 56 34 12 \tsha256msg2 0x12345678(%ebp,%eax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%eax,%ecx,8),%xmm0",},
> diff --git a/tools/perf/tests/insn-x86-dat-64.c b/tools/perf/tests/insn-x86-dat-64.c
> index 01122421a776..13f008588590 100644
> --- a/tools/perf/tests/insn-x86-dat-64.c
> +++ b/tools/perf/tests/insn-x86-dat-64.c
> @@ -338,3 +338,367 @@
>  "67 f2 ff 21          \tbnd jmpq *(%ecx)",},
>  {{0xf2, 0x0f, 0x85, 0x00, 0x00, 0x00, 0x00, }, 7, 0, "jcc", "conditional",
>  "f2 0f 85 00 00 00 00 \tbnd jne 413 <main+0x413>",},
> +{{0x0f, 0x3a, 0xcc, 0xc1, 0x00, }, 5, 0, "", "",
> +"0f 3a cc c1 00       \tsha1rnds4 $0x0,%xmm1,%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0xd7, 0x91, }, 5, 0, "", "",
> +"0f 3a cc d7 91       \tsha1rnds4 $0x91,%xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x3a, 0xcc, 0xc0, 0x91, }, 6, 0, "", "",
> +"41 0f 3a cc c0 91    \tsha1rnds4 $0x91,%xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x3a, 0xcc, 0xc7, 0x91, }, 6, 0, "", "",
> +"44 0f 3a cc c7 91    \tsha1rnds4 $0x91,%xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x3a, 0xcc, 0xc7, 0x91, }, 6, 0, "", "",
> +"45 0f 3a cc c7 91    \tsha1rnds4 $0x91,%xmm15,%xmm8",},
> +{{0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 5, 0, "", "",
> +"0f 3a cc 00 91       \tsha1rnds4 $0x91,(%rax),%xmm0",},
> +{{0x41, 0x0f, 0x3a, 0xcc, 0x00, 0x91, }, 6, 0, "", "",
> +"41 0f 3a cc 00 91    \tsha1rnds4 $0x91,(%r8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 04 25 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678,%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x18, 0x91, }, 5, 0, "", "",
> +"0f 3a cc 18 91       \tsha1rnds4 $0x91,(%rax),%xmm3",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x01, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 01 91    \tsha1rnds4 $0x91,(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 04 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0x08, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 08 91    \tsha1rnds4 $0x91,(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x04, 0xc8, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 04 c8 91    \tsha1rnds4 $0x91,(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x40, 0x12, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 40 12 91    \tsha1rnds4 $0x91,0x12(%rax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x45, 0x12, 0x91, }, 6, 0, "", "",
> +"0f 3a cc 45 12 91    \tsha1rnds4 $0x91,0x12(%rbp),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x01, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 01 12 91 \tsha1rnds4 $0x91,0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x05, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 05 12 91 \tsha1rnds4 $0x91,0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0x08, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 08 12 91 \tsha1rnds4 $0x91,0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x44, 0xc8, 0x12, 0x91, }, 7, 0, "", "",
> +"0f 3a cc 44 c8 12 91 \tsha1rnds4 $0x91,0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 80 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, 0x91, }, 9, 0, "", "",
> +"0f 3a cc 85 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 01 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 05 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 08 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x3a, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 10, 0, "", "",
> +"0f 3a cc 84 c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x3a, 0xcc, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, 0x91, }, 11, 0, "", "",
> +"44 0f 3a cc bc c8 78 56 34 12 91 \tsha1rnds4 $0x91,0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xc8, 0xc1, }, 4, 0, "", "",
> +"0f 38 c8 c1          \tsha1nexte %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0xd7, }, 4, 0, "", "",
> +"0f 38 c8 d7          \tsha1nexte %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xc8, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 c8 c0       \tsha1nexte %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xc8, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 c8 c7       \tsha1nexte %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xc8, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 c8 c7       \tsha1nexte %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xc8, 0x00, }, 4, 0, "", "",
> +"0f 38 c8 00          \tsha1nexte (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xc8, 0x00, }, 5, 0, "", "",
> +"41 0f 38 c8 00       \tsha1nexte (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 04 25 78 56 34 12 \tsha1nexte 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x18, }, 4, 0, "", "",
> +"0f 38 c8 18          \tsha1nexte (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 c8 04 01       \tsha1nexte (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 04 05 78 56 34 12 \tsha1nexte 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 c8 04 08       \tsha1nexte (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 c8 04 c8       \tsha1nexte (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 c8 40 12       \tsha1nexte 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 c8 45 12       \tsha1nexte 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 01 12    \tsha1nexte 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 05 12    \tsha1nexte 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 08 12    \tsha1nexte 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 c8 44 c8 12    \tsha1nexte 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 80 78 56 34 12 \tsha1nexte 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c8 85 78 56 34 12 \tsha1nexte 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 01 78 56 34 12 \tsha1nexte 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 05 78 56 34 12 \tsha1nexte 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 08 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc8, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c8 84 c8 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xc8, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 c8 bc c8 78 56 34 12 \tsha1nexte 0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xc9, 0xc1, }, 4, 0, "", "",
> +"0f 38 c9 c1          \tsha1msg1 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0xd7, }, 4, 0, "", "",
> +"0f 38 c9 d7          \tsha1msg1 %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xc9, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 c9 c0       \tsha1msg1 %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xc9, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 c9 c7       \tsha1msg1 %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xc9, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 c9 c7       \tsha1msg1 %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xc9, 0x00, }, 4, 0, "", "",
> +"0f 38 c9 00          \tsha1msg1 (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xc9, 0x00, }, 5, 0, "", "",
> +"41 0f 38 c9 00       \tsha1msg1 (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 04 25 78 56 34 12 \tsha1msg1 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x18, }, 4, 0, "", "",
> +"0f 38 c9 18          \tsha1msg1 (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 c9 04 01       \tsha1msg1 (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 04 05 78 56 34 12 \tsha1msg1 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 c9 04 08       \tsha1msg1 (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 c9 04 c8       \tsha1msg1 (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 c9 40 12       \tsha1msg1 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 c9 45 12       \tsha1msg1 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 01 12    \tsha1msg1 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 05 12    \tsha1msg1 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 08 12    \tsha1msg1 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 c9 44 c8 12    \tsha1msg1 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 80 78 56 34 12 \tsha1msg1 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 c9 85 78 56 34 12 \tsha1msg1 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 01 78 56 34 12 \tsha1msg1 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 05 78 56 34 12 \tsha1msg1 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 08 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xc9, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 c9 84 c8 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xc9, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 c9 bc c8 78 56 34 12 \tsha1msg1 0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xca, 0xc1, }, 4, 0, "", "",
> +"0f 38 ca c1          \tsha1msg2 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xca, 0xd7, }, 4, 0, "", "",
> +"0f 38 ca d7          \tsha1msg2 %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xca, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 ca c0       \tsha1msg2 %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xca, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 ca c7       \tsha1msg2 %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xca, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 ca c7       \tsha1msg2 %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xca, 0x00, }, 4, 0, "", "",
> +"0f 38 ca 00          \tsha1msg2 (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xca, 0x00, }, 5, 0, "", "",
> +"41 0f 38 ca 00       \tsha1msg2 (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 04 25 78 56 34 12 \tsha1msg2 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x18, }, 4, 0, "", "",
> +"0f 38 ca 18          \tsha1msg2 (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 ca 04 01       \tsha1msg2 (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 04 05 78 56 34 12 \tsha1msg2 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 ca 04 08       \tsha1msg2 (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 ca 04 c8       \tsha1msg2 (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 ca 40 12       \tsha1msg2 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 ca 45 12       \tsha1msg2 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 01 12    \tsha1msg2 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 05 12    \tsha1msg2 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 08 12    \tsha1msg2 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 ca 44 c8 12    \tsha1msg2 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 80 78 56 34 12 \tsha1msg2 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 ca 85 78 56 34 12 \tsha1msg2 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 01 78 56 34 12 \tsha1msg2 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 05 78 56 34 12 \tsha1msg2 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 08 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xca, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 ca 84 c8 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xca, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 ca bc c8 78 56 34 12 \tsha1msg2 0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xcb, 0xcc, }, 4, 0, "", "",
> +"0f 38 cb cc          \tsha256rnds2 %xmm0,%xmm4,%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0xd7, }, 4, 0, "", "",
> +"0f 38 cb d7          \tsha256rnds2 %xmm0,%xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xcb, 0xc8, }, 5, 0, "", "",
> +"41 0f 38 cb c8       \tsha256rnds2 %xmm0,%xmm8,%xmm1",},
> +{{0x44, 0x0f, 0x38, 0xcb, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 cb c7       \tsha256rnds2 %xmm0,%xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xcb, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 cb c7       \tsha256rnds2 %xmm0,%xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xcb, 0x08, }, 4, 0, "", "",
> +"0f 38 cb 08          \tsha256rnds2 %xmm0,(%rax),%xmm1",},
> +{{0x41, 0x0f, 0x38, 0xcb, 0x08, }, 5, 0, "", "",
> +"41 0f 38 cb 08       \tsha256rnds2 %xmm0,(%r8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 0c 25 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678,%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x18, }, 4, 0, "", "",
> +"0f 38 cb 18          \tsha256rnds2 %xmm0,(%rax),%xmm3",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x01, }, 5, 0, "", "",
> +"0f 38 cb 0c 01       \tsha256rnds2 %xmm0,(%rcx,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 0c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0x08, }, 5, 0, "", "",
> +"0f 38 cb 0c 08       \tsha256rnds2 %xmm0,(%rax,%rcx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x0c, 0xc8, }, 5, 0, "", "",
> +"0f 38 cb 0c c8       \tsha256rnds2 %xmm0,(%rax,%rcx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x48, 0x12, }, 5, 0, "", "",
> +"0f 38 cb 48 12       \tsha256rnds2 %xmm0,0x12(%rax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4d, 0x12, }, 5, 0, "", "",
> +"0f 38 cb 4d 12       \tsha256rnds2 %xmm0,0x12(%rbp),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 01 12    \tsha256rnds2 %xmm0,0x12(%rcx,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 05 12    \tsha256rnds2 %xmm0,0x12(%rbp,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c 08 12    \tsha256rnds2 %xmm0,0x12(%rax,%rcx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x4c, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cb 4c c8 12    \tsha256rnds2 %xmm0,0x12(%rax,%rcx,8),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x88, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 88 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8d, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cb 8d 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rbp),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 01 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rcx,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 05 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rbp,%rax,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c 08 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,1),%xmm1",},
> +{{0x0f, 0x38, 0xcb, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cb 8c c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm1",},
> +{{0x44, 0x0f, 0x38, 0xcb, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 cb bc c8 78 56 34 12 \tsha256rnds2 %xmm0,0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xcc, 0xc1, }, 4, 0, "", "",
> +"0f 38 cc c1          \tsha256msg1 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0xd7, }, 4, 0, "", "",
> +"0f 38 cc d7          \tsha256msg1 %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xcc, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 cc c0       \tsha256msg1 %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xcc, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 cc c7       \tsha256msg1 %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xcc, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 cc c7       \tsha256msg1 %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xcc, 0x00, }, 4, 0, "", "",
> +"0f 38 cc 00          \tsha256msg1 (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xcc, 0x00, }, 5, 0, "", "",
> +"41 0f 38 cc 00       \tsha256msg1 (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 04 25 78 56 34 12 \tsha256msg1 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x18, }, 4, 0, "", "",
> +"0f 38 cc 18          \tsha256msg1 (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 cc 04 01       \tsha256msg1 (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 04 05 78 56 34 12 \tsha256msg1 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 cc 04 08       \tsha256msg1 (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 cc 04 c8       \tsha256msg1 (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 cc 40 12       \tsha256msg1 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 cc 45 12       \tsha256msg1 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 01 12    \tsha256msg1 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 05 12    \tsha256msg1 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 08 12    \tsha256msg1 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cc 44 c8 12    \tsha256msg1 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 80 78 56 34 12 \tsha256msg1 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cc 85 78 56 34 12 \tsha256msg1 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 01 78 56 34 12 \tsha256msg1 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 05 78 56 34 12 \tsha256msg1 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 08 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcc, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cc 84 c8 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xcc, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 cc bc c8 78 56 34 12 \tsha256msg1 0x12345678(%rax,%rcx,8),%xmm15",},
> +{{0x0f, 0x38, 0xcd, 0xc1, }, 4, 0, "", "",
> +"0f 38 cd c1          \tsha256msg2 %xmm1,%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0xd7, }, 4, 0, "", "",
> +"0f 38 cd d7          \tsha256msg2 %xmm7,%xmm2",},
> +{{0x41, 0x0f, 0x38, 0xcd, 0xc0, }, 5, 0, "", "",
> +"41 0f 38 cd c0       \tsha256msg2 %xmm8,%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xcd, 0xc7, }, 5, 0, "", "",
> +"44 0f 38 cd c7       \tsha256msg2 %xmm7,%xmm8",},
> +{{0x45, 0x0f, 0x38, 0xcd, 0xc7, }, 5, 0, "", "",
> +"45 0f 38 cd c7       \tsha256msg2 %xmm15,%xmm8",},
> +{{0x0f, 0x38, 0xcd, 0x00, }, 4, 0, "", "",
> +"0f 38 cd 00          \tsha256msg2 (%rax),%xmm0",},
> +{{0x41, 0x0f, 0x38, 0xcd, 0x00, }, 5, 0, "", "",
> +"41 0f 38 cd 00       \tsha256msg2 (%r8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 04 25 78 56 34 12 \tsha256msg2 0x12345678,%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x18, }, 4, 0, "", "",
> +"0f 38 cd 18          \tsha256msg2 (%rax),%xmm3",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x01, }, 5, 0, "", "",
> +"0f 38 cd 04 01       \tsha256msg2 (%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 04 05 78 56 34 12 \tsha256msg2 0x12345678(,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0x08, }, 5, 0, "", "",
> +"0f 38 cd 04 08       \tsha256msg2 (%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x04, 0xc8, }, 5, 0, "", "",
> +"0f 38 cd 04 c8       \tsha256msg2 (%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x40, 0x12, }, 5, 0, "", "",
> +"0f 38 cd 40 12       \tsha256msg2 0x12(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x45, 0x12, }, 5, 0, "", "",
> +"0f 38 cd 45 12       \tsha256msg2 0x12(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 01 12    \tsha256msg2 0x12(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 05 12    \tsha256msg2 0x12(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 08 12    \tsha256msg2 0x12(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"0f 38 cd 44 c8 12    \tsha256msg2 0x12(%rax,%rcx,8),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 80 78 56 34 12 \tsha256msg2 0x12345678(%rax),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 38 cd 85 78 56 34 12 \tsha256msg2 0x12345678(%rbp),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 01 78 56 34 12 \tsha256msg2 0x12345678(%rcx,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 05 78 56 34 12 \tsha256msg2 0x12345678(%rbp,%rax,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 08 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,1),%xmm0",},
> +{{0x0f, 0x38, 0xcd, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"0f 38 cd 84 c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm0",},
> +{{0x44, 0x0f, 0x38, 0xcd, 0xbc, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "",
> +"44 0f 38 cd bc c8 78 56 34 12 \tsha256msg2 0x12345678(%rax,%rcx,8),%xmm15",},
> diff --git a/tools/perf/tests/insn-x86-dat-src.c b/tools/perf/tests/insn-x86-dat-src.c
> index b506830f33a8..7d06c9b22070 100644
> --- a/tools/perf/tests/insn-x86-dat-src.c
> +++ b/tools/perf/tests/insn-x86-dat-src.c
> @@ -217,6 +217,210 @@ int main(void)
>  	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
>  	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0 */
> 
> +	/* sha1rnds4 imm8, xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1rnds4 $0x0, %xmm1, %xmm0");
> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm2");
> +	asm volatile("sha1rnds4 $0x91, %xmm8, %xmm0");
> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm8");
> +	asm volatile("sha1rnds4 $0x91, %xmm15, %xmm8");
> +	asm volatile("sha1rnds4 $0x91, (%rax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%r8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (0x12345678), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%rax), %xmm3");
> +	asm volatile("sha1rnds4 $0x91, (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rbp), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha1nexte xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1nexte %xmm1, %xmm0");
> +	asm volatile("sha1nexte %xmm7, %xmm2");
> +	asm volatile("sha1nexte %xmm8, %xmm0");
> +	asm volatile("sha1nexte %xmm7, %xmm8");
> +	asm volatile("sha1nexte %xmm15, %xmm8");
> +	asm volatile("sha1nexte (%rax), %xmm0");
> +	asm volatile("sha1nexte (%r8), %xmm0");
> +	asm volatile("sha1nexte (0x12345678), %xmm0");
> +	asm volatile("sha1nexte (%rax), %xmm3");
> +	asm volatile("sha1nexte (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha1nexte (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1nexte (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rax), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rbp), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rax), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha1msg1 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1msg1 %xmm1, %xmm0");
> +	asm volatile("sha1msg1 %xmm7, %xmm2");
> +	asm volatile("sha1msg1 %xmm8, %xmm0");
> +	asm volatile("sha1msg1 %xmm7, %xmm8");
> +	asm volatile("sha1msg1 %xmm15, %xmm8");
> +	asm volatile("sha1msg1 (%rax), %xmm0");
> +	asm volatile("sha1msg1 (%r8), %xmm0");
> +	asm volatile("sha1msg1 (0x12345678), %xmm0");
> +	asm volatile("sha1msg1 (%rax), %xmm3");
> +	asm volatile("sha1msg1 (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg1 (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rax), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rbp), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rax), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha1msg2 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1msg2 %xmm1, %xmm0");
> +	asm volatile("sha1msg2 %xmm7, %xmm2");
> +	asm volatile("sha1msg2 %xmm8, %xmm0");
> +	asm volatile("sha1msg2 %xmm7, %xmm8");
> +	asm volatile("sha1msg2 %xmm15, %xmm8");
> +	asm volatile("sha1msg2 (%rax), %xmm0");
> +	asm volatile("sha1msg2 (%r8), %xmm0");
> +	asm volatile("sha1msg2 (0x12345678), %xmm0");
> +	asm volatile("sha1msg2 (%rax), %xmm3");
> +	asm volatile("sha1msg2 (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg2 (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rax), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rbp), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rax), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha256rnds2 <XMM0>, xmm2/m128, xmm1 */
> +	/* Note sha256rnds2 has an implicit operand 'xmm0' */
> +
> +	asm volatile("sha256rnds2 %xmm4, %xmm1");
> +	asm volatile("sha256rnds2 %xmm7, %xmm2");
> +	asm volatile("sha256rnds2 %xmm8, %xmm1");
> +	asm volatile("sha256rnds2 %xmm7, %xmm8");
> +	asm volatile("sha256rnds2 %xmm15, %xmm8");
> +	asm volatile("sha256rnds2 (%rax), %xmm1");
> +	asm volatile("sha256rnds2 (%r8), %xmm1");
> +	asm volatile("sha256rnds2 (0x12345678), %xmm1");
> +	asm volatile("sha256rnds2 (%rax), %xmm3");
> +	asm volatile("sha256rnds2 (%rcx,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 (%rax,%rcx,1), %xmm1");
> +	asm volatile("sha256rnds2 (%rax,%rcx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rax), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rbp), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rcx,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rbp,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rax,%rcx,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%rax,%rcx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rax), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rbp), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rcx,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rbp,%rax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha256msg1 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha256msg1 %xmm1, %xmm0");
> +	asm volatile("sha256msg1 %xmm7, %xmm2");
> +	asm volatile("sha256msg1 %xmm8, %xmm0");
> +	asm volatile("sha256msg1 %xmm7, %xmm8");
> +	asm volatile("sha256msg1 %xmm15, %xmm8");
> +	asm volatile("sha256msg1 (%rax), %xmm0");
> +	asm volatile("sha256msg1 (%r8), %xmm0");
> +	asm volatile("sha256msg1 (0x12345678), %xmm0");
> +	asm volatile("sha256msg1 (%rax), %xmm3");
> +	asm volatile("sha256msg1 (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg1 (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rax), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rbp), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rax), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%rax,%rcx,8), %xmm15");
> +
> +	/* sha256msg2 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha256msg2 %xmm1, %xmm0");
> +	asm volatile("sha256msg2 %xmm7, %xmm2");
> +	asm volatile("sha256msg2 %xmm8, %xmm0");
> +	asm volatile("sha256msg2 %xmm7, %xmm8");
> +	asm volatile("sha256msg2 %xmm15, %xmm8");
> +	asm volatile("sha256msg2 (%rax), %xmm0");
> +	asm volatile("sha256msg2 (%r8), %xmm0");
> +	asm volatile("sha256msg2 (0x12345678), %xmm0");
> +	asm volatile("sha256msg2 (%rax), %xmm3");
> +	asm volatile("sha256msg2 (%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 (%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg2 (%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rax), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rbp), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rax), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rbp), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rcx,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rbp,%rax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%rax,%rcx,8), %xmm15");
> +
>  #else  /* #ifdef __x86_64__ */
> 
>  	/* bndmk m32, bnd */
> @@ -407,6 +611,175 @@ int main(void)
>  	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
>  	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0xfffffffc */
> 
> +	/* sha1rnds4 imm8, xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1rnds4 $0x0, %xmm1, %xmm0");
> +	asm volatile("sha1rnds4 $0x91, %xmm7, %xmm2");
> +	asm volatile("sha1rnds4 $0x91, (%eax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (0x12345678), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%eax), %xmm3");
> +	asm volatile("sha1rnds4 $0x91, (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%ebp), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1rnds4 $0x91, 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha1nexte xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1nexte %xmm1, %xmm0");
> +	asm volatile("sha1nexte %xmm7, %xmm2");
> +	asm volatile("sha1nexte (%eax), %xmm0");
> +	asm volatile("sha1nexte (0x12345678), %xmm0");
> +	asm volatile("sha1nexte (%eax), %xmm3");
> +	asm volatile("sha1nexte (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha1nexte (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1nexte (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12(%eax), %xmm0");
> +	asm volatile("sha1nexte 0x12(%ebp), %xmm0");
> +	asm volatile("sha1nexte 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1nexte 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%eax), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1nexte 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha1msg1 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1msg1 %xmm1, %xmm0");
> +	asm volatile("sha1msg1 %xmm7, %xmm2");
> +	asm volatile("sha1msg1 (%eax), %xmm0");
> +	asm volatile("sha1msg1 (0x12345678), %xmm0");
> +	asm volatile("sha1msg1 (%eax), %xmm3");
> +	asm volatile("sha1msg1 (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg1 (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12(%eax), %xmm0");
> +	asm volatile("sha1msg1 0x12(%ebp), %xmm0");
> +	asm volatile("sha1msg1 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg1 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%eax), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg1 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha1msg2 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha1msg2 %xmm1, %xmm0");
> +	asm volatile("sha1msg2 %xmm7, %xmm2");
> +	asm volatile("sha1msg2 (%eax), %xmm0");
> +	asm volatile("sha1msg2 (0x12345678), %xmm0");
> +	asm volatile("sha1msg2 (%eax), %xmm3");
> +	asm volatile("sha1msg2 (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg2 (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12(%eax), %xmm0");
> +	asm volatile("sha1msg2 0x12(%ebp), %xmm0");
> +	asm volatile("sha1msg2 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg2 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%eax), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha1msg2 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha256rnds2 <XMM0>, xmm2/m128, xmm1 */
> +	/* Note sha256rnds2 has an implicit operand 'xmm0' */
> +
> +	asm volatile("sha256rnds2 %xmm4, %xmm1");
> +	asm volatile("sha256rnds2 %xmm7, %xmm2");
> +	asm volatile("sha256rnds2 (%eax), %xmm1");
> +	asm volatile("sha256rnds2 (0x12345678), %xmm1");
> +	asm volatile("sha256rnds2 (%eax), %xmm3");
> +	asm volatile("sha256rnds2 (%ecx,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 (%eax,%ecx,1), %xmm1");
> +	asm volatile("sha256rnds2 (%eax,%ecx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%eax), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%ebp), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%ecx,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%ebp,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%eax,%ecx,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12(%eax,%ecx,8), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%eax), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%ebp), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%ecx,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%ebp,%eax,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%eax,%ecx,1), %xmm1");
> +	asm volatile("sha256rnds2 0x12345678(%eax,%ecx,8), %xmm1");
> +
> +	/* sha256msg1 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha256msg1 %xmm1, %xmm0");
> +	asm volatile("sha256msg1 %xmm7, %xmm2");
> +	asm volatile("sha256msg1 (%eax), %xmm0");
> +	asm volatile("sha256msg1 (0x12345678), %xmm0");
> +	asm volatile("sha256msg1 (%eax), %xmm3");
> +	asm volatile("sha256msg1 (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg1 (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12(%eax), %xmm0");
> +	asm volatile("sha256msg1 0x12(%ebp), %xmm0");
> +	asm volatile("sha256msg1 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg1 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%eax), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg1 0x12345678(%eax,%ecx,8), %xmm0");
> +
> +	/* sha256msg2 xmm2/m128, xmm1 */
> +
> +	asm volatile("sha256msg2 %xmm1, %xmm0");
> +	asm volatile("sha256msg2 %xmm7, %xmm2");
> +	asm volatile("sha256msg2 (%eax), %xmm0");
> +	asm volatile("sha256msg2 (0x12345678), %xmm0");
> +	asm volatile("sha256msg2 (%eax), %xmm3");
> +	asm volatile("sha256msg2 (%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 (%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg2 (%eax,%ecx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12(%eax), %xmm0");
> +	asm volatile("sha256msg2 0x12(%ebp), %xmm0");
> +	asm volatile("sha256msg2 0x12(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg2 0x12(%eax,%ecx,8), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%eax), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%ebp), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%ecx,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%ebp,%eax,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%eax,%ecx,1), %xmm0");
> +	asm volatile("sha256msg2 0x12345678(%eax,%ecx,8), %xmm0");
> +
>  #endif /* #ifndef __x86_64__ */
> 
>  	/* Following line is a marker for the awk script - do not change */
> diff --git a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> index a02a195d219c..25dad388b371 100644
> --- a/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> +++ b/tools/perf/util/intel-pt-decoder/x86-opcode-map.txt
> @@ -736,6 +736,12 @@ bd: vfnmadd231ss/d Vx,Hx,Wx (66),(v),(v1)
>  be: vfnmsub231ps/d Vx,Hx,Wx (66),(v)
>  bf: vfnmsub231ss/d Vx,Hx,Wx (66),(v),(v1)
>  # 0x0f 0x38 0xc0-0xff
> +c8: sha1nexte Vdq,Wdq
> +c9: sha1msg1 Vdq,Wdq
> +ca: sha1msg2 Vdq,Wdq
> +cb: sha256rnds2 Vdq,Wdq
> +cc: sha256msg1 Vdq,Wdq
> +cd: sha256msg2 Vdq,Wdq
>  db: VAESIMC Vdq,Wdq (66),(v1)
>  dc: VAESENC Vdq,Hdq,Wdq (66),(v1)
>  dd: VAESENCLAST Vdq,Hdq,Wdq (66),(v1)
> @@ -794,6 +800,7 @@ AVXcode: 3
>  61: vpcmpestri Vdq,Wdq,Ib (66),(v1)
>  62: vpcmpistrm Vdq,Wdq,Ib (66),(v1)
>  63: vpcmpistri Vdq,Wdq,Ib (66),(v1)
> +cc: sha1rnds4 Vdq,Wdq,Ib
>  df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1)
>  f0: RORX Gy,Ey,Ib (F2),(v)
>  EndTable
> --
> 1.9.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 1/4] perf tools: Add a test for decoding of new x86 instructions
  2015-08-31 13:58 ` [PATCH 1/4] perf tools: Add a test for decoding of " Adrian Hunter
@ 2015-09-01  0:18   ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-01  8:17     ` Adrian Hunter
  0 siblings, 1 reply; 27+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-01  0:18 UTC (permalink / raw)
  To: 'Adrian Hunter', Arnaldo Carvalho de Melo
  Cc: linux-kernel@vger.kernel.org, Jiri Olsa, Andy Lutomirski,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner


> From: Adrian Hunter [mailto:adrian.hunter@intel.com]
> 
> Add a new test titled:
> 
> 	Test x86 instruction decoder - new instructions
> 
> The purpose of this test is to check the instruction decoder
> after new instructions have been added.  Initially, MPX
> instructions are tested which are already supported, but the
> definitions in x86-opcode-map.txt will be tweaked in a
> subsequent patch, after which this test can be run to verify
> those changes.

Hmm, btw, why should this test be in perf? It seems that we need
this test in kselftest or as a build-time selftest.
I would prefer to put this in arch/x86/tools/ or lib/. What do you
think?

Thanks,

> 
> The data for the test comes from assembly language instructions
> in insn-x86-dat-src.c which is converted into bytes by the scripts
> gen-insn-x86-dat.sh and gen-insn-x86-dat.awk, and included
> into the test program insn-x86.c as insn-x86-dat-32.c and
> insn-x86-dat-64.c.  The conversion is not done as part of the
> perf tools build because the test data must be under (git)
> change control in order for the test to be repeatably-correct.
> Also it may require a recent version of binutils.
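
For reference, regenerating the data boils down to something like this
(a rough sketch only, assuming an x86_64 host with a new-enough binutils,
as the script itself requires):

	cd tools/perf/tests
	# builds 64-bit and 32-bit objects from insn-x86-dat-src.c and pipes
	# the objdump output through gen-insn-x86-dat.awk
	./gen-insn-x86-dat.sh
	# review the regenerated insn-x86-dat-32.c / insn-x86-dat-64.c
	git diff
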
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/perf/tests/Build                |   3 +
>  tools/perf/tests/builtin-test.c       |   8 +
>  tools/perf/tests/gen-insn-x86-dat.awk |  75 ++++++
>  tools/perf/tests/gen-insn-x86-dat.sh  |  43 ++++
>  tools/perf/tests/insn-x86-dat-32.c    | 324 ++++++++++++++++++++++++++
>  tools/perf/tests/insn-x86-dat-64.c    | 340 +++++++++++++++++++++++++++
>  tools/perf/tests/insn-x86-dat-src.c   | 416 ++++++++++++++++++++++++++++++++++
>  tools/perf/tests/insn-x86.c           | 180 +++++++++++++++
>  tools/perf/tests/tests.h              |   1 +
>  9 files changed, 1390 insertions(+)
>  create mode 100644 tools/perf/tests/gen-insn-x86-dat.awk
>  create mode 100755 tools/perf/tests/gen-insn-x86-dat.sh
>  create mode 100644 tools/perf/tests/insn-x86-dat-32.c
>  create mode 100644 tools/perf/tests/insn-x86-dat-64.c
>  create mode 100644 tools/perf/tests/insn-x86-dat-src.c
>  create mode 100644 tools/perf/tests/insn-x86.c
> 
> diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
> index c1518bdd0f1b..51fb737f82fc 100644
> --- a/tools/perf/tests/Build
> +++ b/tools/perf/tests/Build
> @@ -35,6 +35,9 @@ perf-y += thread-map.o
>  perf-y += llvm.o
> 
>  perf-$(CONFIG_X86) += perf-time-to-tsc.o
> +ifdef CONFIG_AUXTRACE
> +perf-$(CONFIG_X86) += insn-x86.o
> +endif
> 
>  ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
>  perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> index 136cd934be66..69a77f71d594 100644
> --- a/tools/perf/tests/builtin-test.c
> +++ b/tools/perf/tests/builtin-test.c
> @@ -178,6 +178,14 @@ static struct test {
>  		.desc = "Test LLVM searching and compiling",
>  		.func = test__llvm,
>  	},
> +#ifdef HAVE_AUXTRACE_SUPPORT
> +#if defined(__x86_64__) || defined(__i386__)
> +	{
> +		.desc = "Test x86 instruction decoder - new instructions",
> +		.func = test__insn_x86,
> +	},
> +#endif
> +#endif
>  	{
>  		.func = NULL,
>  	},
> diff --git a/tools/perf/tests/gen-insn-x86-dat.awk b/tools/perf/tests/gen-insn-x86-dat.awk
> new file mode 100644
> index 000000000000..a21454835cd4
> --- /dev/null
> +++ b/tools/perf/tests/gen-insn-x86-dat.awk
> @@ -0,0 +1,75 @@
> +#!/bin/awk -f
> +# gen-insn-x86-dat.awk: script to convert data for the insn-x86 test
> +# Copyright (c) 2015, Intel Corporation.
> +#
> +# This program is free software; you can redistribute it and/or modify it
> +# under the terms and conditions of the GNU General Public License,
> +# version 2, as published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope it will be useful, but WITHOUT
> +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +# more details.
> +
> +BEGIN {
> +	print "/*"
> +	print " * Generated by gen-insn-x86-dat.sh and gen-insn-x86-dat.awk"
> +	print " * from insn-x86-dat-src.c for inclusion by insn-x86.c"
> +	print " * Do not change this code."
> +	print "*/\n"
> +	op = ""
> +	branch = ""
> +	rel = 0
> +	going = 0
> +}
> +
> +/ Start here / {
> +	going = 1
> +}
> +
> +/ Stop here / {
> +	going = 0
> +}
> +
> +/^\s*[0-9a-fA-F]+\:/ {
> +	if (going) {
> +		colon_pos = index($0, ":")
> +		useful_line = substr($0, colon_pos + 1)
> +		first_pos = match(useful_line, "[0-9a-fA-F]")
> +		useful_line = substr(useful_line, first_pos)
> +		gsub("\t", "\\t", useful_line)
> +		printf "{{"
> +		len = 0
> +		for (i = 2; i <= NF; i++) {
> +			if (match($i, "^[0-9a-fA-F][0-9a-fA-F]$")) {
> +				printf "0x%s, ", $i
> +				len += 1
> +			} else {
> +				break
> +			}
> +		}
> +		printf "}, %d, %s, \"%s\", \"%s\",", len, rel, op, branch
> +		printf "\n\"%s\",},\n", useful_line
> +		op = ""
> +		branch = ""
> +		rel = 0
> +	}
> +}
> +
> +/ Expecting: / {
> +	expecting_str = " Expecting: "
> +	expecting_len = length(expecting_str)
> +	expecting_pos = index($0, expecting_str)
> +	useful_line = substr($0, expecting_pos + expecting_len)
> +	for (i = 1; i <= NF; i++) {
> +		if ($i == "Expecting:") {
> +			i++
> +			op = $i
> +			i++
> +			branch = $i
> +			i++
> +			rel = $i
> +			break
> +		}
> +	}
> +}
> diff --git a/tools/perf/tests/gen-insn-x86-dat.sh b/tools/perf/tests/gen-insn-x86-dat.sh
> new file mode 100755
> index 000000000000..2d4ef94cff98
> --- /dev/null
> +++ b/tools/perf/tests/gen-insn-x86-dat.sh
> @@ -0,0 +1,43 @@
> +#!/bin/sh
> +# gen-insn-x86-dat: generate data for the insn-x86 test
> +# Copyright (c) 2015, Intel Corporation.
> +#
> +# This program is free software; you can redistribute it and/or modify it
> +# under the terms and conditions of the GNU General Public License,
> +# version 2, as published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope it will be useful, but WITHOUT
> +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +# more details.
> +
> +set -e
> +
> +if [ "$(uname -m)" != "x86_64" ]; then
> +	echo "ERROR: This script only works on x86_64"
> +	exit 1
> +fi
> +
> +cd $(dirname $0)
> +
> +trap 'echo "Might need a more recent version of binutils"' EXIT
> +
> +echo "Compiling insn-x86-dat-src.c to 64-bit object"
> +
> +gcc -g -c insn-x86-dat-src.c
> +
> +objdump -dSw insn-x86-dat-src.o | awk -f gen-insn-x86-dat.awk > insn-x86-dat-64.c
> +
> +rm -f insn-x86-dat-src.o
> +
> +echo "Compiling insn-x86-dat-src.c to 32-bit object"
> +
> +gcc -g -c -m32 insn-x86-dat-src.c
> +
> +objdump -dSw insn-x86-dat-src.o | awk -f gen-insn-x86-dat.awk > insn-x86-dat-32.c
> +
> +rm -f insn-x86-dat-src.o
> +
> +trap - EXIT
> +
> +echo "Done (use git diff to see the changes)"
> diff --git a/tools/perf/tests/insn-x86-dat-32.c b/tools/perf/tests/insn-x86-dat-32.c
> new file mode 100644
> index 000000000000..6a38a34a5a49
> --- /dev/null
> +++ b/tools/perf/tests/insn-x86-dat-32.c
> @@ -0,0 +1,324 @@
> +/*
> + * Generated by gen-insn-x86-dat.sh and gen-insn-x86-dat.awk
> + * from insn-x86-dat-src.c for inclusion by insn-x86.c
> + * Do not change this code.
> +*/
> +
> +{{0x0f, 0x31, }, 2, 0, "", "",
> +"0f 31                \trdtsc  ",},
> +{{0xf3, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
> +"f3 0f 1b 00          \tbndmk  (%eax),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1b 05 78 56 34 12 \tbndmk  0x12345678,%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
> +"f3 0f 1b 18          \tbndmk  (%eax),%bnd3",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
> +"f3 0f 1b 04 01       \tbndmk  (%ecx,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 04 05 78 56 34 12 \tbndmk  0x12345678(,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
> +"f3 0f 1b 04 08       \tbndmk  (%eax,%ecx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
> +"f3 0f 1b 04 c8       \tbndmk  (%eax,%ecx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
> +"f3 0f 1b 40 12       \tbndmk  0x12(%eax),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
> +"f3 0f 1b 45 12       \tbndmk  0x12(%ebp),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"f3 0f 1b 44 01 12    \tbndmk  0x12(%ecx,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"f3 0f 1b 44 05 12    \tbndmk  0x12(%ebp,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"f3 0f 1b 44 08 12    \tbndmk  0x12(%eax,%ecx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"f3 0f 1b 44 c8 12    \tbndmk  0x12(%eax,%ecx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1b 80 78 56 34 12 \tbndmk  0x12345678(%eax),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1b 85 78 56 34 12 \tbndmk  0x12345678(%ebp),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 84 01 78 56 34 12 \tbndmk  0x12345678(%ecx,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 84 05 78 56 34 12 \tbndmk  0x12345678(%ebp,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 84 08 78 56 34 12 \tbndmk  0x12345678(%eax,%ecx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 84 c8 78 56 34 12 \tbndmk  0x12345678(%eax,%ecx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
> +"f3 0f 1a 00          \tbndcl  (%eax),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1a 05 78 56 34 12 \tbndcl  0x12345678,%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
> +"f3 0f 1a 18          \tbndcl  (%eax),%bnd3",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
> +"f3 0f 1a 04 01       \tbndcl  (%ecx,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 04 05 78 56 34 12 \tbndcl  0x12345678(,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
> +"f3 0f 1a 04 08       \tbndcl  (%eax,%ecx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
> +"f3 0f 1a 04 c8       \tbndcl  (%eax,%ecx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
> +"f3 0f 1a 40 12       \tbndcl  0x12(%eax),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
> +"f3 0f 1a 45 12       \tbndcl  0x12(%ebp),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"f3 0f 1a 44 01 12    \tbndcl  0x12(%ecx,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"f3 0f 1a 44 05 12    \tbndcl  0x12(%ebp,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"f3 0f 1a 44 08 12    \tbndcl  0x12(%eax,%ecx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"f3 0f 1a 44 c8 12    \tbndcl  0x12(%eax,%ecx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1a 80 78 56 34 12 \tbndcl  0x12345678(%eax),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1a 85 78 56 34 12 \tbndcl  0x12345678(%ebp),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 84 01 78 56 34 12 \tbndcl  0x12345678(%ecx,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 84 05 78 56 34 12 \tbndcl  0x12345678(%ebp,%eax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 84 08 78 56 34 12 \tbndcl  0x12345678(%eax,%ecx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 84 c8 78 56 34 12 \tbndcl  0x12345678(%eax,%ecx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0xc0, }, 4, 0, "", "",
> +"f3 0f 1a c0          \tbndcl  %eax,%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
> +"f2 0f 1a 00          \tbndcu  (%eax),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1a 05 78 56 34 12 \tbndcu  0x12345678,%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
> +"f2 0f 1a 18          \tbndcu  (%eax),%bnd3",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
> +"f2 0f 1a 04 01       \tbndcu  (%ecx,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 04 05 78 56 34 12 \tbndcu  0x12345678(,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
> +"f2 0f 1a 04 08       \tbndcu  (%eax,%ecx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
> +"f2 0f 1a 04 c8       \tbndcu  (%eax,%ecx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
> +"f2 0f 1a 40 12       \tbndcu  0x12(%eax),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
> +"f2 0f 1a 45 12       \tbndcu  0x12(%ebp),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"f2 0f 1a 44 01 12    \tbndcu  0x12(%ecx,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"f2 0f 1a 44 05 12    \tbndcu  0x12(%ebp,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"f2 0f 1a 44 08 12    \tbndcu  0x12(%eax,%ecx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"f2 0f 1a 44 c8 12    \tbndcu  0x12(%eax,%ecx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1a 80 78 56 34 12 \tbndcu  0x12345678(%eax),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1a 85 78 56 34 12 \tbndcu  0x12345678(%ebp),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 84 01 78 56 34 12 \tbndcu  0x12345678(%ecx,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 84 05 78 56 34 12 \tbndcu  0x12345678(%ebp,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 84 08 78 56 34 12 \tbndcu  0x12345678(%eax,%ecx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 84 c8 78 56 34 12 \tbndcu  0x12345678(%eax,%ecx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0xc0, }, 4, 0, "", "",
> +"f2 0f 1a c0          \tbndcu  %eax,%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
> +"f2 0f 1b 00          \tbndcn  (%eax),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1b 05 78 56 34 12 \tbndcn  0x12345678,%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
> +"f2 0f 1b 18          \tbndcn  (%eax),%bnd3",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
> +"f2 0f 1b 04 01       \tbndcn  (%ecx,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 04 05 78 56 34 12 \tbndcn  0x12345678(,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
> +"f2 0f 1b 04 08       \tbndcn  (%eax,%ecx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
> +"f2 0f 1b 04 c8       \tbndcn  (%eax,%ecx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
> +"f2 0f 1b 40 12       \tbndcn  0x12(%eax),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
> +"f2 0f 1b 45 12       \tbndcn  0x12(%ebp),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"f2 0f 1b 44 01 12    \tbndcn  0x12(%ecx,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"f2 0f 1b 44 05 12    \tbndcn  0x12(%ebp,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"f2 0f 1b 44 08 12    \tbndcn  0x12(%eax,%ecx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"f2 0f 1b 44 c8 12    \tbndcn  0x12(%eax,%ecx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1b 80 78 56 34 12 \tbndcn  0x12345678(%eax),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1b 85 78 56 34 12 \tbndcn  0x12345678(%ebp),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 84 01 78 56 34 12 \tbndcn  0x12345678(%ecx,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 84 05 78 56 34 12 \tbndcn  0x12345678(%ebp,%eax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 84 08 78 56 34 12 \tbndcn  0x12345678(%eax,%ecx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 84 c8 78 56 34 12 \tbndcn  0x12345678(%eax,%ecx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0xc0, }, 4, 0, "", "",
> +"f2 0f 1b c0          \tbndcn  %eax,%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
> +"66 0f 1a 00          \tbndmov (%eax),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1a 05 78 56 34 12 \tbndmov 0x12345678,%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
> +"66 0f 1a 18          \tbndmov (%eax),%bnd3",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
> +"66 0f 1a 04 01       \tbndmov (%ecx,%eax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 04 05 78 56 34 12 \tbndmov 0x12345678(,%eax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
> +"66 0f 1a 04 08       \tbndmov (%eax,%ecx,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
> +"66 0f 1a 04 c8       \tbndmov (%eax,%ecx,8),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
> +"66 0f 1a 40 12       \tbndmov 0x12(%eax),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
> +"66 0f 1a 45 12       \tbndmov 0x12(%ebp),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"66 0f 1a 44 01 12    \tbndmov 0x12(%ecx,%eax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"66 0f 1a 44 05 12    \tbndmov 0x12(%ebp,%eax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"66 0f 1a 44 08 12    \tbndmov 0x12(%eax,%ecx,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"66 0f 1a 44 c8 12    \tbndmov 0x12(%eax,%ecx,8),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1a 80 78 56 34 12 \tbndmov 0x12345678(%eax),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1a 85 78 56 34 12 \tbndmov 0x12345678(%ebp),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 84 01 78 56 34 12 \tbndmov 0x12345678(%ecx,%eax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 84 05 78 56 34 12 \tbndmov 0x12345678(%ebp,%eax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 84 08 78 56 34 12 \tbndmov 0x12345678(%eax,%ecx,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 84 c8 78 56 34 12 \tbndmov 0x12345678(%eax,%ecx,8),%bnd0",},
> +{{0x66, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
> +"66 0f 1b 00          \tbndmov %bnd0,(%eax)",},
> +{{0x66, 0x0f, 0x1b, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1b 05 78 56 34 12 \tbndmov %bnd0,0x12345678",},
> +{{0x66, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
> +"66 0f 1b 18          \tbndmov %bnd3,(%eax)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
> +"66 0f 1b 04 01       \tbndmov %bnd0,(%ecx,%eax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 04 05 78 56 34 12 \tbndmov %bnd0,0x12345678(,%eax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
> +"66 0f 1b 04 08       \tbndmov %bnd0,(%eax,%ecx,1)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
> +"66 0f 1b 04 c8       \tbndmov %bnd0,(%eax,%ecx,8)",},
> +{{0x66, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
> +"66 0f 1b 40 12       \tbndmov %bnd0,0x12(%eax)",},
> +{{0x66, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
> +"66 0f 1b 45 12       \tbndmov %bnd0,0x12(%ebp)",},
> +{{0x66, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"66 0f 1b 44 01 12    \tbndmov %bnd0,0x12(%ecx,%eax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"66 0f 1b 44 05 12    \tbndmov %bnd0,0x12(%ebp,%eax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"66 0f 1b 44 08 12    \tbndmov %bnd0,0x12(%eax,%ecx,1)",},
> +{{0x66, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"66 0f 1b 44 c8 12    \tbndmov %bnd0,0x12(%eax,%ecx,8)",},
> +{{0x66, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1b 80 78 56 34 12 \tbndmov %bnd0,0x12345678(%eax)",},
> +{{0x66, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1b 85 78 56 34 12 \tbndmov %bnd0,0x12345678(%ebp)",},
> +{{0x66, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 84 01 78 56 34 12 \tbndmov %bnd0,0x12345678(%ecx,%eax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 84 05 78 56 34 12 \tbndmov %bnd0,0x12345678(%ebp,%eax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 84 08 78 56 34 12 \tbndmov %bnd0,0x12345678(%eax,%ecx,1)",},
> +{{0x66, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 84 c8 78 56 34 12 \tbndmov %bnd0,0x12345678(%eax,%ecx,8)",},
> +{{0x66, 0x0f, 0x1a, 0xc8, }, 4, 0, "", "",
> +"66 0f 1a c8          \tbndmov %bnd0,%bnd1",},
> +{{0x66, 0x0f, 0x1a, 0xc1, }, 4, 0, "", "",
> +"66 0f 1a c1          \tbndmov %bnd1,%bnd0",},
> +{{0x0f, 0x1a, 0x00, }, 3, 0, "", "",
> +"0f 1a 00             \tbndldx (%eax),%bnd0",},
> +{{0x0f, 0x1a, 0x05, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1a 05 78 56 34 12 \tbndldx 0x12345678,%bnd0",},
> +{{0x0f, 0x1a, 0x18, }, 3, 0, "", "",
> +"0f 1a 18             \tbndldx (%eax),%bnd3",},
> +{{0x0f, 0x1a, 0x04, 0x01, }, 4, 0, "", "",
> +"0f 1a 04 01          \tbndldx (%ecx,%eax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 04 05 78 56 34 12 \tbndldx 0x12345678(,%eax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x04, 0x08, }, 4, 0, "", "",
> +"0f 1a 04 08          \tbndldx (%eax,%ecx,1),%bnd0",},
> +{{0x0f, 0x1a, 0x40, 0x12, }, 4, 0, "", "",
> +"0f 1a 40 12          \tbndldx 0x12(%eax),%bnd0",},
> +{{0x0f, 0x1a, 0x45, 0x12, }, 4, 0, "", "",
> +"0f 1a 45 12          \tbndldx 0x12(%ebp),%bnd0",},
> +{{0x0f, 0x1a, 0x44, 0x01, 0x12, }, 5, 0, "", "",
> +"0f 1a 44 01 12       \tbndldx 0x12(%ecx,%eax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x44, 0x05, 0x12, }, 5, 0, "", "",
> +"0f 1a 44 05 12       \tbndldx 0x12(%ebp,%eax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x44, 0x08, 0x12, }, 5, 0, "", "",
> +"0f 1a 44 08 12       \tbndldx 0x12(%eax,%ecx,1),%bnd0",},
> +{{0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1a 80 78 56 34 12 \tbndldx 0x12345678(%eax),%bnd0",},
> +{{0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1a 85 78 56 34 12 \tbndldx 0x12345678(%ebp),%bnd0",},
> +{{0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 84 01 78 56 34 12 \tbndldx 0x12345678(%ecx,%eax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 84 05 78 56 34 12 \tbndldx 0x12345678(%ebp,%eax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 84 08 78 56 34 12 \tbndldx 0x12345678(%eax,%ecx,1),%bnd0",},
> +{{0x0f, 0x1b, 0x00, }, 3, 0, "", "",
> +"0f 1b 00             \tbndstx %bnd0,(%eax)",},
> +{{0x0f, 0x1b, 0x05, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1b 05 78 56 34 12 \tbndstx %bnd0,0x12345678",},
> +{{0x0f, 0x1b, 0x18, }, 3, 0, "", "",
> +"0f 1b 18             \tbndstx %bnd3,(%eax)",},
> +{{0x0f, 0x1b, 0x04, 0x01, }, 4, 0, "", "",
> +"0f 1b 04 01          \tbndstx %bnd0,(%ecx,%eax,1)",},
> +{{0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 04 05 78 56 34 12 \tbndstx %bnd0,0x12345678(,%eax,1)",},
> +{{0x0f, 0x1b, 0x04, 0x08, }, 4, 0, "", "",
> +"0f 1b 04 08          \tbndstx %bnd0,(%eax,%ecx,1)",},
> +{{0x0f, 0x1b, 0x40, 0x12, }, 4, 0, "", "",
> +"0f 1b 40 12          \tbndstx %bnd0,0x12(%eax)",},
> +{{0x0f, 0x1b, 0x45, 0x12, }, 4, 0, "", "",
> +"0f 1b 45 12          \tbndstx %bnd0,0x12(%ebp)",},
> +{{0x0f, 0x1b, 0x44, 0x01, 0x12, }, 5, 0, "", "",
> +"0f 1b 44 01 12       \tbndstx %bnd0,0x12(%ecx,%eax,1)",},
> +{{0x0f, 0x1b, 0x44, 0x05, 0x12, }, 5, 0, "", "",
> +"0f 1b 44 05 12       \tbndstx %bnd0,0x12(%ebp,%eax,1)",},
> +{{0x0f, 0x1b, 0x44, 0x08, 0x12, }, 5, 0, "", "",
> +"0f 1b 44 08 12       \tbndstx %bnd0,0x12(%eax,%ecx,1)",},
> +{{0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1b 80 78 56 34 12 \tbndstx %bnd0,0x12345678(%eax)",},
> +{{0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1b 85 78 56 34 12 \tbndstx %bnd0,0x12345678(%ebp)",},
> +{{0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 84 01 78 56 34 12 \tbndstx %bnd0,0x12345678(%ecx,%eax,1)",},
> +{{0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 84 05 78 56 34 12 \tbndstx %bnd0,0x12345678(%ebp,%eax,1)",},
> +{{0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 84 08 78 56 34 12 \tbndstx %bnd0,0x12345678(%eax,%ecx,1)",},
> +{{0xf2, 0xe8, 0xfc, 0xff, 0xff, 0xff, }, 6, 0xfffffffc, "call", "unconditional",
> +"f2 e8 fc ff ff ff    \tbnd call 3c3 <main+0x3c3>",},
> +{{0xf2, 0xff, 0x10, }, 3, 0, "call", "indirect",
> +"f2 ff 10             \tbnd call *(%eax)",},
> +{{0xf2, 0xc3, }, 2, 0, "ret", "indirect",
> +"f2 c3                \tbnd ret ",},
> +{{0xf2, 0xe9, 0xfc, 0xff, 0xff, 0xff, }, 6, 0xfffffffc, "jmp", "unconditional",
> +"f2 e9 fc ff ff ff    \tbnd jmp 3ce <main+0x3ce>",},
> +{{0xf2, 0xe9, 0xfc, 0xff, 0xff, 0xff, }, 6, 0xfffffffc, "jmp", "unconditional",
> +"f2 e9 fc ff ff ff    \tbnd jmp 3d4 <main+0x3d4>",},
> +{{0xf2, 0xff, 0x21, }, 3, 0, "jmp", "indirect",
> +"f2 ff 21             \tbnd jmp *(%ecx)",},
> +{{0xf2, 0x0f, 0x85, 0xfc, 0xff, 0xff, 0xff, }, 7, 0xfffffffc, "jcc", "conditional",
> +"f2 0f 85 fc ff ff ff \tbnd jne 3de <main+0x3de>",},
> diff --git a/tools/perf/tests/insn-x86-dat-64.c b/tools/perf/tests/insn-x86-dat-64.c
> new file mode 100644
> index 000000000000..01122421a776
> --- /dev/null
> +++ b/tools/perf/tests/insn-x86-dat-64.c
> @@ -0,0 +1,340 @@
> +/*
> + * Generated by gen-insn-x86-dat.sh and gen-insn-x86-dat.awk
> + * from insn-x86-dat-src.c for inclusion by insn-x86.c
> + * Do not change this code.
> +*/
> +
> +{{0x0f, 0x31, }, 2, 0, "", "",
> +"0f 31                \trdtsc  ",},
> +{{0xf3, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
> +"f3 0f 1b 00          \tbndmk  (%rax),%bnd0",},
> +{{0xf3, 0x41, 0x0f, 0x1b, 0x00, }, 5, 0, "", "",
> +"f3 41 0f 1b 00       \tbndmk  (%r8),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 04 25 78 56 34 12 \tbndmk  0x12345678,%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
> +"f3 0f 1b 18          \tbndmk  (%rax),%bnd3",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
> +"f3 0f 1b 04 01       \tbndmk  (%rcx,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 04 05 78 56 34 12 \tbndmk  0x12345678(,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
> +"f3 0f 1b 04 08       \tbndmk  (%rax,%rcx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
> +"f3 0f 1b 04 c8       \tbndmk  (%rax,%rcx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
> +"f3 0f 1b 40 12       \tbndmk  0x12(%rax),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
> +"f3 0f 1b 45 12       \tbndmk  0x12(%rbp),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"f3 0f 1b 44 01 12    \tbndmk  0x12(%rcx,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"f3 0f 1b 44 05 12    \tbndmk  0x12(%rbp,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"f3 0f 1b 44 08 12    \tbndmk  0x12(%rax,%rcx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"f3 0f 1b 44 c8 12    \tbndmk  0x12(%rax,%rcx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1b 80 78 56 34 12 \tbndmk  0x12345678(%rax),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1b 85 78 56 34 12 \tbndmk  0x12345678(%rbp),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 84 01 78 56 34 12 \tbndmk  0x12345678(%rcx,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 84 05 78 56 34 12 \tbndmk  0x12345678(%rbp,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 84 08 78 56 34 12 \tbndmk  0x12345678(%rax,%rcx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1b 84 c8 78 56 34 12 \tbndmk  0x12345678(%rax,%rcx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
> +"f3 0f 1a 00          \tbndcl  (%rax),%bnd0",},
> +{{0xf3, 0x41, 0x0f, 0x1a, 0x00, }, 5, 0, "", "",
> +"f3 41 0f 1a 00       \tbndcl  (%r8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 04 25 78 56 34 12 \tbndcl  0x12345678,%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
> +"f3 0f 1a 18          \tbndcl  (%rax),%bnd3",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
> +"f3 0f 1a 04 01       \tbndcl  (%rcx,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 04 05 78 56 34 12 \tbndcl  0x12345678(,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
> +"f3 0f 1a 04 08       \tbndcl  (%rax,%rcx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
> +"f3 0f 1a 04 c8       \tbndcl  (%rax,%rcx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
> +"f3 0f 1a 40 12       \tbndcl  0x12(%rax),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
> +"f3 0f 1a 45 12       \tbndcl  0x12(%rbp),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"f3 0f 1a 44 01 12    \tbndcl  0x12(%rcx,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"f3 0f 1a 44 05 12    \tbndcl  0x12(%rbp,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"f3 0f 1a 44 08 12    \tbndcl  0x12(%rax,%rcx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"f3 0f 1a 44 c8 12    \tbndcl  0x12(%rax,%rcx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1a 80 78 56 34 12 \tbndcl  0x12345678(%rax),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f3 0f 1a 85 78 56 34 12 \tbndcl  0x12345678(%rbp),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 84 01 78 56 34 12 \tbndcl  0x12345678(%rcx,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 84 05 78 56 34 12 \tbndcl  0x12345678(%rbp,%rax,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 84 08 78 56 34 12 \tbndcl  0x12345678(%rax,%rcx,1),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f3 0f 1a 84 c8 78 56 34 12 \tbndcl  0x12345678(%rax,%rcx,8),%bnd0",},
> +{{0xf3, 0x0f, 0x1a, 0xc0, }, 4, 0, "", "",
> +"f3 0f 1a c0          \tbndcl  %rax,%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
> +"f2 0f 1a 00          \tbndcu  (%rax),%bnd0",},
> +{{0xf2, 0x41, 0x0f, 0x1a, 0x00, }, 5, 0, "", "",
> +"f2 41 0f 1a 00       \tbndcu  (%r8),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 04 25 78 56 34 12 \tbndcu  0x12345678,%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
> +"f2 0f 1a 18          \tbndcu  (%rax),%bnd3",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
> +"f2 0f 1a 04 01       \tbndcu  (%rcx,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 04 05 78 56 34 12 \tbndcu  0x12345678(,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
> +"f2 0f 1a 04 08       \tbndcu  (%rax,%rcx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
> +"f2 0f 1a 04 c8       \tbndcu  (%rax,%rcx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
> +"f2 0f 1a 40 12       \tbndcu  0x12(%rax),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
> +"f2 0f 1a 45 12       \tbndcu  0x12(%rbp),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"f2 0f 1a 44 01 12    \tbndcu  0x12(%rcx,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"f2 0f 1a 44 05 12    \tbndcu  0x12(%rbp,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"f2 0f 1a 44 08 12    \tbndcu  0x12(%rax,%rcx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"f2 0f 1a 44 c8 12    \tbndcu  0x12(%rax,%rcx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1a 80 78 56 34 12 \tbndcu  0x12345678(%rax),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1a 85 78 56 34 12 \tbndcu  0x12345678(%rbp),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 84 01 78 56 34 12 \tbndcu  0x12345678(%rcx,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 84 05 78 56 34 12 \tbndcu  0x12345678(%rbp,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 84 08 78 56 34 12 \tbndcu  0x12345678(%rax,%rcx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1a 84 c8 78 56 34 12 \tbndcu  0x12345678(%rax,%rcx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1a, 0xc0, }, 4, 0, "", "",
> +"f2 0f 1a c0          \tbndcu  %rax,%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
> +"f2 0f 1b 00          \tbndcn  (%rax),%bnd0",},
> +{{0xf2, 0x41, 0x0f, 0x1b, 0x00, }, 5, 0, "", "",
> +"f2 41 0f 1b 00       \tbndcn  (%r8),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 04 25 78 56 34 12 \tbndcn  0x12345678,%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
> +"f2 0f 1b 18          \tbndcn  (%rax),%bnd3",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
> +"f2 0f 1b 04 01       \tbndcn  (%rcx,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 04 05 78 56 34 12 \tbndcn  0x12345678(,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
> +"f2 0f 1b 04 08       \tbndcn  (%rax,%rcx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
> +"f2 0f 1b 04 c8       \tbndcn  (%rax,%rcx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
> +"f2 0f 1b 40 12       \tbndcn  0x12(%rax),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
> +"f2 0f 1b 45 12       \tbndcn  0x12(%rbp),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"f2 0f 1b 44 01 12    \tbndcn  0x12(%rcx,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"f2 0f 1b 44 05 12    \tbndcn  0x12(%rbp,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"f2 0f 1b 44 08 12    \tbndcn  0x12(%rax,%rcx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"f2 0f 1b 44 c8 12    \tbndcn  0x12(%rax,%rcx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1b 80 78 56 34 12 \tbndcn  0x12345678(%rax),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"f2 0f 1b 85 78 56 34 12 \tbndcn  0x12345678(%rbp),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 84 01 78 56 34 12 \tbndcn  0x12345678(%rcx,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 84 05 78 56 34 12 \tbndcn  0x12345678(%rbp,%rax,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 84 08 78 56 34 12 \tbndcn  0x12345678(%rax,%rcx,1),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"f2 0f 1b 84 c8 78 56 34 12 \tbndcn  0x12345678(%rax,%rcx,8),%bnd0",},
> +{{0xf2, 0x0f, 0x1b, 0xc0, }, 4, 0, "", "",
> +"f2 0f 1b c0          \tbndcn  %rax,%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
> +"66 0f 1a 00          \tbndmov (%rax),%bnd0",},
> +{{0x66, 0x41, 0x0f, 0x1a, 0x00, }, 5, 0, "", "",
> +"66 41 0f 1a 00       \tbndmov (%r8),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 04 25 78 56 34 12 \tbndmov 0x12345678,%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x18, }, 4, 0, "", "",
> +"66 0f 1a 18          \tbndmov (%rax),%bnd3",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0x01, }, 5, 0, "", "",
> +"66 0f 1a 04 01       \tbndmov (%rcx,%rax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 04 05 78 56 34 12 \tbndmov 0x12345678(,%rax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0x08, }, 5, 0, "", "",
> +"66 0f 1a 04 08       \tbndmov (%rax,%rcx,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x04, 0xc8, }, 5, 0, "", "",
> +"66 0f 1a 04 c8       \tbndmov (%rax,%rcx,8),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x40, 0x12, }, 5, 0, "", "",
> +"66 0f 1a 40 12       \tbndmov 0x12(%rax),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x45, 0x12, }, 5, 0, "", "",
> +"66 0f 1a 45 12       \tbndmov 0x12(%rbp),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"66 0f 1a 44 01 12    \tbndmov 0x12(%rcx,%rax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"66 0f 1a 44 05 12    \tbndmov 0x12(%rbp,%rax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"66 0f 1a 44 08 12    \tbndmov 0x12(%rax,%rcx,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"66 0f 1a 44 c8 12    \tbndmov 0x12(%rax,%rcx,8),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1a 80 78 56 34 12 \tbndmov 0x12345678(%rax),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1a 85 78 56 34 12 \tbndmov 0x12345678(%rbp),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 84 01 78 56 34 12 \tbndmov 0x12345678(%rcx,%rax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 84 05 78 56 34 12 \tbndmov 0x12345678(%rbp,%rax,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 84 08 78 56 34 12 \tbndmov 0x12345678(%rax,%rcx,1),%bnd0",},
> +{{0x66, 0x0f, 0x1a, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1a 84 c8 78 56 34 12 \tbndmov 0x12345678(%rax,%rcx,8),%bnd0",},
> +{{0x66, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
> +"66 0f 1b 00          \tbndmov %bnd0,(%rax)",},
> +{{0x66, 0x41, 0x0f, 0x1b, 0x00, }, 5, 0, "", "",
> +"66 41 0f 1b 00       \tbndmov %bnd0,(%r8)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 04 25 78 56 34 12 \tbndmov %bnd0,0x12345678",},
> +{{0x66, 0x0f, 0x1b, 0x18, }, 4, 0, "", "",
> +"66 0f 1b 18          \tbndmov %bnd3,(%rax)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0x01, }, 5, 0, "", "",
> +"66 0f 1b 04 01       \tbndmov %bnd0,(%rcx,%rax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 04 05 78 56 34 12 \tbndmov %bnd0,0x12345678(,%rax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0x08, }, 5, 0, "", "",
> +"66 0f 1b 04 08       \tbndmov %bnd0,(%rax,%rcx,1)",},
> +{{0x66, 0x0f, 0x1b, 0x04, 0xc8, }, 5, 0, "", "",
> +"66 0f 1b 04 c8       \tbndmov %bnd0,(%rax,%rcx,8)",},
> +{{0x66, 0x0f, 0x1b, 0x40, 0x12, }, 5, 0, "", "",
> +"66 0f 1b 40 12       \tbndmov %bnd0,0x12(%rax)",},
> +{{0x66, 0x0f, 0x1b, 0x45, 0x12, }, 5, 0, "", "",
> +"66 0f 1b 45 12       \tbndmov %bnd0,0x12(%rbp)",},
> +{{0x66, 0x0f, 0x1b, 0x44, 0x01, 0x12, }, 6, 0, "", "",
> +"66 0f 1b 44 01 12    \tbndmov %bnd0,0x12(%rcx,%rax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x44, 0x05, 0x12, }, 6, 0, "", "",
> +"66 0f 1b 44 05 12    \tbndmov %bnd0,0x12(%rbp,%rax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x44, 0x08, 0x12, }, 6, 0, "", "",
> +"66 0f 1b 44 08 12    \tbndmov %bnd0,0x12(%rax,%rcx,1)",},
> +{{0x66, 0x0f, 0x1b, 0x44, 0xc8, 0x12, }, 6, 0, "", "",
> +"66 0f 1b 44 c8 12    \tbndmov %bnd0,0x12(%rax,%rcx,8)",},
> +{{0x66, 0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1b 80 78 56 34 12 \tbndmov %bnd0,0x12345678(%rax)",},
> +{{0x66, 0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"66 0f 1b 85 78 56 34 12 \tbndmov %bnd0,0x12345678(%rbp)",},
> +{{0x66, 0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 84 01 78 56 34 12 \tbndmov %bnd0,0x12345678(%rcx,%rax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 84 05 78 56 34 12 \tbndmov %bnd0,0x12345678(%rbp,%rax,1)",},
> +{{0x66, 0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 84 08 78 56 34 12 \tbndmov %bnd0,0x12345678(%rax,%rcx,1)",},
> +{{0x66, 0x0f, 0x1b, 0x84, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "",
> +"66 0f 1b 84 c8 78 56 34 12 \tbndmov %bnd0,0x12345678(%rax,%rcx,8)",},
> +{{0x66, 0x0f, 0x1a, 0xc8, }, 4, 0, "", "",
> +"66 0f 1a c8          \tbndmov %bnd0,%bnd1",},
> +{{0x66, 0x0f, 0x1a, 0xc1, }, 4, 0, "", "",
> +"66 0f 1a c1          \tbndmov %bnd1,%bnd0",},
> +{{0x0f, 0x1a, 0x00, }, 3, 0, "", "",
> +"0f 1a 00             \tbndldx (%rax),%bnd0",},
> +{{0x41, 0x0f, 0x1a, 0x00, }, 4, 0, "", "",
> +"41 0f 1a 00          \tbndldx (%r8),%bnd0",},
> +{{0x0f, 0x1a, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 04 25 78 56 34 12 \tbndldx 0x12345678,%bnd0",},
> +{{0x0f, 0x1a, 0x18, }, 3, 0, "", "",
> +"0f 1a 18             \tbndldx (%rax),%bnd3",},
> +{{0x0f, 0x1a, 0x04, 0x01, }, 4, 0, "", "",
> +"0f 1a 04 01          \tbndldx (%rcx,%rax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 04 05 78 56 34 12 \tbndldx 0x12345678(,%rax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x04, 0x08, }, 4, 0, "", "",
> +"0f 1a 04 08          \tbndldx (%rax,%rcx,1),%bnd0",},
> +{{0x0f, 0x1a, 0x40, 0x12, }, 4, 0, "", "",
> +"0f 1a 40 12          \tbndldx 0x12(%rax),%bnd0",},
> +{{0x0f, 0x1a, 0x45, 0x12, }, 4, 0, "", "",
> +"0f 1a 45 12          \tbndldx 0x12(%rbp),%bnd0",},
> +{{0x0f, 0x1a, 0x44, 0x01, 0x12, }, 5, 0, "", "",
> +"0f 1a 44 01 12       \tbndldx 0x12(%rcx,%rax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x44, 0x05, 0x12, }, 5, 0, "", "",
> +"0f 1a 44 05 12       \tbndldx 0x12(%rbp,%rax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x44, 0x08, 0x12, }, 5, 0, "", "",
> +"0f 1a 44 08 12       \tbndldx 0x12(%rax,%rcx,1),%bnd0",},
> +{{0x0f, 0x1a, 0x80, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1a 80 78 56 34 12 \tbndldx 0x12345678(%rax),%bnd0",},
> +{{0x0f, 0x1a, 0x85, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1a 85 78 56 34 12 \tbndldx 0x12345678(%rbp),%bnd0",},
> +{{0x0f, 0x1a, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 84 01 78 56 34 12 \tbndldx 0x12345678(%rcx,%rax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 84 05 78 56 34 12 \tbndldx 0x12345678(%rbp,%rax,1),%bnd0",},
> +{{0x0f, 0x1a, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1a 84 08 78 56 34 12 \tbndldx 0x12345678(%rax,%rcx,1),%bnd0",},
> +{{0x0f, 0x1b, 0x00, }, 3, 0, "", "",
> +"0f 1b 00             \tbndstx %bnd0,(%rax)",},
> +{{0x41, 0x0f, 0x1b, 0x00, }, 4, 0, "", "",
> +"41 0f 1b 00          \tbndstx %bnd0,(%r8)",},
> +{{0x0f, 0x1b, 0x04, 0x25, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 04 25 78 56 34 12 \tbndstx %bnd0,0x12345678",},
> +{{0x0f, 0x1b, 0x18, }, 3, 0, "", "",
> +"0f 1b 18             \tbndstx %bnd3,(%rax)",},
> +{{0x0f, 0x1b, 0x04, 0x01, }, 4, 0, "", "",
> +"0f 1b 04 01          \tbndstx %bnd0,(%rcx,%rax,1)",},
> +{{0x0f, 0x1b, 0x04, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 04 05 78 56 34 12 \tbndstx %bnd0,0x12345678(,%rax,1)",},
> +{{0x0f, 0x1b, 0x04, 0x08, }, 4, 0, "", "",
> +"0f 1b 04 08          \tbndstx %bnd0,(%rax,%rcx,1)",},
> +{{0x0f, 0x1b, 0x40, 0x12, }, 4, 0, "", "",
> +"0f 1b 40 12          \tbndstx %bnd0,0x12(%rax)",},
> +{{0x0f, 0x1b, 0x45, 0x12, }, 4, 0, "", "",
> +"0f 1b 45 12          \tbndstx %bnd0,0x12(%rbp)",},
> +{{0x0f, 0x1b, 0x44, 0x01, 0x12, }, 5, 0, "", "",
> +"0f 1b 44 01 12       \tbndstx %bnd0,0x12(%rcx,%rax,1)",},
> +{{0x0f, 0x1b, 0x44, 0x05, 0x12, }, 5, 0, "", "",
> +"0f 1b 44 05 12       \tbndstx %bnd0,0x12(%rbp,%rax,1)",},
> +{{0x0f, 0x1b, 0x44, 0x08, 0x12, }, 5, 0, "", "",
> +"0f 1b 44 08 12       \tbndstx %bnd0,0x12(%rax,%rcx,1)",},
> +{{0x0f, 0x1b, 0x80, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1b 80 78 56 34 12 \tbndstx %bnd0,0x12345678(%rax)",},
> +{{0x0f, 0x1b, 0x85, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "",
> +"0f 1b 85 78 56 34 12 \tbndstx %bnd0,0x12345678(%rbp)",},
> +{{0x0f, 0x1b, 0x84, 0x01, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 84 01 78 56 34 12 \tbndstx %bnd0,0x12345678(%rcx,%rax,1)",},
> +{{0x0f, 0x1b, 0x84, 0x05, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 84 05 78 56 34 12 \tbndstx %bnd0,0x12345678(%rbp,%rax,1)",},
> +{{0x0f, 0x1b, 0x84, 0x08, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "",
> +"0f 1b 84 08 78 56 34 12 \tbndstx %bnd0,0x12345678(%rax,%rcx,1)",},
> +{{0xf2, 0xe8, 0x00, 0x00, 0x00, 0x00, }, 6, 0, "call", "unconditional",
> +"f2 e8 00 00 00 00    \tbnd callq 3f6 <main+0x3f6>",},
> +{{0x67, 0xf2, 0xff, 0x10, }, 4, 0, "call", "indirect",
> +"67 f2 ff 10          \tbnd callq *(%eax)",},
> +{{0xf2, 0xc3, }, 2, 0, "ret", "indirect",
> +"f2 c3                \tbnd retq ",},
> +{{0xf2, 0xe9, 0x00, 0x00, 0x00, 0x00, }, 6, 0, "jmp", "unconditional",
> +"f2 e9 00 00 00 00    \tbnd jmpq 402 <main+0x402>",},
> +{{0xf2, 0xe9, 0x00, 0x00, 0x00, 0x00, }, 6, 0, "jmp", "unconditional",
> +"f2 e9 00 00 00 00    \tbnd jmpq 408 <main+0x408>",},
> +{{0x67, 0xf2, 0xff, 0x21, }, 4, 0, "jmp", "indirect",
> +"67 f2 ff 21          \tbnd jmpq *(%ecx)",},
> +{{0xf2, 0x0f, 0x85, 0x00, 0x00, 0x00, 0x00, }, 7, 0, "jcc", "conditional",
> +"f2 0f 85 00 00 00 00 \tbnd jne 413 <main+0x413>",},
> diff --git a/tools/perf/tests/insn-x86-dat-src.c b/tools/perf/tests/insn-x86-dat-src.c
> new file mode 100644
> index 000000000000..b506830f33a8
> --- /dev/null
> +++ b/tools/perf/tests/insn-x86-dat-src.c
> @@ -0,0 +1,416 @@
> +/*
> + * This file contains instructions for testing by the test titled:
> + *
> + *         "Test x86 instruction decoder - new instructions"
> + *
> + * Note that the 'Expecting' comment lines are consumed by the
> + * gen-insn-x86-dat.awk script and have the format:
> + *
> + *         Expecting: <op> <branch> <rel>
> + *
> + * If this file is changed, remember to run the gen-insn-x86-dat.sh
> + * script and commit the result.
> + *
> + * Refer to insn-x86.c for more details.
> + */
> +
> +int main(void)
> +{
> +	/* Following line is a marker for the awk script - do not change */
> +	asm volatile("rdtsc"); /* Start here */
> +
> +#ifdef __x86_64__
> +
> +	/* bndmk m64, bnd */
> +
> +	asm volatile("bndmk (%rax), %bnd0");
> +	asm volatile("bndmk (%r8), %bnd0");
> +	asm volatile("bndmk (0x12345678), %bnd0");
> +	asm volatile("bndmk (%rax), %bnd3");
> +	asm volatile("bndmk (%rcx,%rax,1), %bnd0");
> +	asm volatile("bndmk 0x12345678(,%rax,1), %bnd0");
> +	asm volatile("bndmk (%rax,%rcx,1), %bnd0");
> +	asm volatile("bndmk (%rax,%rcx,8), %bnd0");
> +	asm volatile("bndmk 0x12(%rax), %bnd0");
> +	asm volatile("bndmk 0x12(%rbp), %bnd0");
> +	asm volatile("bndmk 0x12(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndmk 0x12(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndmk 0x12(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndmk 0x12(%rax,%rcx,8), %bnd0");
> +	asm volatile("bndmk 0x12345678(%rax), %bnd0");
> +	asm volatile("bndmk 0x12345678(%rbp), %bnd0");
> +	asm volatile("bndmk 0x12345678(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndmk 0x12345678(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndmk 0x12345678(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndmk 0x12345678(%rax,%rcx,8), %bnd0");
> +
> +	/* bndcl r/m64, bnd */
> +
> +	asm volatile("bndcl (%rax), %bnd0");
> +	asm volatile("bndcl (%r8), %bnd0");
> +	asm volatile("bndcl (0x12345678), %bnd0");
> +	asm volatile("bndcl (%rax), %bnd3");
> +	asm volatile("bndcl (%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcl 0x12345678(,%rax,1), %bnd0");
> +	asm volatile("bndcl (%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcl (%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcl 0x12(%rax), %bnd0");
> +	asm volatile("bndcl 0x12(%rbp), %bnd0");
> +	asm volatile("bndcl 0x12(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcl 0x12(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndcl 0x12(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcl 0x12(%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcl 0x12345678(%rax), %bnd0");
> +	asm volatile("bndcl 0x12345678(%rbp), %bnd0");
> +	asm volatile("bndcl 0x12345678(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcl 0x12345678(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndcl 0x12345678(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcl 0x12345678(%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcl %rax, %bnd0");
> +
> +	/* bndcu r/m64, bnd */
> +
> +	asm volatile("bndcu (%rax), %bnd0");
> +	asm volatile("bndcu (%r8), %bnd0");
> +	asm volatile("bndcu (0x12345678), %bnd0");
> +	asm volatile("bndcu (%rax), %bnd3");
> +	asm volatile("bndcu (%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcu 0x12345678(,%rax,1), %bnd0");
> +	asm volatile("bndcu (%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcu (%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcu 0x12(%rax), %bnd0");
> +	asm volatile("bndcu 0x12(%rbp), %bnd0");
> +	asm volatile("bndcu 0x12(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcu 0x12(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndcu 0x12(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcu 0x12(%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcu 0x12345678(%rax), %bnd0");
> +	asm volatile("bndcu 0x12345678(%rbp), %bnd0");
> +	asm volatile("bndcu 0x12345678(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcu 0x12345678(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndcu 0x12345678(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcu 0x12345678(%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcu %rax, %bnd0");
> +
> +	/* bndcn r/m64, bnd */
> +
> +	asm volatile("bndcn (%rax), %bnd0");
> +	asm volatile("bndcn (%r8), %bnd0");
> +	asm volatile("bndcn (0x12345678), %bnd0");
> +	asm volatile("bndcn (%rax), %bnd3");
> +	asm volatile("bndcn (%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcn 0x12345678(,%rax,1), %bnd0");
> +	asm volatile("bndcn (%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcn (%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcn 0x12(%rax), %bnd0");
> +	asm volatile("bndcn 0x12(%rbp), %bnd0");
> +	asm volatile("bndcn 0x12(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcn 0x12(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndcn 0x12(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcn 0x12(%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcn 0x12345678(%rax), %bnd0");
> +	asm volatile("bndcn 0x12345678(%rbp), %bnd0");
> +	asm volatile("bndcn 0x12345678(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndcn 0x12345678(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndcn 0x12345678(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndcn 0x12345678(%rax,%rcx,8), %bnd0");
> +	asm volatile("bndcn %rax, %bnd0");
> +
> +	/* bndmov m128, bnd */
> +
> +	asm volatile("bndmov (%rax), %bnd0");
> +	asm volatile("bndmov (%r8), %bnd0");
> +	asm volatile("bndmov (0x12345678), %bnd0");
> +	asm volatile("bndmov (%rax), %bnd3");
> +	asm volatile("bndmov (%rcx,%rax,1), %bnd0");
> +	asm volatile("bndmov 0x12345678(,%rax,1), %bnd0");
> +	asm volatile("bndmov (%rax,%rcx,1), %bnd0");
> +	asm volatile("bndmov (%rax,%rcx,8), %bnd0");
> +	asm volatile("bndmov 0x12(%rax), %bnd0");
> +	asm volatile("bndmov 0x12(%rbp), %bnd0");
> +	asm volatile("bndmov 0x12(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndmov 0x12(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndmov 0x12(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndmov 0x12(%rax,%rcx,8), %bnd0");
> +	asm volatile("bndmov 0x12345678(%rax), %bnd0");
> +	asm volatile("bndmov 0x12345678(%rbp), %bnd0");
> +	asm volatile("bndmov 0x12345678(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndmov 0x12345678(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndmov 0x12345678(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndmov 0x12345678(%rax,%rcx,8), %bnd0");
> +
> +	/* bndmov bnd, m128 */
> +
> +	asm volatile("bndmov %bnd0, (%rax)");
> +	asm volatile("bndmov %bnd0, (%r8)");
> +	asm volatile("bndmov %bnd0, (0x12345678)");
> +	asm volatile("bndmov %bnd3, (%rax)");
> +	asm volatile("bndmov %bnd0, (%rcx,%rax,1)");
> +	asm volatile("bndmov %bnd0, 0x12345678(,%rax,1)");
> +	asm volatile("bndmov %bnd0, (%rax,%rcx,1)");
> +	asm volatile("bndmov %bnd0, (%rax,%rcx,8)");
> +	asm volatile("bndmov %bnd0, 0x12(%rax)");
> +	asm volatile("bndmov %bnd0, 0x12(%rbp)");
> +	asm volatile("bndmov %bnd0, 0x12(%rcx,%rax,1)");
> +	asm volatile("bndmov %bnd0, 0x12(%rbp,%rax,1)");
> +	asm volatile("bndmov %bnd0, 0x12(%rax,%rcx,1)");
> +	asm volatile("bndmov %bnd0, 0x12(%rax,%rcx,8)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%rax)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%rbp)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%rcx,%rax,1)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%rbp,%rax,1)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%rax,%rcx,1)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%rax,%rcx,8)");
> +
> +	/* bndmov bnd2, bnd1 */
> +
> +	asm volatile("bndmov %bnd0, %bnd1");
> +	asm volatile("bndmov %bnd1, %bnd0");
> +
> +	/* bndldx mib, bnd */
> +
> +	asm volatile("bndldx (%rax), %bnd0");
> +	asm volatile("bndldx (%r8), %bnd0");
> +	asm volatile("bndldx (0x12345678), %bnd0");
> +	asm volatile("bndldx (%rax), %bnd3");
> +	asm volatile("bndldx (%rcx,%rax,1), %bnd0");
> +	asm volatile("bndldx 0x12345678(,%rax,1), %bnd0");
> +	asm volatile("bndldx (%rax,%rcx,1), %bnd0");
> +	asm volatile("bndldx 0x12(%rax), %bnd0");
> +	asm volatile("bndldx 0x12(%rbp), %bnd0");
> +	asm volatile("bndldx 0x12(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndldx 0x12(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndldx 0x12(%rax,%rcx,1), %bnd0");
> +	asm volatile("bndldx 0x12345678(%rax), %bnd0");
> +	asm volatile("bndldx 0x12345678(%rbp), %bnd0");
> +	asm volatile("bndldx 0x12345678(%rcx,%rax,1), %bnd0");
> +	asm volatile("bndldx 0x12345678(%rbp,%rax,1), %bnd0");
> +	asm volatile("bndldx 0x12345678(%rax,%rcx,1), %bnd0");
> +
> +	/* bndstx bnd, mib */
> +
> +	asm volatile("bndstx %bnd0, (%rax)");
> +	asm volatile("bndstx %bnd0, (%r8)");
> +	asm volatile("bndstx %bnd0, (0x12345678)");
> +	asm volatile("bndstx %bnd3, (%rax)");
> +	asm volatile("bndstx %bnd0, (%rcx,%rax,1)");
> +	asm volatile("bndstx %bnd0, 0x12345678(,%rax,1)");
> +	asm volatile("bndstx %bnd0, (%rax,%rcx,1)");
> +	asm volatile("bndstx %bnd0, 0x12(%rax)");
> +	asm volatile("bndstx %bnd0, 0x12(%rbp)");
> +	asm volatile("bndstx %bnd0, 0x12(%rcx,%rax,1)");
> +	asm volatile("bndstx %bnd0, 0x12(%rbp,%rax,1)");
> +	asm volatile("bndstx %bnd0, 0x12(%rax,%rcx,1)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%rax)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%rbp)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%rcx,%rax,1)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%rbp,%rax,1)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%rax,%rcx,1)");
> +
> +	/* bnd prefix on call, ret, jmp and all jcc */
> +
> +	asm volatile("bnd call label1");  /* Expecting: call unconditional 0 */
> +	asm volatile("bnd call *(%eax)"); /* Expecting: call indirect      0 */
> +	asm volatile("bnd ret");          /* Expecting: ret  indirect      0 */
> +	asm volatile("bnd jmp label1");   /* Expecting: jmp  unconditional 0 */
> +	asm volatile("bnd jmp label1");   /* Expecting: jmp  unconditional 0 */
> +	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
> +	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0 */
> +
> +#else  /* #ifdef __x86_64__ */
> +
> +	/* bndmk m32, bnd */
> +
> +	asm volatile("bndmk (%eax), %bnd0");
> +	asm volatile("bndmk (0x12345678), %bnd0");
> +	asm volatile("bndmk (%eax), %bnd3");
> +	asm volatile("bndmk (%ecx,%eax,1), %bnd0");
> +	asm volatile("bndmk 0x12345678(,%eax,1), %bnd0");
> +	asm volatile("bndmk (%eax,%ecx,1), %bnd0");
> +	asm volatile("bndmk (%eax,%ecx,8), %bnd0");
> +	asm volatile("bndmk 0x12(%eax), %bnd0");
> +	asm volatile("bndmk 0x12(%ebp), %bnd0");
> +	asm volatile("bndmk 0x12(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndmk 0x12(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndmk 0x12(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndmk 0x12(%eax,%ecx,8), %bnd0");
> +	asm volatile("bndmk 0x12345678(%eax), %bnd0");
> +	asm volatile("bndmk 0x12345678(%ebp), %bnd0");
> +	asm volatile("bndmk 0x12345678(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndmk 0x12345678(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndmk 0x12345678(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndmk 0x12345678(%eax,%ecx,8), %bnd0");
> +
> +	/* bndcl r/m32, bnd */
> +
> +	asm volatile("bndcl (%eax), %bnd0");
> +	asm volatile("bndcl (0x12345678), %bnd0");
> +	asm volatile("bndcl (%eax), %bnd3");
> +	asm volatile("bndcl (%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcl 0x12345678(,%eax,1), %bnd0");
> +	asm volatile("bndcl (%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcl (%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcl 0x12(%eax), %bnd0");
> +	asm volatile("bndcl 0x12(%ebp), %bnd0");
> +	asm volatile("bndcl 0x12(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcl 0x12(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndcl 0x12(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcl 0x12(%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcl 0x12345678(%eax), %bnd0");
> +	asm volatile("bndcl 0x12345678(%ebp), %bnd0");
> +	asm volatile("bndcl 0x12345678(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcl 0x12345678(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndcl 0x12345678(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcl 0x12345678(%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcl %eax, %bnd0");
> +
> +	/* bndcu r/m32, bnd */
> +
> +	asm volatile("bndcu (%eax), %bnd0");
> +	asm volatile("bndcu (0x12345678), %bnd0");
> +	asm volatile("bndcu (%eax), %bnd3");
> +	asm volatile("bndcu (%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcu 0x12345678(,%eax,1), %bnd0");
> +	asm volatile("bndcu (%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcu (%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcu 0x12(%eax), %bnd0");
> +	asm volatile("bndcu 0x12(%ebp), %bnd0");
> +	asm volatile("bndcu 0x12(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcu 0x12(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndcu 0x12(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcu 0x12(%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcu 0x12345678(%eax), %bnd0");
> +	asm volatile("bndcu 0x12345678(%ebp), %bnd0");
> +	asm volatile("bndcu 0x12345678(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcu 0x12345678(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndcu 0x12345678(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcu 0x12345678(%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcu %eax, %bnd0");
> +
> +	/* bndcn r/m32, bnd */
> +
> +	asm volatile("bndcn (%eax), %bnd0");
> +	asm volatile("bndcn (0x12345678), %bnd0");
> +	asm volatile("bndcn (%eax), %bnd3");
> +	asm volatile("bndcn (%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcn 0x12345678(,%eax,1), %bnd0");
> +	asm volatile("bndcn (%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcn (%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcn 0x12(%eax), %bnd0");
> +	asm volatile("bndcn 0x12(%ebp), %bnd0");
> +	asm volatile("bndcn 0x12(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcn 0x12(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndcn 0x12(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcn 0x12(%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcn 0x12345678(%eax), %bnd0");
> +	asm volatile("bndcn 0x12345678(%ebp), %bnd0");
> +	asm volatile("bndcn 0x12345678(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndcn 0x12345678(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndcn 0x12345678(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndcn 0x12345678(%eax,%ecx,8), %bnd0");
> +	asm volatile("bndcn %eax, %bnd0");
> +
> +	/* bndmov m64, bnd */
> +
> +	asm volatile("bndmov (%eax), %bnd0");
> +	asm volatile("bndmov (0x12345678), %bnd0");
> +	asm volatile("bndmov (%eax), %bnd3");
> +	asm volatile("bndmov (%ecx,%eax,1), %bnd0");
> +	asm volatile("bndmov 0x12345678(,%eax,1), %bnd0");
> +	asm volatile("bndmov (%eax,%ecx,1), %bnd0");
> +	asm volatile("bndmov (%eax,%ecx,8), %bnd0");
> +	asm volatile("bndmov 0x12(%eax), %bnd0");
> +	asm volatile("bndmov 0x12(%ebp), %bnd0");
> +	asm volatile("bndmov 0x12(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndmov 0x12(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndmov 0x12(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndmov 0x12(%eax,%ecx,8), %bnd0");
> +	asm volatile("bndmov 0x12345678(%eax), %bnd0");
> +	asm volatile("bndmov 0x12345678(%ebp), %bnd0");
> +	asm volatile("bndmov 0x12345678(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndmov 0x12345678(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndmov 0x12345678(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndmov 0x12345678(%eax,%ecx,8), %bnd0");
> +
> +	/* bndmov bnd, m64 */
> +
> +	asm volatile("bndmov %bnd0, (%eax)");
> +	asm volatile("bndmov %bnd0, (0x12345678)");
> +	asm volatile("bndmov %bnd3, (%eax)");
> +	asm volatile("bndmov %bnd0, (%ecx,%eax,1)");
> +	asm volatile("bndmov %bnd0, 0x12345678(,%eax,1)");
> +	asm volatile("bndmov %bnd0, (%eax,%ecx,1)");
> +	asm volatile("bndmov %bnd0, (%eax,%ecx,8)");
> +	asm volatile("bndmov %bnd0, 0x12(%eax)");
> +	asm volatile("bndmov %bnd0, 0x12(%ebp)");
> +	asm volatile("bndmov %bnd0, 0x12(%ecx,%eax,1)");
> +	asm volatile("bndmov %bnd0, 0x12(%ebp,%eax,1)");
> +	asm volatile("bndmov %bnd0, 0x12(%eax,%ecx,1)");
> +	asm volatile("bndmov %bnd0, 0x12(%eax,%ecx,8)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%eax)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%ebp)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%ecx,%eax,1)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%ebp,%eax,1)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%eax,%ecx,1)");
> +	asm volatile("bndmov %bnd0, 0x12345678(%eax,%ecx,8)");
> +
> +	/* bndmov bnd2, bnd1 */
> +
> +	asm volatile("bndmov %bnd0, %bnd1");
> +	asm volatile("bndmov %bnd1, %bnd0");
> +
> +	/* bndldx mib, bnd */
> +
> +	asm volatile("bndldx (%eax), %bnd0");
> +	asm volatile("bndldx (0x12345678), %bnd0");
> +	asm volatile("bndldx (%eax), %bnd3");
> +	asm volatile("bndldx (%ecx,%eax,1), %bnd0");
> +	asm volatile("bndldx 0x12345678(,%eax,1), %bnd0");
> +	asm volatile("bndldx (%eax,%ecx,1), %bnd0");
> +	asm volatile("bndldx 0x12(%eax), %bnd0");
> +	asm volatile("bndldx 0x12(%ebp), %bnd0");
> +	asm volatile("bndldx 0x12(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndldx 0x12(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndldx 0x12(%eax,%ecx,1), %bnd0");
> +	asm volatile("bndldx 0x12345678(%eax), %bnd0");
> +	asm volatile("bndldx 0x12345678(%ebp), %bnd0");
> +	asm volatile("bndldx 0x12345678(%ecx,%eax,1), %bnd0");
> +	asm volatile("bndldx 0x12345678(%ebp,%eax,1), %bnd0");
> +	asm volatile("bndldx 0x12345678(%eax,%ecx,1), %bnd0");
> +
> +	/* bndstx bnd, mib */
> +
> +	asm volatile("bndstx %bnd0, (%eax)");
> +	asm volatile("bndstx %bnd0, (0x12345678)");
> +	asm volatile("bndstx %bnd3, (%eax)");
> +	asm volatile("bndstx %bnd0, (%ecx,%eax,1)");
> +	asm volatile("bndstx %bnd0, 0x12345678(,%eax,1)");
> +	asm volatile("bndstx %bnd0, (%eax,%ecx,1)");
> +	asm volatile("bndstx %bnd0, 0x12(%eax)");
> +	asm volatile("bndstx %bnd0, 0x12(%ebp)");
> +	asm volatile("bndstx %bnd0, 0x12(%ecx,%eax,1)");
> +	asm volatile("bndstx %bnd0, 0x12(%ebp,%eax,1)");
> +	asm volatile("bndstx %bnd0, 0x12(%eax,%ecx,1)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%eax)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%ebp)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%ecx,%eax,1)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%ebp,%eax,1)");
> +	asm volatile("bndstx %bnd0, 0x12345678(%eax,%ecx,1)");
> +
> +	/* bnd prefix on call, ret, jmp and all jcc */
> +
> +	asm volatile("bnd call label1");  /* Expecting: call unconditional 0xfffffffc */
> +	asm volatile("bnd call *(%eax)"); /* Expecting: call indirect      0 */
> +	asm volatile("bnd ret");          /* Expecting: ret  indirect      0 */
> +	asm volatile("bnd jmp label1");   /* Expecting: jmp  unconditional 0xfffffffc */
> +	asm volatile("bnd jmp label1");   /* Expecting: jmp  unconditional 0xfffffffc */
> +	asm volatile("bnd jmp *(%ecx)");  /* Expecting: jmp  indirect      0 */
> +	asm volatile("bnd jne label1");   /* Expecting: jcc  conditional   0xfffffffc */
> +
> +#endif /* #ifndef __x86_64__ */
> +
> +	/* Following line is a marker for the awk script - do not change */
> +	asm volatile("rdtsc"); /* Stop here */
> +
> +	return 0;
> +}
> diff --git a/tools/perf/tests/insn-x86.c b/tools/perf/tests/insn-x86.c
> new file mode 100644
> index 000000000000..0e126a099874
> --- /dev/null
> +++ b/tools/perf/tests/insn-x86.c
> @@ -0,0 +1,180 @@
> +#include <linux/types.h>
> +
> +#include "debug.h"
> +#include "tests.h"
> +
> +#include "intel-pt-decoder/insn.h"
> +#include "intel-pt-decoder/intel-pt-insn-decoder.h"
> +
> +struct test_data {
> +	u8 data[MAX_INSN_SIZE];
> +	int expected_length;
> +	int expected_rel;
> +	const char *expected_op_str;
> +	const char *expected_branch_str;
> +	const char *asm_rep;
> +};
> +
> +struct test_data test_data_32[] = {
> +#include "insn-x86-dat-32.c"
> +	{{0}, 0, 0, NULL, NULL, NULL},
> +};
> +
> +struct test_data test_data_64[] = {
> +#include "insn-x86-dat-64.c"
> +	{{0}, 0, 0, NULL, NULL, NULL},
> +};
> +
> +static int get_op(const char *op_str)
> +{
> +	struct val_data {
> +		const char *name;
> +		int val;
> +	} vals[] = {
> +		{"other",   INTEL_PT_OP_OTHER},
> +		{"call",    INTEL_PT_OP_CALL},
> +		{"ret",     INTEL_PT_OP_RET},
> +		{"jcc",     INTEL_PT_OP_JCC},
> +		{"jmp",     INTEL_PT_OP_JMP},
> +		{"loop",    INTEL_PT_OP_LOOP},
> +		{"iret",    INTEL_PT_OP_IRET},
> +		{"int",     INTEL_PT_OP_INT},
> +		{"syscall", INTEL_PT_OP_SYSCALL},
> +		{"sysret",  INTEL_PT_OP_SYSRET},
> +		{NULL, 0},
> +	};
> +	struct val_data *val;
> +
> +	if (!op_str || !strlen(op_str))
> +		return 0;
> +
> +	for (val = vals; val->name; val++) {
> +		if (!strcmp(val->name, op_str))
> +			return val->val;
> +	}
> +
> +	pr_debug("Failed to get op\n");
> +
> +	return -1;
> +}
> +
> +static int get_branch(const char *branch_str)
> +{
> +	struct val_data {
> +		const char *name;
> +		int val;
> +	} vals[] = {
> +		{"no_branch",     INTEL_PT_BR_NO_BRANCH},
> +		{"indirect",      INTEL_PT_BR_INDIRECT},
> +		{"conditional",   INTEL_PT_BR_CONDITIONAL},
> +		{"unconditional", INTEL_PT_BR_UNCONDITIONAL},
> +		{NULL, 0},
> +	};
> +	struct val_data *val;
> +
> +	if (!branch_str || !strlen(branch_str))
> +		return 0;
> +
> +	for (val = vals; val->name; val++) {
> +		if (!strcmp(val->name, branch_str))
> +			return val->val;
> +	}
> +
> +	pr_debug("Failed to get branch\n");
> +
> +	return -1;
> +}
> +
> +static int test_data_item(struct test_data *dat, int x86_64)
> +{
> +	struct intel_pt_insn intel_pt_insn;
> +	struct insn insn;
> +	int op, branch;
> +
> +	insn_init(&insn, dat->data, MAX_INSN_SIZE, x86_64);
> +	insn_get_length(&insn);
> +
> +	if (!insn_complete(&insn)) {
> +		pr_debug("Failed to decode: %s\n", dat->asm_rep);
> +		return -1;
> +	}
> +
> +	if (insn.length != dat->expected_length) {
> +		pr_debug("Failed to decode length (%d vs expected %d): %s\n",
> +			 insn.length, dat->expected_length, dat->asm_rep);
> +		return -1;
> +	}
> +
> +	op = get_op(dat->expected_op_str);
> +	branch = get_branch(dat->expected_branch_str);
> +
> +	if (intel_pt_get_insn(dat->data, MAX_INSN_SIZE, x86_64, &intel_pt_insn)) {
> +		pr_debug("Intel PT failed to decode: %s\n", dat->asm_rep);
> +		return -1;
> +	}
> +
> +	if ((int)intel_pt_insn.op != op) {
> +		pr_debug("Failed to decode 'op' value (%d vs expected %d): %s\n",
> +			 intel_pt_insn.op, op, dat->asm_rep);
> +		return -1;
> +	}
> +
> +	if ((int)intel_pt_insn.branch != branch) {
> +		pr_debug("Failed to decode 'branch' value (%d vs expected %d): %s\n",
> +			 intel_pt_insn.branch, branch, dat->asm_rep);
> +		return -1;
> +	}
> +
> +	if (intel_pt_insn.rel != dat->expected_rel) {
> +		pr_debug("Failed to decode 'rel' value (%#x vs expected %#x): %s\n",
> +			 intel_pt_insn.rel, dat->expected_rel, dat->asm_rep);
> +		return -1;
> +	}
> +
> +	pr_debug("Decoded ok: %s\n", dat->asm_rep);
> +
> +	return 0;
> +}
> +
> +static int test_data_set(struct test_data *dat_set, int x86_64)
> +{
> +	struct test_data *dat;
> +	int ret = 0;
> +
> +	for (dat = dat_set; dat->expected_length; dat++) {
> +		if (test_data_item(dat, x86_64))
> +			ret = -1;
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * test__insn_x86 - test x86 instruction decoder - new instructions.
> + *
> + * This function implements a test that decodes a selection of instructions and
> + * checks the results.  The Intel PT function that further categorizes
> + * instructions (i.e. intel_pt_get_insn()) is also checked.
> + *
> + * The instructions are originally in insn-x86-dat-src.c which has been
> + * processed by scripts gen-insn-x86-dat.sh and gen-insn-x86-dat.awk to produce
> + * insn-x86-dat-32.c and insn-x86-dat-64.c which are included into this program.
> + * i.e. to add new instructions to the test, edit insn-x86-dat-src.c, run the
> + * gen-insn-x86-dat.sh script, make perf, and then run the test.
> + *
> + * If the test passes %0 is returned, otherwise %-1 is returned.  Use the
> + * verbose (-v) option to see all the instructions and whether or not they
> + * decoded successfully.
> + */
> +int test__insn_x86(void)
> +{
> +	int ret = 0;
> +
> +	if (test_data_set(test_data_32, 0))
> +		ret = -1;
> +
> +	if (test_data_set(test_data_64, 1))
> +		ret = -1;
> +
> +	return ret;
> +}
> diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
> index bf113a247987..4e2c5458269a 100644
> --- a/tools/perf/tests/tests.h
> +++ b/tools/perf/tests/tests.h
> @@ -63,6 +63,7 @@ int test__fdarray__add(void);
>  int test__kmod_path__parse(void);
>  int test__thread_map(void);
>  int test__llvm(void);
> +int test__insn_x86(void);
> 
>  #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
>  #ifdef HAVE_DWARF_UNWIND_SUPPORT
> --
> 1.9.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/4] perf tools: Add a test for decoding of new x86 instructions
  2015-09-01  0:18   ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-01  8:17     ` Adrian Hunter
  2015-09-01 11:03       ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 1 reply; 27+ messages in thread
From: Adrian Hunter @ 2015-09-01  8:17 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI,
	Arnaldo Carvalho de Melo
  Cc: linux-kernel@vger.kernel.org, Jiri Olsa, Andy Lutomirski,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

On 01/09/15 03:18, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: Adrian Hunter [mailto:adrian.hunter@intel.com]
>>
>> Add a new test titled:
>>
>> 	Test x86 instruction decoder - new instructions
>>
>> The purpose of this test is to check the instruction decoder
>> after new instructions have been added.  Initially, MPX
>> instructions are tested which are already supported, but the
>> definitions in x86-opcode-map.txt will be tweaked in a
>> subsequent patch, after which this test can be run to verify
>> those changes.
> 
> Hmm, btw, why should this test be in perf? It seems that we need
> this test in kselftest or build-time selftest.
> I prefer to put this in arch/x86/tools/ or lib/. What would you
> think ?

There are 2 reasons perf tools needs a test:
	1. perf tools is source code independent from the kernel i.e. it has its
own copy of the instruction decoder.
	2. perf tools test also tests the Intel PT decoder's categorization of
instructions.
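
To make the second point concrete, here is a condensed sketch of the dual
check the test does for each instruction. It is modelled on test_data_item()
in patch 1; the helper name and the trimmed-down error handling here are
illustrative only, not the actual test code:

#include "intel-pt-decoder/insn.h"
#include "intel-pt-decoder/intel-pt-insn-decoder.h"

/* Decode one byte buffer with both decoders and compare the results */
static int check_one_insn(const unsigned char *buf, int x86_64,
			  int expected_length, int expected_op)
{
	struct intel_pt_insn intel_pt_insn;
	struct insn insn;

	/* Length decode via perf's copy of the kernel x86 instruction decoder */
	insn_init(&insn, buf, MAX_INSN_SIZE, x86_64);
	insn_get_length(&insn);
	if (!insn_complete(&insn) || insn.length != expected_length)
		return -1;

	/* Categorization (call/ret/jmp/jcc/...) via the Intel PT decoder */
	if (intel_pt_get_insn(buf, MAX_INSN_SIZE, x86_64, &intel_pt_insn))
		return -1;

	return (int)intel_pt_insn.op == expected_op ? 0 : -1;
}

Both results have to match the objdump output captured in the generated
data files, so a regression in either copy shows up as a test failure
rather than a silent mis-decode.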



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-08-31 13:58 [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Adrian Hunter
                   ` (4 preceding siblings ...)
  2015-08-31 14:43 ` [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Arnaldo Carvalho de Melo
@ 2015-09-01  8:54 ` Ingo Molnar
  2015-09-01 11:38   ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-01 12:16   ` Adrian Hunter
  5 siblings, 2 replies; 27+ messages in thread
From: Ingo Molnar @ 2015-09-01  8:54 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Jiri Olsa,
	Andy Lutomirski, Masami Hiramatsu, Denys Vlasenko, Peter Zijlstra,
	Dave Hansen, Qiaowei Ren, H. Peter Anvin, Thomas Gleixner


* Adrian Hunter <adrian.hunter@intel.com> wrote:

> Hi
> 
> perf tools has a copy of the x86 instruction decoder for decoding
> Intel PT. [...]

So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses 
for kprobes et al - and the two versions already forked slightly:

-#include "inat.h"
-#include "insn.h"
+#include <asm/inat.h>
+#include <asm/insn.h>

it would be nice to add a diff check to the perf build, and (non-fatally) warn 
during the build if the two versions depart from each other?

This will make sure the two versions are fully in sync in the long run as well.

( Alternatively we could perhaps also librarize it into tools/lib/, and teach the 
  kernel build to pick that one up? )

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 1/4] perf tools: Add a test for decoding of new x86 instructions
  2015-09-01  8:17     ` Adrian Hunter
@ 2015-09-01 11:03       ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 0 replies; 27+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-01 11:03 UTC (permalink / raw)
  To: 'Adrian Hunter', Arnaldo Carvalho de Melo
  Cc: linux-kernel@vger.kernel.org, Jiri Olsa, Andy Lutomirski,
	Denys Vlasenko, Peter Zijlstra, Ingo Molnar, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner


> From: Adrian Hunter [mailto:adrian.hunter@intel.com]
> On 01/09/15 03:18, 平松雅巳 / HIRAMATU,MASAMI wrote:
> >> From: Adrian Hunter [mailto:adrian.hunter@intel.com]
> >>
> >> Add a new test titled:
> >>
> >> 	Test x86 instruction decoder - new instructions
> >>
> >> The purpose of this test is to check the instruction decoder
> >> after new instructions have been added.  Initially, MPX
> >> instructions are tested which are already supported, but the
> >> definitions in x86-opcode-map.txt will be tweaked in a
> >> subsequent patch, after which this test can be run to verify
> >> those changes.
> >
> > Hmm, btw, why should this test be in perf? It seems that we need
> > this test in kselftest or build-time selftest.
> > I prefer to put this in arch/x86/tools/ or lib/. What would you
> > think ?
> 
> There are 2 reasons perf tools needs a test:
> 	1. perf tools is source code independent from the kernel i.e. it has its
> own copy of the instruction decoder.
> 	2. perf tools test also tests the Intel PT decoder's categorization of
> instructions.

OK, then, can I port these insn tests into kbuild? I'd like to use this,
but to catch bugs at an early stage, I think the same test should be
done in the kernel build process (as a kbuild option).

Thank you,


^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01  8:54 ` Ingo Molnar
@ 2015-09-01 11:38   ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-01 12:10     ` Adrian Hunter
  2015-09-01 12:16   ` Adrian Hunter
  1 sibling, 1 reply; 27+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-01 11:38 UTC (permalink / raw)
  To: 'Ingo Molnar', Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, linux-kernel@vger.kernel.org, Jiri Olsa,
	Andy Lutomirski, Denys Vlasenko, Peter Zijlstra, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

> From: Ingo Molnar [mailto:mingo.kernel.org@gmail.com] On Behalf Of Ingo Molnar
> 
> 
> * Adrian Hunter <adrian.hunter@intel.com> wrote:
> 
> > Hi
> >
> > perf tools has a copy of the x86 instruction decoder for decoding
> > Intel PT. [...]
> 
> So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses
> for kprobes et al - and the two versions already forked slightly:
> 
> -#include "inat.h"
> -#include "insn.h"
> +#include <asm/inat.h>
> +#include <asm/insn.h>
> 
> it would be nice to add a diff check to the perf build, and (non-fatally) warn
> during the build if the two versions depart from each other?
> 
> This will make sure the two versions are fully in sync in the long run as well.
> 
> ( Alternatively we could perhaps also librarize it into tools/lib/, and teach the
>   kernel build to pick that one up? )

Agreed, what concerns me is that someone finds a bug and fixes one of them and
the other is not fixed.

I'll see the forked version and check if it can be merged into the kernel.

Thank you,

> 
> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01 11:38   ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-01 12:10     ` Adrian Hunter
  2015-09-01 12:55       ` Ingo Molnar
  2015-09-01 15:13       ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 2 replies; 27+ messages in thread
From: Adrian Hunter @ 2015-09-01 12:10 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI
  Cc: 'Ingo Molnar', Arnaldo Carvalho de Melo,
	linux-kernel@vger.kernel.org, Jiri Olsa, Andy Lutomirski,
	Denys Vlasenko, Peter Zijlstra, Dave Hansen, Qiaowei Ren,
	H. Peter Anvin, Thomas Gleixner

On 01/09/15 14:38, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: Ingo Molnar [mailto:mingo.kernel.org@gmail.com] On Behalf Of Ingo Molnar
>>
>>
>> * Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>>> Hi
>>>
>>> perf tools has a copy of the x86 instruction decoder for decoding
>>> Intel PT. [...]
>>
>> So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses
>> for kprobes et al - and the two versions already forked slightly:
>>
>> -#include "inat.h"
>> -#include "insn.h"
>> +#include <asm/inat.h>
>> +#include <asm/insn.h>
>>
>> it would be nice to add a diff check to the perf build, and (non-fatally) warn
>> during the build if the two versions depart from each other?
>>
>> This will make sure the two versions are fully in sync in the long run as well.
>>
>> ( Alternatively we could perhaps also librarize it into tools/lib/, and teach the
>>   kernel build to pick that one up? )
> 
> Agreed, what concerns me is that someone finds a bug and fixes one of them and
> the other is not fixed.
> 
> I'll see the forked version and check if it can be merged into the kernel.

Ever since Linus complained about perf tools including kernel headers, I
have assumed we should have separate source code.  That email thread was not
cc'ed to a mailing list but here is a quote:

Em Sat, Jul 04, 2015 at 08:53:46AM -0700, Linus Torvalds escreveu:
> So this is more fundamental, and looks like it's just due to perf
> abusing the kernel headers, and now that rbtree has rcu support
> ("rbtree: Make lockless searches non-fatal"), it gets tons of headers
> included that really don't work from user space.
>
> There might be other things going on, but the rbtree one seems to be a
> big one. I think perf needs to get its own rbtree header or something,
> instead of doing that insane "let's include random core kernel
> headers" thing.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01  8:54 ` Ingo Molnar
  2015-09-01 11:38   ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-01 12:16   ` Adrian Hunter
  2015-09-01 13:56     ` Arnaldo Carvalho de Melo
                       ` (2 more replies)
  1 sibling, 3 replies; 27+ messages in thread
From: Adrian Hunter @ 2015-09-01 12:16 UTC (permalink / raw)
  To: Ingo Molnar, Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: linux-kernel, Andy Lutomirski, Masami Hiramatsu, Denys Vlasenko,
	Peter Zijlstra, Dave Hansen, Qiaowei Ren, H. Peter Anvin,
	Thomas Gleixner

On 01/09/15 11:54, Ingo Molnar wrote:
> 
> * Adrian Hunter <adrian.hunter@intel.com> wrote:
> 
>> Hi
>>
>> perf tools has a copy of the x86 instruction decoder for decoding
>> Intel PT. [...]
> 
> So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses 
> for kprobes et al - and the two versions already forked slightly:
> 
> -#include "inat.h"
> -#include "insn.h"
> +#include <asm/inat.h>
> +#include <asm/insn.h>
> 
> it would be nice to add a diff check to the perf build, and (non-fatally) warn 
> during the build if the two versions depart from each other?

I had a go and came up with this.  Arnaldo, Jiri any comments?

diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
index 240730d682c1..1b8a32de8504 100644
--- a/tools/perf/util/intel-pt-decoder/Build
+++ b/tools/perf/util/intel-pt-decoder/Build
@@ -6,6 +6,17 @@ inat_tables_maps = util/intel-pt-decoder/x86-opcode-map.txt
 $(OUTPUT)util/intel-pt-decoder/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
 	@$(call echo-cmd,gen)$(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@ || rm -f $@
 
-$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
+$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/intel-pt-insn-decoder.c util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
+	@test -d ../../arch/x86 && (( \
+	diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
+	diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
+	diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
+	diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
+	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
+	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
+	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
+	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )
+	$(call rule_mkdir)
+	$(call if_changed_dep,cc_o_c)
 
 CFLAGS_intel-pt-insn-decoder.o += -I$(OUTPUT)util/intel-pt-decoder -Wno-override-init



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01 12:10     ` Adrian Hunter
@ 2015-09-01 12:55       ` Ingo Molnar
  2015-09-01 15:13       ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2015-09-01 12:55 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: 平松雅巳 / HIRAMATU,MASAMI,
	Arnaldo Carvalho de Melo, linux-kernel@vger.kernel.org, Jiri Olsa,
	Andy Lutomirski, Denys Vlasenko, Peter Zijlstra, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner, Linus Torvalds


* Adrian Hunter <adrian.hunter@intel.com> wrote:

> > Agreed, what concerns me is that someone finds a bug and fixes one of them and
> > the other is not fixed.
> > 
> > I'll see the forked version and check if it can be merged into the kernel.
> 
> Ever since Linus complained about perf tools including kernel headers, I have 
> assumed we should have separate source code.  That email thread was not cc'ed to 
> a mailing list but here is a quote:
> 
> Em Sat, Jul 04, 2015 at 08:53:46AM -0700, Linus Torvalds escreveu:
>
> > So this is more fundamental, and looks like it's just due to perf abusing the 
> > kernel headers, and now that rbtree has rcu support ("rbtree: Make lockless 
> > searches non-fatal"), it gets tons of headers included that really don't work 
> > from user space.
> >
> > There might be other things going on, but the rbtree one seems to be a big 
> > one. I think perf needs to get its own rbtree header or something, instead of 
> > doing that insane "let's include random core kernel headers" thing.

Note that even plain copying and occasional back-merges isn't a bad solution: it's 
better than 'messy sharing' of code.

But we can also share code in a bit more organized fashion, and any of the two 
solutions I proposed solve these complications:

 - if we do the diff -u check warning during perf build then the forked versions
   won't stay forked for long. This is the simplest variant.

 - if we librarize this functionality into tools/lib/x86/decode/ (and make sure 
   it's a library that can be linked into the kernel) then we are back to shared 
   code.

The problem wasn't to share code per se, the problem was to share code in a messy 
way, without making it apparent that it's shared code: which made it easy to break 
the tools/perf build via harmless looking kernel side changes.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01 12:16   ` Adrian Hunter
@ 2015-09-01 13:56     ` Arnaldo Carvalho de Melo
  2015-09-01 13:59     ` Jiri Olsa
  2015-09-01 19:57     ` Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 13:56 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Jiri Olsa, linux-kernel, Andy Lutomirski,
	Masami Hiramatsu, Denys Vlasenko, Peter Zijlstra, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Em Tue, Sep 01, 2015 at 03:16:52PM +0300, Adrian Hunter escreveu:
> On 01/09/15 11:54, Ingo Molnar wrote:
> > * Adrian Hunter <adrian.hunter@intel.com> wrote:
> >> perf tools has a copy of the x86 instruction decoder for decoding
> >> Intel PT. [...]

> > So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses 
> > for kprobes et al - and the two versions already forked slightly:

> > -#include "inat.h"
> > -#include "insn.h"
> > +#include <asm/inat.h>
> > +#include <asm/insn.h>

> > it would be nice to add a diff check to the perf build, and (non-fatally) warn 
> > during the build if the two versions depart from each other?
 
> I had a go and came up with this.  Arnaldo, Jiri any comments?

It looks ok, but then, if the people doing the original work, i.e.
Masami, IIRC, manages to make these files something shared, then this
becomes moot, right?

We would go back to sharing stuff with the kernel, but this time around
we would be using something that everybody knows is being shared, which
doesn't elliminates the possibility that at some point changes made with
the kernel in mind would break the tools/ using code.

Perhaps it is better to keep copying what we want and introduce
infrastructure to check for differences and warn us as soon as possible
and do the copy, test if it doesn't break what we use, etc.

I.e. we wouldn't be putting any new burden on the "kernel people", i.e.
the burden of having to check that changes they make don't break
code living in tools/, nor any out-of-the-blue breakage for tools/
developers when changes are made on the kernel "side".

I.e. have something like what you did, but not limited to these intel-pt
decoder bits, we share more than that :-)

So, I would apply your patch and move forward, at least these intel-pt
bits would be covered, Ingo?

- Arnaldo
 
> diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
> index 240730d682c1..1b8a32de8504 100644
> --- a/tools/perf/util/intel-pt-decoder/Build
> +++ b/tools/perf/util/intel-pt-decoder/Build
> @@ -6,6 +6,17 @@ inat_tables_maps = util/intel-pt-decoder/x86-opcode-map.txt
>  $(OUTPUT)util/intel-pt-decoder/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
>  	@$(call echo-cmd,gen)$(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@ || rm -f $@
>  
> -$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> +$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/intel-pt-insn-decoder.c util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> +	@test -d ../../arch/x86 && (( \
> +	diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
> +	diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
> +	diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
> +	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )
> +	$(call rule_mkdir)
> +	$(call if_changed_dep,cc_o_c)
>  
>  CFLAGS_intel-pt-insn-decoder.o += -I$(OUTPUT)util/intel-pt-decoder -Wno-override-init
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01 12:16   ` Adrian Hunter
  2015-09-01 13:56     ` Arnaldo Carvalho de Melo
@ 2015-09-01 13:59     ` Jiri Olsa
  2015-09-01 14:55       ` Arnaldo Carvalho de Melo
  2015-09-01 19:57     ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2015-09-01 13:59 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel,
	Andy Lutomirski, Masami Hiramatsu, Denys Vlasenko, Peter Zijlstra,
	Dave Hansen, Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

On Tue, Sep 01, 2015 at 03:16:52PM +0300, Adrian Hunter wrote:
> On 01/09/15 11:54, Ingo Molnar wrote:
> > 
> > * Adrian Hunter <adrian.hunter@intel.com> wrote:
> > 
> >> Hi
> >>
> >> perf tools has a copy of the x86 instruction decoder for decoding
> >> Intel PT. [...]
> > 
> > So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses 
> > for kprobes et al - and the two versions already forked slightly:
> > 
> > -#include "inat.h"
> > -#include "insn.h"
> > +#include <asm/inat.h>
> > +#include <asm/insn.h>
> > 
> > it would be nice to add a diff check to the perf build, and (non-fatally) warn 
> > during the build if the two versions depart from each other?
> 
> I had a go and came up with this.  Arnaldo, Jiri any comments?
> 
> diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
> index 240730d682c1..1b8a32de8504 100644
> --- a/tools/perf/util/intel-pt-decoder/Build
> +++ b/tools/perf/util/intel-pt-decoder/Build
> @@ -6,6 +6,17 @@ inat_tables_maps = util/intel-pt-decoder/x86-opcode-map.txt
>  $(OUTPUT)util/intel-pt-decoder/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
>  	@$(call echo-cmd,gen)$(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@ || rm -f $@
>  
> -$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> +$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/intel-pt-insn-decoder.c util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> +	@test -d ../../arch/x86 && (( \
> +	diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
> +	diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
> +	diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
> +	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )
> +	$(call rule_mkdir)
> +	$(call if_changed_dep,cc_o_c)
>  

seems ok, but it might be nicer to have a make function for that
so we could use it in other places like rbtree.h

jirka

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01 13:59     ` Jiri Olsa
@ 2015-09-01 14:55       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 14:55 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Adrian Hunter, Ingo Molnar, linux-kernel, Andy Lutomirski,
	Masami Hiramatsu, Denys Vlasenko, Peter Zijlstra, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Em Tue, Sep 01, 2015 at 03:59:16PM +0200, Jiri Olsa escreveu:
> On Tue, Sep 01, 2015 at 03:16:52PM +0300, Adrian Hunter wrote:
> > On 01/09/15 11:54, Ingo Molnar wrote:
> > > it would be nice to add a diff check to the perf build, and (non-fatally) warn 
> > > during the build if the two versions depart from each other?

> > I had a go and came up with this.  Arnaldo, Jiri any comments?

> > diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
> > +	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
> > +	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
> > +	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
> > +	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )
> > +	$(call rule_mkdir)
> > +	$(call if_changed_dep,cc_o_c)
 
> seems ok, but it might be nicer to have a make function for that
> so we could use it in other places like rbtree.h

That will pose some more hurdles, as there are things like
EXPORT_SYMBOL() and RCU stuff that are ok in the kernel sources, but not
in the tools/ copy...
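
For illustration only (the snippet below is hypothetical and not taken from
any of the files discussed here), these are the kind of kernel-only
constructs that build fine in the tree but mean nothing in a plain
userspace build, so a byte-for-byte shared file needs stubs or edits on
the tools/ side:

/* Hypothetical example of a file shared verbatim with the kernel */
#include <linux/export.h>	/* kernel-only header */
#include <linux/rcupdate.h>	/* RCU, also kernel-only */

int shared_helper(int x)
{
	int v;

	rcu_read_lock();	/* no meaning in a userspace tools build */
	v = x * 2;
	rcu_read_unlock();
	return v;
}
EXPORT_SYMBOL_GPL(shared_helper);	/* would need a stub in the tools build */

IIRC the tools already carry small stubs for some of these (under
tools/include and friends), but every new kernel-ism in shared code is one
more thing to stub or strip.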

I.e. fully sharing will put a new burden for kernel developers working
on the to-be-shared code, which is something that is not desired.

I was ok with, hey, tools/ broke because you're sharing code with the
kernel, as probably a tools/ developer would notice that and fix things,
but Linus advised against that.

- Arnaldo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01 12:10     ` Adrian Hunter
  2015-09-01 12:55       ` Ingo Molnar
@ 2015-09-01 15:13       ` 平松雅巳 / HIRAMATU,MASAMI
  1 sibling, 0 replies; 27+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-01 15:13 UTC (permalink / raw)
  To: 'Adrian Hunter'
  Cc: 'Ingo Molnar', Arnaldo Carvalho de Melo,
	linux-kernel@vger.kernel.org, Jiri Olsa, Andy Lutomirski,
	Denys Vlasenko, Peter Zijlstra, Dave Hansen, Qiaowei Ren,
	H. Peter Anvin, Thomas Gleixner

> From: Adrian Hunter [mailto:adrian.hunter@intel.com]
> 
> On 01/09/15 14:38, 平松雅巳 / HIRAMATU,MASAMI wrote:
> >> From: Ingo Molnar [mailto:mingo.kernel.org@gmail.com] On Behalf Of Ingo Molnar
> >>
> >>
> >> * Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>
> >>> Hi
> >>>
> >>> perf tools has a copy of the x86 instruction decoder for decoding
> >>> Intel PT. [...]
> >>
> >> So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses
> >> for kprobes et al - and the two versions already forked slightly:
> >>
> >> -#include "inat.h"
> >> -#include "insn.h"
> >> +#include <asm/inat.h>
> >> +#include <asm/insn.h>
> >>
> >> it would be nice to add a diff check to the perf build, and (non-fatally) warn
> >> during the build if the two versions depart from each other?
> >>
> >> This will make sure the two versions are fully in sync in the long run as well.
> >>
> >> ( Alternatively we could perhaps also librarize it into tools/lib/, and teach the
> >>   kernel build to pick that one up? )
> >
> > Agreed, what concerns me is that someone finds a bug and fixes one of them and
> > the other is not fixed.
> >
> > I'll see the forked version and check if it can be merged into the kernel.
> 
> Ever since Linus complained about perf tools including kernel headers, I
> have assumed we should have separate source code.  That email thread was not
> cc'ed to a mailing list but here is a quote:
> 
> Em Sat, Jul 04, 2015 at 08:53:46AM -0700, Linus Torvalds escreveu:
> > So this is more fundamental, and looks like it's just due to perf
> > abusing the kernel headers, and now that rbtree has rcu support
> > ("rbtree: Make lockless searches non-fatal"), it gets tons of headers
> > included that really don't work from user space.
> >
> > There might be other things going on, but the rbtree one seems to be a
> > big one. I think perf needs to get its own rbtree header or something,
> > instead of doing that insane "let's include random core kernel
> > headers" thing.


OK, now I see what happened...
Hmm, so at this point, I'll just port the test to arch/x86/tools/, since the
kernel should have that.

Thanks,




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01 12:16   ` Adrian Hunter
  2015-09-01 13:56     ` Arnaldo Carvalho de Melo
  2015-09-01 13:59     ` Jiri Olsa
@ 2015-09-01 19:57     ` Arnaldo Carvalho de Melo
  2015-09-02  5:59       ` Jiri Olsa
  2 siblings, 1 reply; 27+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-01 19:57 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Jiri Olsa, linux-kernel, Andy Lutomirski,
	Masami Hiramatsu, Denys Vlasenko, Peter Zijlstra, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

Em Tue, Sep 01, 2015 at 03:16:52PM +0300, Adrian Hunter escreveu:
> On 01/09/15 11:54, Ingo Molnar wrote:
> > 
> > * Adrian Hunter <adrian.hunter@intel.com> wrote:
> > 
> >> Hi
> >>
> >> perf tools has a copy of the x86 instruction decoder for decoding
> >> Intel PT. [...]
> > 
> > So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses 
> > for kprobes et al - and the two versions already forked slightly:
> > 
> > -#include "inat.h"
> > -#include "insn.h"
> > +#include <asm/inat.h>
> > +#include <asm/insn.h>
> > 
> > it would be nice to add a diff check to the perf build, and (non-fatally) warn 
> > during the build if the two versions depart from each other?
> 
> I had a go and came up with this.  Arnaldo, Jiri any comments?

So, I was going to try and merge this series, can you please collect the
Acks by Masami and Jiri and resubmit?

I'd say there is no need to hold this up just to get a build function to use
with it; the test below should do the trick _for this specific instance_, and
then, after we get this in, you should use it as the initial use case for
adding the build function, d'accord?

Jiri, are you ok with this?

- Arnaldo
 
> diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
> index 240730d682c1..1b8a32de8504 100644
> --- a/tools/perf/util/intel-pt-decoder/Build
> +++ b/tools/perf/util/intel-pt-decoder/Build
> @@ -6,6 +6,17 @@ inat_tables_maps = util/intel-pt-decoder/x86-opcode-map.txt
>  $(OUTPUT)util/intel-pt-decoder/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
>  	@$(call echo-cmd,gen)$(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@ || rm -f $@
>  
> -$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> +$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/intel-pt-insn-decoder.c util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> +	@test -d ../../arch/x86 && (( \
> +	diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
> +	diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
> +	diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
> +	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
> +	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )
> +	$(call rule_mkdir)
> +	$(call if_changed_dep,cc_o_c)
>  
>  CFLAGS_intel-pt-insn-decoder.o += -I$(OUTPUT)util/intel-pt-decoder -Wno-override-init
> 
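
For anyone who wants to check the copies by hand rather than at build time, the
same comparisons can be run as a small standalone script. A minimal sketch,
assuming it is run from tools/perf inside a full kernel source tree (the diff
commands and paths are taken verbatim from the rule above):

	#!/bin/sh
	# Compare the perf copies of the x86 instruction decoder against the
	# kernel originals, ignoring the #include lines that are known to differ.
	kdir=../..
	pdir=util/intel-pt-decoder
	rc=0
	diff -B -I'^#include' $pdir/insn.c $kdir/arch/x86/lib/insn.c || rc=1
	diff -B -I'^#include' $pdir/inat.c $kdir/arch/x86/lib/inat.c || rc=1
	diff -B $pdir/x86-opcode-map.txt $kdir/arch/x86/lib/x86-opcode-map.txt || rc=1
	diff -B $pdir/gen-insn-attr-x86.awk $kdir/arch/x86/tools/gen-insn-attr-x86.awk || rc=1
	diff -B -I'^#include' $pdir/insn.h $kdir/arch/x86/include/asm/insn.h || rc=1
	diff -B -I'^#include' $pdir/inat.h $kdir/arch/x86/include/asm/inat.h || rc=1
	diff -B -I'^#include' $pdir/inat_types.h $kdir/arch/x86/include/asm/inat_types.h || rc=1
	[ $rc -eq 0 ] || echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2
	exit $rc

Unlike the build rule, the diff output is not discarded here, so a mismatch also
shows exactly which lines have diverged.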

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-01 19:57     ` Arnaldo Carvalho de Melo
@ 2015-09-02  5:59       ` Jiri Olsa
  2015-09-02  6:41         ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Olsa @ 2015-09-02  5:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Adrian Hunter, Ingo Molnar, linux-kernel, Andy Lutomirski,
	Masami Hiramatsu, Denys Vlasenko, Peter Zijlstra, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

On Tue, Sep 01, 2015 at 04:57:16PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Sep 01, 2015 at 03:16:52PM +0300, Adrian Hunter escreveu:
> > On 01/09/15 11:54, Ingo Molnar wrote:
> > > 
> > > * Adrian Hunter <adrian.hunter@intel.com> wrote:
> > > 
> > >> Hi
> > >>
> > >> perf tools has a copy of the x86 instruction decoder for decoding
> > >> Intel PT. [...]
> > > 
> > > So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses 
> > > for kprobes et al - and the two versions already forked slightly:
> > > 
> > > -#include "inat.h"
> > > -#include "insn.h"
> > > +#include <asm/inat.h>
> > > +#include <asm/insn.h>
> > > 
> > > it would be nice to add a diff check to the perf build, and (non-fatally) warn 
> > > during the build if the two versions depart from each other?
> > 
> > I had a go and came up with this.  Arnaldo, Jiri any comments?
> 
> So, I was going to try and merge this series, can you please collect the
> Acks by Masami and Jiri and resubmit?
> 
> I'd say no need to stop this just to get a build function to use with
> this, the test below should do the trick _for this specific instance_
> and then, after we get this, you should use it as the initial usecase
> for adding the build function, d'accord?
> 
> Jiri, are you ok with this?

sure, np you can use my ack

jirka

> 
> - Arnaldo
>  
> > diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
> > index 240730d682c1..1b8a32de8504 100644
> > --- a/tools/perf/util/intel-pt-decoder/Build
> > +++ b/tools/perf/util/intel-pt-decoder/Build
> > @@ -6,6 +6,17 @@ inat_tables_maps = util/intel-pt-decoder/x86-opcode-map.txt
> >  $(OUTPUT)util/intel-pt-decoder/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
> >  	@$(call echo-cmd,gen)$(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@ || rm -f $@
> >  
> > -$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> > +$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/intel-pt-insn-decoder.c util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> > +	@test -d ../../arch/x86 && (( \
> > +	diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
> > +	diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
> > +	diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
> > +	diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
> > +	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
> > +	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
> > +	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
> > +	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )
> > +	$(call rule_mkdir)
> > +	$(call if_changed_dep,cc_o_c)
> >  
> >  CFLAGS_intel-pt-insn-decoder.o += -I$(OUTPUT)util/intel-pt-decoder -Wno-override-init
> > 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-02  5:59       ` Jiri Olsa
@ 2015-09-02  6:41         ` 平松雅巳 / HIRAMATU,MASAMI
  2015-09-02  7:39           ` Ingo Molnar
  0 siblings, 1 reply; 27+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-02  6:41 UTC (permalink / raw)
  To: 'Jiri Olsa', Arnaldo Carvalho de Melo
  Cc: Adrian Hunter, Ingo Molnar, linux-kernel@vger.kernel.org,
	Andy Lutomirski, Denys Vlasenko, Peter Zijlstra, Dave Hansen,
	Qiaowei Ren, H. Peter Anvin, Thomas Gleixner

> From: Jiri Olsa [mailto:jolsa@redhat.com]
> 
> On Tue, Sep 01, 2015 at 04:57:16PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Sep 01, 2015 at 03:16:52PM +0300, Adrian Hunter escreveu:
> > > On 01/09/15 11:54, Ingo Molnar wrote:
> > > >
> > > > * Adrian Hunter <adrian.hunter@intel.com> wrote:
> > > >
> > > >> Hi
> > > >>
> > > >> perf tools has a copy of the x86 instruction decoder for decoding
> > > >> Intel PT. [...]
> > > >
> > > > So that's the arch/x86/lib/insn.c instruction length decoder that the kernel uses
> > > > for kprobes et al - and the two versions already forked slightly:
> > > >
> > > > -#include "inat.h"
> > > > -#include "insn.h"
> > > > +#include <asm/inat.h>
> > > > +#include <asm/insn.h>
> > > >
> > > > it would be nice to add a diff check to the perf build, and (non-fatally) warn
> > > > during the build if the two versions depart from each other?
> > >
> > > I had a go and came up with this.  Arnaldo, Jiri any comments?
> >
> > So, I was going to try and merge this series, can you please collect the
> > Acks by Masami and Jiri and resubmit?
> >
> > I'd say no need to stop this just to get a build function to use with
> > this, the test below should do the trick _for this specific instance_
> > and then, after we get this, you should use it as the initial usecase
> > for adding the build function, d'accord?
> >
> > Jiri, are you ok with this?
> 
> sure, np you can use my ack

I'm also OK with this patch. My only concern is whether it is OK for Adrian too.
Since this check requires all the copied code to stay an exact copy (not modified
any more), if we ever want a different instruction decoding routine we will have
to break the test anyway.

Thank you,

> 
> jirka
> 
> >
> > - Arnaldo
> >
> > > diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
> > > index 240730d682c1..1b8a32de8504 100644
> > > --- a/tools/perf/util/intel-pt-decoder/Build
> > > +++ b/tools/perf/util/intel-pt-decoder/Build
> > > @@ -6,6 +6,17 @@ inat_tables_maps = util/intel-pt-decoder/x86-opcode-map.txt
> > >  $(OUTPUT)util/intel-pt-decoder/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
> > >  	@$(call echo-cmd,gen)$(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@ || rm -f $@
> > >
> > > -$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> > > +$(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/intel-pt-insn-decoder.c util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
> > > +	@test -d ../../arch/x86 && (( \
> > > +	diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
> > > +	diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
> > > +	diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
> > > +	diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
> > > +	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
> > > +	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
> > > +	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
> > > +	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )
> > > +	$(call rule_mkdir)
> > > +	$(call if_changed_dep,cc_o_c)
> > >
> > >  CFLAGS_intel-pt-insn-decoder.o += -I$(OUTPUT)util/intel-pt-decoder -Wno-override-init
> > >

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-02  6:41         ` 平松雅巳 / HIRAMATU,MASAMI
@ 2015-09-02  7:39           ` Ingo Molnar
  2015-09-02 10:27             ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2015-09-02  7:39 UTC (permalink / raw)
  To: 平松雅巳 / HIRAMATU,MASAMI
  Cc: 'Jiri Olsa', Arnaldo Carvalho de Melo, Adrian Hunter,
	linux-kernel@vger.kernel.org, Andy Lutomirski, Denys Vlasenko,
	Peter Zijlstra, Dave Hansen, Qiaowei Ren, H. Peter Anvin,
	Thomas Gleixner


* 平松雅巳 / HIRAMATU,MASAMI <masami.hiramatsu.pt@hitachi.com> wrote:

> > sure, np you can use my ack
> 
> I'm also OK for this patch. I just concern that is OK for Adrian too? 
> Since this ensures all the copied code should be dead copy (not modified anymore),
> if we want a different instruction decoding routine, we have to break the test
> anyway.

So the idea would be to not break anything, only to warn in a non-fatal fashion.
This protects against unbisectable universes being created via simple git merges 
where updates meet but testing of tooling isn't done.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions
  2015-09-02  7:39           ` Ingo Molnar
@ 2015-09-02 10:27             ` 平松雅巳 / HIRAMATU,MASAMI
  0 siblings, 0 replies; 27+ messages in thread
From: 平松雅巳 / HIRAMATU,MASAMI @ 2015-09-02 10:27 UTC (permalink / raw)
  To: 'Ingo Molnar'
  Cc: 'Jiri Olsa', Arnaldo Carvalho de Melo, Adrian Hunter,
	linux-kernel@vger.kernel.org, Andy Lutomirski, Denys Vlasenko,
	Peter Zijlstra, Dave Hansen, Qiaowei Ren, H. Peter Anvin,
	Thomas Gleixner


> From: Ingo Molnar [mailto:mingo.kernel.org@gmail.com] On Behalf Of Ingo Molnar
> 
> > > sure, np you can use my ack
> >
> > I'm also OK for this patch. I just concern that is OK for Adrian too?
> > Since this ensures all the copied code should be dead copy (not modified anymore),
> > if we want a different instruction decoding routine, we have to break the test
> > anyway.
> 
> So the idea would be to not break anything, only warn in a non-fatal question.
> This protects against unbisectable universes being created via simple git merges
> where updates meet but testing of tooling isn't done.

I see, so I give my ack :)

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thank you!

> 
> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2015-09-02 10:27 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-31 13:58 [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Adrian Hunter
2015-08-31 13:58 ` [PATCH 1/4] perf tools: Add a test for decoding of " Adrian Hunter
2015-09-01  0:18   ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-01  8:17     ` Adrian Hunter
2015-09-01 11:03       ` 平松雅巳 / HIRAMATU,MASAMI
2015-08-31 13:58 ` [PATCH 2/4] x86/insn: perf tools: Pedantically tweak opcode map for MPX instructions Adrian Hunter
2015-08-31 14:48   ` Arnaldo Carvalho de Melo
2015-08-31 13:58 ` [PATCH 3/4] x86/insn: perf tools: Add new SHA instructions Adrian Hunter
2015-08-31 14:50   ` Arnaldo Carvalho de Melo
2015-08-31 18:58     ` Adrian Hunter
2015-09-01  0:08   ` 平松雅巳 / HIRAMATU,MASAMI
2015-08-31 13:58 ` [PATCH 4/4] x86/insn: perf tools: Add new memory instructions Adrian Hunter
2015-08-31 14:43 ` [PATCH 0/4] x86/insn: perf tools: Add a few new x86 instructions Arnaldo Carvalho de Melo
2015-09-01  8:54 ` Ingo Molnar
2015-09-01 11:38   ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-01 12:10     ` Adrian Hunter
2015-09-01 12:55       ` Ingo Molnar
2015-09-01 15:13       ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-01 12:16   ` Adrian Hunter
2015-09-01 13:56     ` Arnaldo Carvalho de Melo
2015-09-01 13:59     ` Jiri Olsa
2015-09-01 14:55       ` Arnaldo Carvalho de Melo
2015-09-01 19:57     ` Arnaldo Carvalho de Melo
2015-09-02  5:59       ` Jiri Olsa
2015-09-02  6:41         ` 平松雅巳 / HIRAMATU,MASAMI
2015-09-02  7:39           ` Ingo Molnar
2015-09-02 10:27             ` 平松雅巳 / HIRAMATU,MASAMI

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).