* [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions
@ 2024-05-02 10:58 Adrian Hunter
2024-05-02 10:58 ` [PATCH 01/10] x86/insn: Add Key Locker instructions to the opcode map Adrian Hunter
` (10 more replies)
0 siblings, 11 replies; 18+ messages in thread
From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw)
To: linux-kernel
Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov,
Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86,
Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers,
linux-perf-users
Hi
The x86 instruction decoder is used not only for decoding kernel
instructions. It is also used by perf uprobes (user space probes) and by
perf tools Intel Processor Trace decoding. Consequently, it also needs to
support instructions executed by user space.
Note that there are two copies of the instruction decoder, one for the
kernel and one for tools. Keeping the two separate is a deliberate policy
to prevent changes from breaking builds.
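As background, both copies expose the same small API. A minimal sketch of
how a caller drives the decoder (illustrative only, error handling trimmed;
it assumes the insn_decode() / INSN_MODE_64 interface that both copies
provide):

	#include <asm/insn.h>

	static int decode_one(const unsigned char *buf, int buf_len)
	{
		struct insn insn;
		int ret;

		/* Parse prefixes, opcode, ModRM, SIB, displacement, immediate */
		ret = insn_decode(&insn, buf, buf_len, INSN_MODE_64);
		if (ret < 0)
			return ret;		/* could not decode */

		return insn.length;		/* total instruction length in bytes */
	}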
Changes are based upon documents:
Intel® Advanced Performance Extensions (Intel® APX)
Architecture Specification
April, 2024 Revision 4.0 355828-004
Intel® Architecture Instruction Set Extensions
and Future Features Programming Reference
March 2024 319433-052
The patches depend on a patch from Chang S. Bae that has already been
submitted as part of the Key Locker patch set. Nevertheless, that patch is
included in this patch set as well.
The final patch updates the perf tools "x86 instruction decoder - new
instructions" test, which provides test coverage for the new instructions.
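For reference, the updated test can be run with something like:

  $ perf test -v 'x86 instruction decoder'

(test selection is by substring match, so the exact name may vary by perf
version). It re-decodes the pre-generated instruction byte sequences in
insn-x86-dat-32.c and insn-x86-dat-64.c and checks that the decoder
produces the expected results.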
Adrian Hunter (9):
x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map
x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS
x86/insn: Add misc new Intel instructions
x86/insn: Add support for REX2 prefix to the instruction decoder logic
x86/insn: Add support for REX2 prefix to the instruction decoder opcode map
x86/insn: Add support for APX EVEX to the instruction decoder logic
x86/insn: Add support for APX EVEX instructions to the opcode map
perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder
perf tests: Add APX and other new instructions to x86 instruction decoder test
Chang S. Bae (1):
x86/insn: Add Key Locker instructions to the opcode map
arch/x86/include/asm/inat.h | 17 +-
arch/x86/include/asm/insn.h | 32 +-
arch/x86/lib/insn.c | 29 +
arch/x86/lib/x86-opcode-map.txt | 315 ++++--
arch/x86/tools/gen-insn-attr-x86.awk | 15 +-
tools/arch/x86/include/asm/inat.h | 17 +-
tools/arch/x86/include/asm/insn.h | 32 +-
tools/arch/x86/lib/insn.c | 29 +
tools/arch/x86/lib/x86-opcode-map.txt | 315 ++++--
tools/arch/x86/tools/gen-insn-attr-x86.awk | 15 +-
tools/perf/arch/x86/tests/insn-x86-dat-32.c | 116 +++
tools/perf/arch/x86/tests/insn-x86-dat-64.c | 1026 ++++++++++++++++++++
tools/perf/arch/x86/tests/insn-x86-dat-src.c | 597 ++++++++++++
.../util/intel-pt-decoder/intel-pt-insn-decoder.c | 9 +
14 files changed, 2370 insertions(+), 194 deletions(-)
Regards
Adrian
^ permalink raw reply [flat|nested] 18+ messages in thread* [PATCH 01/10] x86/insn: Add Key Locker instructions to the opcode map 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-02 10:58 ` [PATCH 02/10] x86/insn: Fix PUSH instruction in x86 instruction decoder " Adrian Hunter ` (9 subsequent siblings) 10 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users From: "Chang S. Bae" <chang.seok.bae@intel.com> The x86 instruction decoder needs to know these new instructions that are going to be used in the crypto library as well as the x86 core code. Add the following: LOADIWKEY: Load a CPU-internal wrapping key. ENCODEKEY128: Wrap a 128-bit AES key to a key handle. ENCODEKEY256: Wrap a 256-bit AES key to a key handle. AESENC128KL: Encrypt a 128-bit block of data using a 128-bit AES key indicated by a key handle. AESENC256KL: Encrypt a 128-bit block of data using a 256-bit AES key indicated by a key handle. AESDEC128KL: Decrypt a 128-bit block of data using a 128-bit AES key indicated by a key handle. AESDEC256KL: Decrypt a 128-bit block of data using a 256-bit AES key indicated by a key handle. AESENCWIDE128KL: Encrypt 8 128-bit blocks of data using a 128-bit AES key indicated by a key handle. AESENCWIDE256KL: Encrypt 8 128-bit blocks of data using a 256-bit AES key indicated by a key handle. AESDECWIDE128KL: Decrypt 8 128-bit blocks of data using a 128-bit AES key indicated by a key handle. AESDECWIDE256KL: Decrypt 8 128-bit blocks of data using a 256-bit AES key indicated by a key handle. The detail can be found in Intel Software Developer Manual. Signed-off-by: Chang S. 
Bae <chang.seok.bae@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- arch/x86/lib/x86-opcode-map.txt | 11 +++++++---- tools/arch/x86/lib/x86-opcode-map.txt | 11 +++++++---- 2 files changed, 14 insertions(+), 8 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 12af572201a2..c94988d5130d 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -800,11 +800,12 @@ cb: sha256rnds2 Vdq,Wdq | vrcp28ss/d Vx,Hx,Wx (66),(ev) cc: sha256msg1 Vdq,Wdq | vrsqrt28ps/d Vx,Wx (66),(ev) cd: sha256msg2 Vdq,Wdq | vrsqrt28ss/d Vx,Hx,Wx (66),(ev) cf: vgf2p8mulb Vx,Wx (66) +d8: AESENCWIDE128KL Qpi (F3),(000),(00B) | AESENCWIDE256KL Qpi (F3),(000),(10B) | AESDECWIDE128KL Qpi (F3),(000),(01B) | AESDECWIDE256KL Qpi (F3),(000),(11B) db: VAESIMC Vdq,Wdq (66),(v1) -dc: vaesenc Vx,Hx,Wx (66) -dd: vaesenclast Vx,Hx,Wx (66) -de: vaesdec Vx,Hx,Wx (66) -df: vaesdeclast Vx,Hx,Wx (66) +dc: vaesenc Vx,Hx,Wx (66) | LOADIWKEY Vx,Hx (F3) | AESENC128KL Vpd,Qpi (F3) +dd: vaesenclast Vx,Hx,Wx (66) | AESDEC128KL Vpd,Qpi (F3) +de: vaesdec Vx,Hx,Wx (66) | AESENC256KL Vpd,Qpi (F3) +df: vaesdeclast Vx,Hx,Wx (66) | AESDEC256KL Vpd,Qpi (F3) f0: MOVBE Gy,My | MOVBE Gw,Mw (66) | CRC32 Gd,Eb (F2) | CRC32 Gd,Eb (66&F2) f1: MOVBE My,Gy | MOVBE Mw,Gw (66) | CRC32 Gd,Ey (F2) | CRC32 Gd,Ew (66&F2) f2: ANDN Gy,By,Ey (v) @@ -814,6 +815,8 @@ f6: ADCX Gy,Ey (66) | ADOX Gy,Ey (F3) | MULX By,Gy,rDX,Ey (F2),(v) | WRSSD/Q My, f7: BEXTR Gy,Ey,By (v) | SHLX Gy,Ey,By (66),(v) | SARX Gy,Ey,By (F3),(v) | SHRX Gy,Ey,By (F2),(v) f8: MOVDIR64B Gv,Mdqq (66) | ENQCMD Gv,Mdqq (F2) | ENQCMDS Gv,Mdqq (F3) f9: MOVDIRI My,Gy +fa: ENCODEKEY128 Ew,Ew (F3) +fb: ENCODEKEY256 Ew,Ew (F3) EndTable Table: 3-byte opcode 2 (0x0f 0x3a) diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt index 12af572201a2..c94988d5130d 100644 --- a/tools/arch/x86/lib/x86-opcode-map.txt +++ b/tools/arch/x86/lib/x86-opcode-map.txt @@ -800,11 +800,12 @@ cb: sha256rnds2 Vdq,Wdq | vrcp28ss/d Vx,Hx,Wx (66),(ev) cc: sha256msg1 Vdq,Wdq | vrsqrt28ps/d Vx,Wx (66),(ev) cd: sha256msg2 Vdq,Wdq | vrsqrt28ss/d Vx,Hx,Wx (66),(ev) cf: vgf2p8mulb Vx,Wx (66) +d8: AESENCWIDE128KL Qpi (F3),(000),(00B) | AESENCWIDE256KL Qpi (F3),(000),(10B) | AESDECWIDE128KL Qpi (F3),(000),(01B) | AESDECWIDE256KL Qpi (F3),(000),(11B) db: VAESIMC Vdq,Wdq (66),(v1) -dc: vaesenc Vx,Hx,Wx (66) -dd: vaesenclast Vx,Hx,Wx (66) -de: vaesdec Vx,Hx,Wx (66) -df: vaesdeclast Vx,Hx,Wx (66) +dc: vaesenc Vx,Hx,Wx (66) | LOADIWKEY Vx,Hx (F3) | AESENC128KL Vpd,Qpi (F3) +dd: vaesenclast Vx,Hx,Wx (66) | AESDEC128KL Vpd,Qpi (F3) +de: vaesdec Vx,Hx,Wx (66) | AESENC256KL Vpd,Qpi (F3) +df: vaesdeclast Vx,Hx,Wx (66) | AESDEC256KL Vpd,Qpi (F3) f0: MOVBE Gy,My | MOVBE Gw,Mw (66) | CRC32 Gd,Eb (F2) | CRC32 Gd,Eb (66&F2) f1: MOVBE My,Gy | MOVBE Mw,Gw (66) | CRC32 Gd,Ey (F2) | CRC32 Gd,Ew (66&F2) f2: ANDN Gy,By,Ey (v) @@ -814,6 +815,8 @@ f6: ADCX Gy,Ey (66) | ADOX Gy,Ey (F3) | MULX By,Gy,rDX,Ey (F2),(v) | WRSSD/Q My, f7: BEXTR Gy,Ey,By (v) | SHLX Gy,Ey,By (66),(v) | SARX Gy,Ey,By (F3),(v) | SHRX Gy,Ey,By (F2),(v) f8: MOVDIR64B Gv,Mdqq (66) | ENQCMD Gv,Mdqq (F2) | ENQCMDS Gv,Mdqq (F3) f9: MOVDIRI My,Gy +fa: ENCODEKEY128 Ew,Ew (F3) +fb: ENCODEKEY256 Ew,Ew (F3) EndTable Table: 3-byte opcode 2 (0x0f 0x3a) -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
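For context on how several instructions can share one opcode byte in these
tables (as in the dc: row above), note that the generated attribute tables
keep a separate variant per last legacy prefix. Roughly, the lookup in
insn_get_opcode() is:

	pfx_id = insn_last_prefix_id(insn);	/* none / 66 / F3 / F2 */
	insn->attr = inat_get_escape_attribute(op, pfx_id, esc_attr);

so the (66) and (F3) entries on the same row land in different variant
tables and do not clash.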
* [PATCH 02/10] x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter 2024-05-02 10:58 ` [PATCH 01/10] x86/insn: Add Key Locker instructions to the opcode map Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-02 13:41 ` Arnaldo Carvalho de Melo 2024-05-02 10:58 ` [PATCH 03/10] x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS Adrian Hunter ` (8 subsequent siblings) 10 siblings, 1 reply; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Opcode 0x68 PUSH instruction is currently defined as 64-bit operand size only i.e. (d64). That was based on Intel SDM Opcode Map. However that is contradicted by the Instruction Set Reference section for PUSH in the same manual. Remove 64-bit operand size only annotation from opcode 0x68 PUSH instruction. Example: $ cat pushw.s .global _start .text _start: pushw $0x1234 mov $0x1,%eax # system call number (sys_exit) int $0x80 $ as -o pushw.o pushw.s $ ld -s -o pushw pushw.o $ objdump -d pushw | tail -4 0000000000401000 <.text>: 401000: 66 68 34 12 pushw $0x1234 401004: b8 01 00 00 00 mov $0x1,%eax 401009: cd 80 int $0x80 $ perf record -e intel_pt//u ./pushw [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.014 MB perf.data ] Before: $ perf script --insn-trace=disasm Warning: 1 instruction trace errors pushw 10349 [000] 10586.869237014: 401000 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) pushw $0x1234 pushw 10349 [000] 10586.869237014: 401006 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %al, (%rax) pushw 10349 [000] 10586.869237014: 401008 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %cl, %ch pushw 10349 [000] 10586.869237014: 40100a [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb $0x2e, (%rax) instruction trace error type 1 time 10586.869237224 cpu 0 pid 10349 tid 10349 ip 0x40100d code 6: Trace doesn't match instruction After: $ perf script --insn-trace=disasm pushw 10349 [000] 10586.869237014: 401000 [unknown] (./pushw) pushw $0x1234 pushw 10349 [000] 10586.869237014: 401004 [unknown] (./pushw) movl $1, %eax Fixes: eb13296cfaf6 ("x86: Instruction decoder API") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- arch/x86/lib/x86-opcode-map.txt | 2 +- tools/arch/x86/lib/x86-opcode-map.txt | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index c94988d5130d..4ea2e6adb477 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -148,7 +148,7 @@ AVXcode: 65: SEG=GS (Prefix) 66: Operand-Size (Prefix) 67: Address-Size (Prefix) -68: PUSH Iz (d64) +68: PUSH Iz 69: IMUL Gv,Ev,Iz 6a: PUSH Ib (d64) 6b: IMUL Gv,Ev,Ib diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt index c94988d5130d..4ea2e6adb477 100644 --- a/tools/arch/x86/lib/x86-opcode-map.txt +++ 
b/tools/arch/x86/lib/x86-opcode-map.txt @@ -148,7 +148,7 @@ AVXcode: 65: SEG=GS (Prefix) 66: Operand-Size (Prefix) 67: Address-Size (Prefix) -68: PUSH Iz (d64) +68: PUSH Iz 69: IMUL Gv,Ev,Iz 6a: PUSH Ib (d64) 6b: IMUL Gv,Ev,Ib -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
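To spell out the length miscalculation behind the Before/After traces
above: pushw $0x1234 is 66 68 34 12, i.e.

  correct:    prefix (1) + opcode (1) + imm16 (2) = 4 bytes, next ip 0x401004
  with (d64): the 66 prefix no longer shrinks the immediate, so
              prefix (1) + opcode (1) + imm32 (4) = 6 bytes, next ip 0x401006

which is why the Before trace resumes decoding at 0x401006, in the middle
of the following mov instruction.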
* Re: [PATCH 02/10] x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map 2024-05-02 10:58 ` [PATCH 02/10] x86/insn: Fix PUSH instruction in x86 instruction decoder " Adrian Hunter @ 2024-05-02 13:41 ` Arnaldo Carvalho de Melo 2024-05-02 19:11 ` Adrian Hunter 0 siblings, 1 reply; 18+ messages in thread From: Arnaldo Carvalho de Melo @ 2024-05-02 13:41 UTC (permalink / raw) To: Adrian Hunter Cc: linux-kernel, Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users On Thu, May 02, 2024 at 01:58:45PM +0300, Adrian Hunter wrote: > The x86 instruction decoder is used not only for decoding kernel > instructions. It is also used by perf uprobes (user space probes) and by > perf tools Intel Processor Trace decoding. Consequently, it needs to > support instructions executed by user space also. I wonder if we should do it this way, i.e. updating both tools/ and kernel source code in the same patch. I think the best is to update the kernel bits, then, after that is merged, do the tooling part. To avoid possible, yet unlikely, clashes in linux-next, for instance. - Arnaldo > Opcode 0x68 PUSH instruction is currently defined as 64-bit operand size > only i.e. (d64). That was based on Intel SDM Opcode Map. However that is > contradicted by the Instruction Set Reference section for PUSH in the > same manual. > > Remove 64-bit operand size only annotation from opcode 0x68 PUSH > instruction. > > Example: > > $ cat pushw.s > .global _start > .text > _start: > pushw $0x1234 > mov $0x1,%eax # system call number (sys_exit) > int $0x80 > $ as -o pushw.o pushw.s > $ ld -s -o pushw pushw.o > $ objdump -d pushw | tail -4 > 0000000000401000 <.text>: > 401000: 66 68 34 12 pushw $0x1234 > 401004: b8 01 00 00 00 mov $0x1,%eax > 401009: cd 80 int $0x80 > $ perf record -e intel_pt//u ./pushw > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.014 MB perf.data ] > > Before: > > $ perf script --insn-trace=disasm > Warning: > 1 instruction trace errors > pushw 10349 [000] 10586.869237014: 401000 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) pushw $0x1234 > pushw 10349 [000] 10586.869237014: 401006 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %al, (%rax) > pushw 10349 [000] 10586.869237014: 401008 [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb %cl, %ch > pushw 10349 [000] 10586.869237014: 40100a [unknown] (/home/ahunter/git/misc/rtit-tests/pushw) addb $0x2e, (%rax) > instruction trace error type 1 time 10586.869237224 cpu 0 pid 10349 tid 10349 ip 0x40100d code 6: Trace doesn't match instruction > > After: > > $ perf script --insn-trace=disasm > pushw 10349 [000] 10586.869237014: 401000 [unknown] (./pushw) pushw $0x1234 > pushw 10349 [000] 10586.869237014: 401004 [unknown] (./pushw) movl $1, %eax > > Fixes: eb13296cfaf6 ("x86: Instruction decoder API") > Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> > --- > arch/x86/lib/x86-opcode-map.txt | 2 +- > tools/arch/x86/lib/x86-opcode-map.txt | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt > index c94988d5130d..4ea2e6adb477 100644 > --- a/arch/x86/lib/x86-opcode-map.txt > +++ b/arch/x86/lib/x86-opcode-map.txt > @@ -148,7 +148,7 @@ AVXcode: > 65: SEG=GS (Prefix) > 66: Operand-Size (Prefix) > 67: Address-Size (Prefix) > -68: PUSH Iz (d64) > +68: PUSH Iz > 69: IMUL Gv,Ev,Iz > 6a: 
PUSH Ib (d64) > 6b: IMUL Gv,Ev,Ib > diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt > index c94988d5130d..4ea2e6adb477 100644 > --- a/tools/arch/x86/lib/x86-opcode-map.txt > +++ b/tools/arch/x86/lib/x86-opcode-map.txt > @@ -148,7 +148,7 @@ AVXcode: > 65: SEG=GS (Prefix) > 66: Operand-Size (Prefix) > 67: Address-Size (Prefix) > -68: PUSH Iz (d64) > +68: PUSH Iz > 69: IMUL Gv,Ev,Iz > 6a: PUSH Ib (d64) > 6b: IMUL Gv,Ev,Ib > -- > 2.34.1 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 02/10] x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map
2024-05-02 13:41 ` Arnaldo Carvalho de Melo
@ 2024-05-02 19:11 ` Adrian Hunter
0 siblings, 0 replies; 18+ messages in thread
From: Adrian Hunter @ 2024-05-02 19:11 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Chang S. Bae, Masami Hiramatsu, Nikolay Borisov,
Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen,
Thomas Gleixner, x86, Jiri Olsa, Namhyung Kim, Ian Rogers,
linux-perf-users

On 2/05/24 16:41, Arnaldo Carvalho de Melo wrote:
> On Thu, May 02, 2024 at 01:58:45PM +0300, Adrian Hunter wrote:
>> The x86 instruction decoder is used not only for decoding kernel
>> instructions. It is also used by perf uprobes (user space probes) and by
>> perf tools Intel Processor Trace decoding. Consequently, it needs to
>> support instructions executed by user space also.
>
> I wonder if we should do it this way, i.e. updating both tools/ and
> kernel source code in the same patch.
>
> I think the best is to update the kernel bits, then, after that is
> merged, do the tooling part.

For objtool purposes, it might be better to do both at the same time.

>
> To avoid possible, yet unlikely, clashes in linux-next, for instance.

Always gonna find out about clashes at some point.

^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 03/10] x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter 2024-05-02 10:58 ` [PATCH 01/10] x86/insn: Add Key Locker instructions to the opcode map Adrian Hunter 2024-05-02 10:58 ` [PATCH 02/10] x86/insn: Fix PUSH instruction in x86 instruction decoder " Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-02 10:58 ` [PATCH 04/10] x86/insn: Add misc new Intel instructions Adrian Hunter ` (7 subsequent siblings) 10 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Intel Architecture Instruction Set Extensions and Future Features manual number 319433-044 of May 2021, documented VEX versions of instructions VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, but the opcode map has them listed as EVEX only. Remove EVEX-only (ev) annotation from instructions VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS, which allows them to be decoded with either a VEX or EVEX prefix. Fixes: 0153d98f2dd6 ("x86/insn: Add misc instructions to x86 instruction decoder") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- arch/x86/lib/x86-opcode-map.txt | 8 ++++---- tools/arch/x86/lib/x86-opcode-map.txt | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 4ea2e6adb477..24941b9d4e06 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -698,10 +698,10 @@ AVXcode: 2 4d: vrcp14ss/d Vsd,Hpd,Wsd (66),(ev) 4e: vrsqrt14ps/d Vpd,Wpd (66),(ev) 4f: vrsqrt14ss/d Vsd,Hsd,Wsd (66),(ev) -50: vpdpbusd Vx,Hx,Wx (66),(ev) -51: vpdpbusds Vx,Hx,Wx (66),(ev) -52: vdpbf16ps Vx,Hx,Wx (F3),(ev) | vpdpwssd Vx,Hx,Wx (66),(ev) | vp4dpwssd Vdqq,Hdqq,Wdq (F2),(ev) -53: vpdpwssds Vx,Hx,Wx (66),(ev) | vp4dpwssds Vdqq,Hdqq,Wdq (F2),(ev) +50: vpdpbusd Vx,Hx,Wx (66) +51: vpdpbusds Vx,Hx,Wx (66) +52: vdpbf16ps Vx,Hx,Wx (F3),(ev) | vpdpwssd Vx,Hx,Wx (66) | vp4dpwssd Vdqq,Hdqq,Wdq (F2),(ev) +53: vpdpwssds Vx,Hx,Wx (66) | vp4dpwssds Vdqq,Hdqq,Wdq (F2),(ev) 54: vpopcntb/w Vx,Wx (66),(ev) 55: vpopcntd/q Vx,Wx (66),(ev) 58: vpbroadcastd Vx,Wx (66),(v) diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt index 4ea2e6adb477..24941b9d4e06 100644 --- a/tools/arch/x86/lib/x86-opcode-map.txt +++ b/tools/arch/x86/lib/x86-opcode-map.txt @@ -698,10 +698,10 @@ AVXcode: 2 4d: vrcp14ss/d Vsd,Hpd,Wsd (66),(ev) 4e: vrsqrt14ps/d Vpd,Wpd (66),(ev) 4f: vrsqrt14ss/d Vsd,Hsd,Wsd (66),(ev) -50: vpdpbusd Vx,Hx,Wx (66),(ev) -51: vpdpbusds Vx,Hx,Wx (66),(ev) -52: vdpbf16ps Vx,Hx,Wx (F3),(ev) | vpdpwssd Vx,Hx,Wx (66),(ev) | vp4dpwssd Vdqq,Hdqq,Wdq (F2),(ev) -53: vpdpwssds Vx,Hx,Wx (66),(ev) | vp4dpwssds Vdqq,Hdqq,Wdq (F2),(ev) +50: vpdpbusd Vx,Hx,Wx (66) +51: vpdpbusds Vx,Hx,Wx (66) +52: vdpbf16ps Vx,Hx,Wx (F3),(ev) | vpdpwssd Vx,Hx,Wx (66) | vp4dpwssd Vdqq,Hdqq,Wdq (F2),(ev) +53: vpdpwssds Vx,Hx,Wx (66) | vp4dpwssds Vdqq,Hdqq,Wdq 
(F2),(ev) 54: vpopcntb/w Vx,Wx (66),(ev) 55: vpopcntd/q Vx,Wx (66),(ev) 58: vpbroadcastd Vx,Wx (66),(v) -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
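For context, the (ev) annotation matters because the decoder refuses a
VEX-encoded instruction whose table entry is EVEX-only. The check in
insn_get_opcode() is roughly of this shape (paraphrased, not a verbatim
quote of insn.c):

	if (inat_must_evex(insn->attr) && !insn_is_evex(insn)) {
		/* This instruction is bad */
		insn->attr = 0;
		return -EINVAL;
	}

so dropping (ev) from the vpdpbusd family is what lets their VEX forms
decode.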
* [PATCH 04/10] x86/insn: Add misc new Intel instructions 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter ` (2 preceding siblings ...) 2024-05-02 10:58 ` [PATCH 03/10] x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-02 10:58 ` [PATCH 05/10] x86/insn: Add support for REX2 prefix to the instruction decoder logic Adrian Hunter ` (6 subsequent siblings) 10 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users The x86 instruction decoder is used not only for decoding kernel instructions. It is also used by perf uprobes (user space probes) and by perf tools Intel Processor Trace decoding. Consequently, it needs to support instructions executed by user space also. Add instructions documented in Intel Architecture Instruction Set Extensions and Future Features Programming Reference March 2024 319433-052, that have not been added yet: AADD AAND AOR AXOR CMPccXADD PBNDKB RDMSRLIST URDMSR UWRMSR VBCSTNEBF162PS VBCSTNESH2PS VCVTNEEBF162PS VCVTNEEPH2PS VCVTNEOBF162PS VCVTNEOPH2PS VCVTNEPS2BF16 VPDPB[SU,UU,SS]D[,S] VPDPW[SU,US,UU]D[,S] VPMADD52HUQ VPMADD52LUQ VSHA512MSG1 VSHA512MSG2 VSHA512RNDS2 VSM3MSG1 VSM3MSG2 VSM3RNDS2 VSM4KEY4 VSM4RNDS4 WRMSRLIST TCMMIMFP16PS TCMMRLFP16PS TDPFP16PS PREFETCHIT1 PREFETCHIT0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- arch/x86/lib/x86-opcode-map.txt | 57 +++++++++++++++++++++------ tools/arch/x86/lib/x86-opcode-map.txt | 57 +++++++++++++++++++++------ 2 files changed, 90 insertions(+), 24 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 24941b9d4e06..4e9f53581b58 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -698,8 +698,8 @@ AVXcode: 2 4d: vrcp14ss/d Vsd,Hpd,Wsd (66),(ev) 4e: vrsqrt14ps/d Vpd,Wpd (66),(ev) 4f: vrsqrt14ss/d Vsd,Hsd,Wsd (66),(ev) -50: vpdpbusd Vx,Hx,Wx (66) -51: vpdpbusds Vx,Hx,Wx (66) +50: vpdpbusd Vx,Hx,Wx (66) | vpdpbssd Vx,Hx,Wx (F2),(v) | vpdpbsud Vx,Hx,Wx (F3),(v) | vpdpbuud Vx,Hx,Wx (v) +51: vpdpbusds Vx,Hx,Wx (66) | vpdpbssds Vx,Hx,Wx (F2),(v) | vpdpbsuds Vx,Hx,Wx (F3),(v) | vpdpbuuds Vx,Hx,Wx (v) 52: vdpbf16ps Vx,Hx,Wx (F3),(ev) | vpdpwssd Vx,Hx,Wx (66) | vp4dpwssd Vdqq,Hdqq,Wdq (F2),(ev) 53: vpdpwssds Vx,Hx,Wx (66) | vp4dpwssds Vdqq,Hdqq,Wdq (F2),(ev) 54: vpopcntb/w Vx,Wx (66),(ev) @@ -708,7 +708,7 @@ AVXcode: 2 59: vpbroadcastq Vx,Wx (66),(v) | vbroadcasti32x2 Vx,Wx (66),(evo) 5a: vbroadcasti128 Vqq,Mdq (66),(v) | vbroadcasti32x4/64x2 Vx,Wx (66),(evo) 5b: vbroadcasti32x8/64x4 Vqq,Mdq (66),(ev) -5c: TDPBF16PS Vt,Wt,Ht (F3),(v1) +5c: TDPBF16PS Vt,Wt,Ht (F3),(v1) | TDPFP16PS Vt,Wt,Ht (F2),(v1),(o64) # Skip 0x5d 5e: TDPBSSD Vt,Wt,Ht (F2),(v1) | TDPBSUD Vt,Wt,Ht (F3),(v1) | TDPBUSD Vt,Wt,Ht (66),(v1) | TDPBUUD Vt,Wt,Ht (v1) # Skip 0x5f-0x61 @@ -718,10 +718,12 @@ AVXcode: 2 65: vblendmps/d Vx,Hx,Wx (66),(ev) 66: vpblendmb/w Vx,Hx,Wx (66),(ev) 68: vp2intersectd/q Kx,Hx,Wx (F2),(ev) -# Skip 0x69-0x6f +# Skip 0x69-0x6b +6c: TCMMIMFP16PS Vt,Wt,Ht (66),(v1),(o64) | TCMMRLFP16PS Vt,Wt,Ht (v1),(o64) +# Skip 0x6d-0x6f 70: vpshldvw Vx,Hx,Wx (66),(ev) 71: vpshldvd/q Vx,Hx,Wx (66),(ev) -72: vcvtne2ps2bf16 Vx,Hx,Wx (F2),(ev) | 
vcvtneps2bf16 Vx,Wx (F3),(ev) | vpshrdvw Vx,Hx,Wx (66),(ev) +72: vcvtne2ps2bf16 Vx,Hx,Wx (F2),(ev) | vcvtneps2bf16 Vx,Wx (F3) | vpshrdvw Vx,Hx,Wx (66),(ev) 73: vpshrdvd/q Vx,Hx,Wx (66),(ev) 75: vpermi2b/w Vx,Hx,Wx (66),(ev) 76: vpermi2d/q Vx,Hx,Wx (66),(ev) @@ -777,8 +779,10 @@ ac: vfnmadd213ps/d Vx,Hx,Wx (66),(v) ad: vfnmadd213ss/d Vx,Hx,Wx (66),(v),(v1) ae: vfnmsub213ps/d Vx,Hx,Wx (66),(v) af: vfnmsub213ss/d Vx,Hx,Wx (66),(v),(v1) -b4: vpmadd52luq Vx,Hx,Wx (66),(ev) -b5: vpmadd52huq Vx,Hx,Wx (66),(ev) +b0: vcvtneebf162ps Vx,Mx (F3),(!11B),(v) | vcvtneeph2ps Vx,Mx (66),(!11B),(v) | vcvtneobf162ps Vx,Mx (F2),(!11B),(v) | vcvtneoph2ps Vx,Mx (!11B),(v) +b1: vbcstnebf162ps Vx,Mw (F3),(!11B),(v) | vbcstnesh2ps Vx,Mw (66),(!11B),(v) +b4: vpmadd52luq Vx,Hx,Wx (66) +b5: vpmadd52huq Vx,Hx,Wx (66) b6: vfmaddsub231ps/d Vx,Hx,Wx (66),(v) b7: vfmsubadd231ps/d Vx,Hx,Wx (66),(v) b8: vfmadd231ps/d Vx,Hx,Wx (66),(v) @@ -796,16 +800,35 @@ c7: Grp19 (1A) c8: sha1nexte Vdq,Wdq | vexp2ps/d Vx,Wx (66),(ev) c9: sha1msg1 Vdq,Wdq ca: sha1msg2 Vdq,Wdq | vrcp28ps/d Vx,Wx (66),(ev) -cb: sha256rnds2 Vdq,Wdq | vrcp28ss/d Vx,Hx,Wx (66),(ev) -cc: sha256msg1 Vdq,Wdq | vrsqrt28ps/d Vx,Wx (66),(ev) -cd: sha256msg2 Vdq,Wdq | vrsqrt28ss/d Vx,Hx,Wx (66),(ev) +cb: sha256rnds2 Vdq,Wdq | vrcp28ss/d Vx,Hx,Wx (66),(ev) | vsha512rnds2 Vqq,Hqq,Udq (F2),(11B),(v) +cc: sha256msg1 Vdq,Wdq | vrsqrt28ps/d Vx,Wx (66),(ev) | vsha512msg1 Vqq,Udq (F2),(11B),(v) +cd: sha256msg2 Vdq,Wdq | vrsqrt28ss/d Vx,Hx,Wx (66),(ev) | vsha512msg2 Vqq,Uqq (F2),(11B),(v) cf: vgf2p8mulb Vx,Wx (66) +d2: vpdpwsud Vx,Hx,Wx (F3),(v) | vpdpwusd Vx,Hx,Wx (66),(v) | vpdpwuud Vx,Hx,Wx (v) +d3: vpdpwsuds Vx,Hx,Wx (F3),(v) | vpdpwusds Vx,Hx,Wx (66),(v) | vpdpwuuds Vx,Hx,Wx (v) d8: AESENCWIDE128KL Qpi (F3),(000),(00B) | AESENCWIDE256KL Qpi (F3),(000),(10B) | AESDECWIDE128KL Qpi (F3),(000),(01B) | AESDECWIDE256KL Qpi (F3),(000),(11B) +da: vsm3msg1 Vdq,Hdq,Udq (v1) | vsm3msg2 Vdq,Hdq,Udq (66),(v1) | vsm4key4 Vx,Hx,Wx (F3),(v) | vsm4rnds4 Vx,Hx,Wx (F2),(v) db: VAESIMC Vdq,Wdq (66),(v1) dc: vaesenc Vx,Hx,Wx (66) | LOADIWKEY Vx,Hx (F3) | AESENC128KL Vpd,Qpi (F3) dd: vaesenclast Vx,Hx,Wx (66) | AESDEC128KL Vpd,Qpi (F3) de: vaesdec Vx,Hx,Wx (66) | AESENC256KL Vpd,Qpi (F3) df: vaesdeclast Vx,Hx,Wx (66) | AESDEC256KL Vpd,Qpi (F3) +e0: CMPOXADD My,Gy,By (66),(v1),(o64) +e1: CMPNOXADD My,Gy,By (66),(v1),(o64) +e2: CMPBXADD My,Gy,By (66),(v1),(o64) +e3: CMPNBXADD My,Gy,By (66),(v1),(o64) +e4: CMPZXADD My,Gy,By (66),(v1),(o64) +e5: CMPNZXADD My,Gy,By (66),(v1),(o64) +e6: CMPBEXADD My,Gy,By (66),(v1),(o64) +e7: CMPNBEXADD My,Gy,By (66),(v1),(o64) +e8: CMPSXADD My,Gy,By (66),(v1),(o64) +e9: CMPNSXADD My,Gy,By (66),(v1),(o64) +ea: CMPPXADD My,Gy,By (66),(v1),(o64) +eb: CMPNPXADD My,Gy,By (66),(v1),(o64) +ec: CMPLXADD My,Gy,By (66),(v1),(o64) +ed: CMPNLXADD My,Gy,By (66),(v1),(o64) +ee: CMPLEXADD My,Gy,By (66),(v1),(o64) +ef: CMPNLEXADD My,Gy,By (66),(v1),(o64) f0: MOVBE Gy,My | MOVBE Gw,Mw (66) | CRC32 Gd,Eb (F2) | CRC32 Gd,Eb (66&F2) f1: MOVBE My,Gy | MOVBE Mw,Gw (66) | CRC32 Gd,Ey (F2) | CRC32 Gd,Ew (66&F2) f2: ANDN Gy,By,Ey (v) @@ -813,10 +836,11 @@ f3: Grp17 (1A) f5: BZHI Gy,Ey,By (v) | PEXT Gy,By,Ey (F3),(v) | PDEP Gy,By,Ey (F2),(v) | WRUSSD/Q My,Gy (66) f6: ADCX Gy,Ey (66) | ADOX Gy,Ey (F3) | MULX By,Gy,rDX,Ey (F2),(v) | WRSSD/Q My,Gy f7: BEXTR Gy,Ey,By (v) | SHLX Gy,Ey,By (66),(v) | SARX Gy,Ey,By (F3),(v) | SHRX Gy,Ey,By (F2),(v) -f8: MOVDIR64B Gv,Mdqq (66) | ENQCMD Gv,Mdqq (F2) | ENQCMDS Gv,Mdqq (F3) +f8: MOVDIR64B Gv,Mdqq (66) | ENQCMD Gv,Mdqq (F2) | ENQCMDS Gv,Mdqq (F3) | 
URDMSR Rq,Gq (F2),(11B) | UWRMSR Gq,Rq (F3),(11B) f9: MOVDIRI My,Gy fa: ENCODEKEY128 Ew,Ew (F3) fb: ENCODEKEY256 Ew,Ew (F3) +fc: AADD My,Gy | AAND My,Gy (66) | AOR My,Gy (F2) | AXOR My,Gy (F3) EndTable Table: 3-byte opcode 2 (0x0f 0x3a) @@ -896,6 +920,7 @@ c2: vcmpph Vx,Hx,Wx,Ib (ev) | vcmpsh Vx,Hx,Wx,Ib (F3),(ev) cc: sha1rnds4 Vdq,Wdq,Ib ce: vgf2p8affineqb Vx,Wx,Ib (66) cf: vgf2p8affineinvqb Vx,Wx,Ib (66) +de: vsm3rnds2 Vdq,Hdq,Wdq,Ib (66),(v1) df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1) f0: RORX Gy,Ey,Ib (F2),(v) | HRESET Gv,Ib (F3),(000),(11B) EndTable @@ -978,6 +1003,12 @@ d6: vfcmulcph Vx,Hx,Wx (F2),(ev) | vfmulcph Vx,Hx,Wx (F3),(ev) d7: vfcmulcsh Vx,Hx,Wx (F2),(ev) | vfmulcsh Vx,Hx,Wx (F3),(ev) EndTable +Table: VEX map 7 +Referrer: +AVXcode: 7 +f8: URDMSR Rq,Id (F2),(v1),(11B) | UWRMSR Id,Rq (F3),(v1),(11B) +EndTable + GrpTable: Grp1 0: ADD 1: OR @@ -1054,7 +1085,7 @@ GrpTable: Grp6 EndTable GrpTable: Grp7 -0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B) | WRMSRNS (110),(11B) +0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B) | WRMSRNS (110),(11B) | RDMSRLIST (F2),(110),(11B) | WRMSRLIST (F3),(110),(11B) | PBNDKB (111),(11B) 1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B) 2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B) 3: LIDT Ms @@ -1140,6 +1171,8 @@ GrpTable: Grp16 1: prefetch T0 2: prefetch T1 3: prefetch T2 +6: prefetch IT1 +7: prefetch IT0 EndTable GrpTable: Grp17 diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt index 24941b9d4e06..4e9f53581b58 100644 --- a/tools/arch/x86/lib/x86-opcode-map.txt +++ b/tools/arch/x86/lib/x86-opcode-map.txt @@ -698,8 +698,8 @@ AVXcode: 2 4d: vrcp14ss/d Vsd,Hpd,Wsd (66),(ev) 4e: vrsqrt14ps/d Vpd,Wpd (66),(ev) 4f: vrsqrt14ss/d Vsd,Hsd,Wsd (66),(ev) -50: vpdpbusd Vx,Hx,Wx (66) -51: vpdpbusds Vx,Hx,Wx (66) +50: vpdpbusd Vx,Hx,Wx (66) | vpdpbssd Vx,Hx,Wx (F2),(v) | vpdpbsud Vx,Hx,Wx (F3),(v) | vpdpbuud Vx,Hx,Wx (v) +51: vpdpbusds Vx,Hx,Wx (66) | vpdpbssds Vx,Hx,Wx (F2),(v) | vpdpbsuds Vx,Hx,Wx (F3),(v) | vpdpbuuds Vx,Hx,Wx (v) 52: vdpbf16ps Vx,Hx,Wx (F3),(ev) | vpdpwssd Vx,Hx,Wx (66) | vp4dpwssd Vdqq,Hdqq,Wdq (F2),(ev) 53: vpdpwssds Vx,Hx,Wx (66) | vp4dpwssds Vdqq,Hdqq,Wdq (F2),(ev) 54: vpopcntb/w Vx,Wx (66),(ev) @@ -708,7 +708,7 @@ AVXcode: 2 59: vpbroadcastq Vx,Wx (66),(v) | vbroadcasti32x2 Vx,Wx (66),(evo) 5a: vbroadcasti128 Vqq,Mdq (66),(v) | vbroadcasti32x4/64x2 Vx,Wx (66),(evo) 5b: vbroadcasti32x8/64x4 Vqq,Mdq (66),(ev) -5c: TDPBF16PS Vt,Wt,Ht (F3),(v1) +5c: TDPBF16PS Vt,Wt,Ht (F3),(v1) | TDPFP16PS Vt,Wt,Ht (F2),(v1),(o64) # Skip 0x5d 5e: TDPBSSD Vt,Wt,Ht (F2),(v1) | TDPBSUD Vt,Wt,Ht (F3),(v1) | TDPBUSD Vt,Wt,Ht (66),(v1) | TDPBUUD Vt,Wt,Ht (v1) # Skip 0x5f-0x61 @@ -718,10 +718,12 @@ AVXcode: 2 65: vblendmps/d Vx,Hx,Wx (66),(ev) 66: vpblendmb/w Vx,Hx,Wx (66),(ev) 68: vp2intersectd/q Kx,Hx,Wx (F2),(ev) -# Skip 0x69-0x6f +# Skip 0x69-0x6b +6c: TCMMIMFP16PS Vt,Wt,Ht (66),(v1),(o64) | TCMMRLFP16PS Vt,Wt,Ht (v1),(o64) +# Skip 0x6d-0x6f 70: vpshldvw Vx,Hx,Wx (66),(ev) 71: vpshldvd/q Vx,Hx,Wx (66),(ev) -72: vcvtne2ps2bf16 Vx,Hx,Wx (F2),(ev) | vcvtneps2bf16 Vx,Wx (F3),(ev) | vpshrdvw Vx,Hx,Wx (66),(ev) +72: vcvtne2ps2bf16 Vx,Hx,Wx (F2),(ev) | 
vcvtneps2bf16 Vx,Wx (F3) | vpshrdvw Vx,Hx,Wx (66),(ev) 73: vpshrdvd/q Vx,Hx,Wx (66),(ev) 75: vpermi2b/w Vx,Hx,Wx (66),(ev) 76: vpermi2d/q Vx,Hx,Wx (66),(ev) @@ -777,8 +779,10 @@ ac: vfnmadd213ps/d Vx,Hx,Wx (66),(v) ad: vfnmadd213ss/d Vx,Hx,Wx (66),(v),(v1) ae: vfnmsub213ps/d Vx,Hx,Wx (66),(v) af: vfnmsub213ss/d Vx,Hx,Wx (66),(v),(v1) -b4: vpmadd52luq Vx,Hx,Wx (66),(ev) -b5: vpmadd52huq Vx,Hx,Wx (66),(ev) +b0: vcvtneebf162ps Vx,Mx (F3),(!11B),(v) | vcvtneeph2ps Vx,Mx (66),(!11B),(v) | vcvtneobf162ps Vx,Mx (F2),(!11B),(v) | vcvtneoph2ps Vx,Mx (!11B),(v) +b1: vbcstnebf162ps Vx,Mw (F3),(!11B),(v) | vbcstnesh2ps Vx,Mw (66),(!11B),(v) +b4: vpmadd52luq Vx,Hx,Wx (66) +b5: vpmadd52huq Vx,Hx,Wx (66) b6: vfmaddsub231ps/d Vx,Hx,Wx (66),(v) b7: vfmsubadd231ps/d Vx,Hx,Wx (66),(v) b8: vfmadd231ps/d Vx,Hx,Wx (66),(v) @@ -796,16 +800,35 @@ c7: Grp19 (1A) c8: sha1nexte Vdq,Wdq | vexp2ps/d Vx,Wx (66),(ev) c9: sha1msg1 Vdq,Wdq ca: sha1msg2 Vdq,Wdq | vrcp28ps/d Vx,Wx (66),(ev) -cb: sha256rnds2 Vdq,Wdq | vrcp28ss/d Vx,Hx,Wx (66),(ev) -cc: sha256msg1 Vdq,Wdq | vrsqrt28ps/d Vx,Wx (66),(ev) -cd: sha256msg2 Vdq,Wdq | vrsqrt28ss/d Vx,Hx,Wx (66),(ev) +cb: sha256rnds2 Vdq,Wdq | vrcp28ss/d Vx,Hx,Wx (66),(ev) | vsha512rnds2 Vqq,Hqq,Udq (F2),(11B),(v) +cc: sha256msg1 Vdq,Wdq | vrsqrt28ps/d Vx,Wx (66),(ev) | vsha512msg1 Vqq,Udq (F2),(11B),(v) +cd: sha256msg2 Vdq,Wdq | vrsqrt28ss/d Vx,Hx,Wx (66),(ev) | vsha512msg2 Vqq,Uqq (F2),(11B),(v) cf: vgf2p8mulb Vx,Wx (66) +d2: vpdpwsud Vx,Hx,Wx (F3),(v) | vpdpwusd Vx,Hx,Wx (66),(v) | vpdpwuud Vx,Hx,Wx (v) +d3: vpdpwsuds Vx,Hx,Wx (F3),(v) | vpdpwusds Vx,Hx,Wx (66),(v) | vpdpwuuds Vx,Hx,Wx (v) d8: AESENCWIDE128KL Qpi (F3),(000),(00B) | AESENCWIDE256KL Qpi (F3),(000),(10B) | AESDECWIDE128KL Qpi (F3),(000),(01B) | AESDECWIDE256KL Qpi (F3),(000),(11B) +da: vsm3msg1 Vdq,Hdq,Udq (v1) | vsm3msg2 Vdq,Hdq,Udq (66),(v1) | vsm4key4 Vx,Hx,Wx (F3),(v) | vsm4rnds4 Vx,Hx,Wx (F2),(v) db: VAESIMC Vdq,Wdq (66),(v1) dc: vaesenc Vx,Hx,Wx (66) | LOADIWKEY Vx,Hx (F3) | AESENC128KL Vpd,Qpi (F3) dd: vaesenclast Vx,Hx,Wx (66) | AESDEC128KL Vpd,Qpi (F3) de: vaesdec Vx,Hx,Wx (66) | AESENC256KL Vpd,Qpi (F3) df: vaesdeclast Vx,Hx,Wx (66) | AESDEC256KL Vpd,Qpi (F3) +e0: CMPOXADD My,Gy,By (66),(v1),(o64) +e1: CMPNOXADD My,Gy,By (66),(v1),(o64) +e2: CMPBXADD My,Gy,By (66),(v1),(o64) +e3: CMPNBXADD My,Gy,By (66),(v1),(o64) +e4: CMPZXADD My,Gy,By (66),(v1),(o64) +e5: CMPNZXADD My,Gy,By (66),(v1),(o64) +e6: CMPBEXADD My,Gy,By (66),(v1),(o64) +e7: CMPNBEXADD My,Gy,By (66),(v1),(o64) +e8: CMPSXADD My,Gy,By (66),(v1),(o64) +e9: CMPNSXADD My,Gy,By (66),(v1),(o64) +ea: CMPPXADD My,Gy,By (66),(v1),(o64) +eb: CMPNPXADD My,Gy,By (66),(v1),(o64) +ec: CMPLXADD My,Gy,By (66),(v1),(o64) +ed: CMPNLXADD My,Gy,By (66),(v1),(o64) +ee: CMPLEXADD My,Gy,By (66),(v1),(o64) +ef: CMPNLEXADD My,Gy,By (66),(v1),(o64) f0: MOVBE Gy,My | MOVBE Gw,Mw (66) | CRC32 Gd,Eb (F2) | CRC32 Gd,Eb (66&F2) f1: MOVBE My,Gy | MOVBE Mw,Gw (66) | CRC32 Gd,Ey (F2) | CRC32 Gd,Ew (66&F2) f2: ANDN Gy,By,Ey (v) @@ -813,10 +836,11 @@ f3: Grp17 (1A) f5: BZHI Gy,Ey,By (v) | PEXT Gy,By,Ey (F3),(v) | PDEP Gy,By,Ey (F2),(v) | WRUSSD/Q My,Gy (66) f6: ADCX Gy,Ey (66) | ADOX Gy,Ey (F3) | MULX By,Gy,rDX,Ey (F2),(v) | WRSSD/Q My,Gy f7: BEXTR Gy,Ey,By (v) | SHLX Gy,Ey,By (66),(v) | SARX Gy,Ey,By (F3),(v) | SHRX Gy,Ey,By (F2),(v) -f8: MOVDIR64B Gv,Mdqq (66) | ENQCMD Gv,Mdqq (F2) | ENQCMDS Gv,Mdqq (F3) +f8: MOVDIR64B Gv,Mdqq (66) | ENQCMD Gv,Mdqq (F2) | ENQCMDS Gv,Mdqq (F3) | URDMSR Rq,Gq (F2),(11B) | UWRMSR Gq,Rq (F3),(11B) f9: MOVDIRI My,Gy fa: ENCODEKEY128 Ew,Ew (F3) fb: 
ENCODEKEY256 Ew,Ew (F3) +fc: AADD My,Gy | AAND My,Gy (66) | AOR My,Gy (F2) | AXOR My,Gy (F3) EndTable Table: 3-byte opcode 2 (0x0f 0x3a) @@ -896,6 +920,7 @@ c2: vcmpph Vx,Hx,Wx,Ib (ev) | vcmpsh Vx,Hx,Wx,Ib (F3),(ev) cc: sha1rnds4 Vdq,Wdq,Ib ce: vgf2p8affineqb Vx,Wx,Ib (66) cf: vgf2p8affineinvqb Vx,Wx,Ib (66) +de: vsm3rnds2 Vdq,Hdq,Wdq,Ib (66),(v1) df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1) f0: RORX Gy,Ey,Ib (F2),(v) | HRESET Gv,Ib (F3),(000),(11B) EndTable @@ -978,6 +1003,12 @@ d6: vfcmulcph Vx,Hx,Wx (F2),(ev) | vfmulcph Vx,Hx,Wx (F3),(ev) d7: vfcmulcsh Vx,Hx,Wx (F2),(ev) | vfmulcsh Vx,Hx,Wx (F3),(ev) EndTable +Table: VEX map 7 +Referrer: +AVXcode: 7 +f8: URDMSR Rq,Id (F2),(v1),(11B) | UWRMSR Id,Rq (F3),(v1),(11B) +EndTable + GrpTable: Grp1 0: ADD 1: OR @@ -1054,7 +1085,7 @@ GrpTable: Grp6 EndTable GrpTable: Grp7 -0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B) | WRMSRNS (110),(11B) +0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B) | PCONFIG (101),(11B) | ENCLV (000),(11B) | WRMSRNS (110),(11B) | RDMSRLIST (F2),(110),(11B) | WRMSRLIST (F3),(110),(11B) | PBNDKB (111),(11B) 1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001),(11B) | CLAC (010),(11B) | STAC (011),(11B) | ENCLS (111),(11B) | ERETU (F3),(010),(11B) | ERETS (F2),(010),(11B) 2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B) | VMFUNC (100),(11B) | XEND (101)(11B) | XTEST (110)(11B) | ENCLU (111),(11B) 3: LIDT Ms @@ -1140,6 +1171,8 @@ GrpTable: Grp16 1: prefetch T0 2: prefetch T1 3: prefetch T2 +6: prefetch IT1 +7: prefetch IT0 EndTable GrpTable: Grp17 -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
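As a reading aid for the new rows (following the legend at the top of
x86-opcode-map.txt), take one of them as an example:

  e0: CMPOXADD My,Gy,By (66),(v1),(o64)

i.e. opcode 0x0f 0x38 0xe0 with a memory destination (My), a general
purpose register source selected by ModRM.reg (Gy) and a register selected
by VEX.vvvv (By), requiring a mandatory 66 prefix, a 128-bit VEX encoding
(v1), and 64-bit mode only (o64).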
* [PATCH 05/10] x86/insn: Add support for REX2 prefix to the instruction decoder logic 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter ` (3 preceding siblings ...) 2024-05-02 10:58 ` [PATCH 04/10] x86/insn: Add misc new Intel instructions Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-02 18:10 ` Ian Rogers 2024-05-02 10:58 ` [PATCH 06/10] x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map Adrian Hunter ` (5 subsequent siblings) 10 siblings, 1 reply; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31. The REX2 prefix is effectively an extended version of the REX prefix. REX2 and EVEX are also used with PUSH/POP instructions to provide a Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to fast-forward register data between matching PUSH and POP instructions. REX2 is valid only with opcodes in maps 0 and 1. Similar extension for other maps is provided by the EVEX prefix, covered in a separate patch. Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used for a new 64-bit absolute direct jump instruction JMPABS. Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify opcodes (only JMPABS) that require a mandatory REX2 prefix (INAT_REX2_VARIANT). Amend logic to read the REX2 prefix and get the opcode attribute for the map number (0 or 1) encoded in the REX2 prefix. Amend the awk script that generates the attribute tables from the opcode map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)" as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT. 
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- arch/x86/include/asm/inat.h | 11 +++++++++- arch/x86/include/asm/insn.h | 25 ++++++++++++++++++---- arch/x86/lib/insn.c | 25 ++++++++++++++++++++++ arch/x86/tools/gen-insn-attr-x86.awk | 11 +++++++++- tools/arch/x86/include/asm/inat.h | 11 +++++++++- tools/arch/x86/include/asm/insn.h | 25 ++++++++++++++++++---- tools/arch/x86/lib/insn.c | 25 ++++++++++++++++++++++ tools/arch/x86/tools/gen-insn-attr-x86.awk | 11 +++++++++- 8 files changed, 132 insertions(+), 12 deletions(-) diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h index b56c5741581a..1331bdd39a23 100644 --- a/arch/x86/include/asm/inat.h +++ b/arch/x86/include/asm/inat.h @@ -35,6 +35,8 @@ #define INAT_PFX_VEX2 13 /* 2-bytes VEX prefix */ #define INAT_PFX_VEX3 14 /* 3-bytes VEX prefix */ #define INAT_PFX_EVEX 15 /* EVEX prefix */ +/* x86-64 REX2 prefix */ +#define INAT_PFX_REX2 16 /* 0xD5 */ #define INAT_LSTPFX_MAX 3 #define INAT_LGCPFX_MAX 11 @@ -50,7 +52,7 @@ /* Legacy prefix */ #define INAT_PFX_OFFS 0 -#define INAT_PFX_BITS 4 +#define INAT_PFX_BITS 5 #define INAT_PFX_MAX ((1 << INAT_PFX_BITS) - 1) #define INAT_PFX_MASK (INAT_PFX_MAX << INAT_PFX_OFFS) /* Escape opcodes */ @@ -77,6 +79,8 @@ #define INAT_VEXOK (1 << (INAT_FLAG_OFFS + 5)) #define INAT_VEXONLY (1 << (INAT_FLAG_OFFS + 6)) #define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) +#define INAT_NO_REX2 (1 << (INAT_FLAG_OFFS + 8)) +#define INAT_REX2_VARIANT (1 << (INAT_FLAG_OFFS + 9)) /* Attribute making macros for attribute tables */ #define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) #define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) @@ -128,6 +132,11 @@ static inline int inat_is_rex_prefix(insn_attr_t attr) return (attr & INAT_PFX_MASK) == INAT_PFX_REX; } +static inline int inat_is_rex2_prefix(insn_attr_t attr) +{ + return (attr & INAT_PFX_MASK) == INAT_PFX_REX2; +} + static inline int inat_last_prefix_id(insn_attr_t attr) { if ((attr & INAT_PFX_MASK) > INAT_LSTPFX_MAX) diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h index 1b29f58f730f..95249ec1f24e 100644 --- a/arch/x86/include/asm/insn.h +++ b/arch/x86/include/asm/insn.h @@ -112,10 +112,15 @@ struct insn { #define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3) #define X86_SIB_BASE(sib) ((sib) & 0x07) -#define X86_REX_W(rex) ((rex) & 8) -#define X86_REX_R(rex) ((rex) & 4) -#define X86_REX_X(rex) ((rex) & 2) -#define X86_REX_B(rex) ((rex) & 1) +#define X86_REX2_M(rex) ((rex) & 0x80) /* REX2 M0 */ +#define X86_REX2_R(rex) ((rex) & 0x40) /* REX2 R4 */ +#define X86_REX2_X(rex) ((rex) & 0x20) /* REX2 X4 */ +#define X86_REX2_B(rex) ((rex) & 0x10) /* REX2 B4 */ + +#define X86_REX_W(rex) ((rex) & 8) /* REX or REX2 W */ +#define X86_REX_R(rex) ((rex) & 4) /* REX or REX2 R3 */ +#define X86_REX_X(rex) ((rex) & 2) /* REX or REX2 X3 */ +#define X86_REX_B(rex) ((rex) & 1) /* REX or REX2 B3 */ /* VEX bit flags */ #define X86_VEX_W(vex) ((vex) & 0x80) /* VEX3 Byte2 */ @@ -161,6 +166,18 @@ static inline void insn_get_attribute(struct insn *insn) /* Instruction uses RIP-relative addressing */ extern int insn_rip_relative(struct insn *insn); +static inline int insn_is_rex2(struct insn *insn) +{ + if (!insn->prefixes.got) + insn_get_prefixes(insn); + return insn->rex_prefix.nbytes == 2; +} + +static inline insn_byte_t insn_rex2_m_bit(struct insn *insn) +{ + return X86_REX2_M(insn->rex_prefix.bytes[1]); +} + static inline int insn_is_avx(struct insn *insn) { if (!insn->prefixes.got) diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c index 
1bb155a0955b..6126ddc6e5f5 100644 --- a/arch/x86/lib/insn.c +++ b/arch/x86/lib/insn.c @@ -185,6 +185,17 @@ int insn_get_prefixes(struct insn *insn) if (X86_REX_W(b)) /* REX.W overrides opnd_size */ insn->opnd_bytes = 8; + } else if (inat_is_rex2_prefix(attr)) { + insn_set_byte(&insn->rex_prefix, 0, b); + b = peek_nbyte_next(insn_byte_t, insn, 1); + insn_set_byte(&insn->rex_prefix, 1, b); + insn->rex_prefix.nbytes = 2; + insn->next_byte += 2; + if (X86_REX_W(b)) + /* REX.W overrides opnd_size */ + insn->opnd_bytes = 8; + insn->rex_prefix.got = 1; + goto vex_end; } } insn->rex_prefix.got = 1; @@ -294,6 +305,20 @@ int insn_get_opcode(struct insn *insn) goto end; } + /* Check if there is REX2 prefix or not */ + if (insn_is_rex2(insn)) { + if (insn_rex2_m_bit(insn)) { + /* map 1 is escape 0x0f */ + insn_attr_t esc_attr = inat_get_opcode_attribute(0x0f); + + pfx_id = insn_last_prefix_id(insn); + insn->attr = inat_get_escape_attribute(op, pfx_id, esc_attr); + } else { + insn->attr = inat_get_opcode_attribute(op); + } + goto end; + } + insn->attr = inat_get_opcode_attribute(op); while (inat_is_escape(insn->attr)) { /* Get escaped opcode */ diff --git a/arch/x86/tools/gen-insn-attr-x86.awk b/arch/x86/tools/gen-insn-attr-x86.awk index af38469afd14..3f43aa7d8fef 100644 --- a/arch/x86/tools/gen-insn-attr-x86.awk +++ b/arch/x86/tools/gen-insn-attr-x86.awk @@ -64,7 +64,9 @@ BEGIN { modrm_expr = "^([CDEGMNPQRSUVW/][a-z]+|NTA|T[012])" force64_expr = "\\([df]64\\)" - rex_expr = "^REX(\\.[XRWB]+)*" + rex_expr = "^((REX(\\.[XRWB]+)+)|(REX$))" + rex2_expr = "\\(REX2\\)" + no_rex2_expr = "\\(!REX2\\)" fpu_expr = "^ESC" # TODO lprefix1_expr = "\\((66|!F3)\\)" @@ -99,6 +101,7 @@ BEGIN { prefix_num["VEX+1byte"] = "INAT_PFX_VEX2" prefix_num["VEX+2byte"] = "INAT_PFX_VEX3" prefix_num["EVEX"] = "INAT_PFX_EVEX" + prefix_num["REX2"] = "INAT_PFX_REX2" clear_vars() } @@ -314,6 +317,10 @@ function convert_operands(count,opnd, i,j,imm,mod) if (match(ext, force64_expr)) flags = add_flags(flags, "INAT_FORCE64") + # check REX2 not allowed + if (match(ext, no_rex2_expr)) + flags = add_flags(flags, "INAT_NO_REX2") + # check REX prefix if (match(opcode, rex_expr)) flags = add_flags(flags, "INAT_MAKE_PREFIX(INAT_PFX_REX)") @@ -351,6 +358,8 @@ function convert_operands(count,opnd, i,j,imm,mod) lptable3[idx] = add_flags(lptable3[idx],flags) variant = "INAT_VARIANT" } + if (match(ext, rex2_expr)) + table[idx] = add_flags(table[idx], "INAT_REX2_VARIANT") if (!match(ext, lprefix_expr)){ table[idx] = add_flags(table[idx],flags) } diff --git a/tools/arch/x86/include/asm/inat.h b/tools/arch/x86/include/asm/inat.h index a61051400311..2e65312cae52 100644 --- a/tools/arch/x86/include/asm/inat.h +++ b/tools/arch/x86/include/asm/inat.h @@ -35,6 +35,8 @@ #define INAT_PFX_VEX2 13 /* 2-bytes VEX prefix */ #define INAT_PFX_VEX3 14 /* 3-bytes VEX prefix */ #define INAT_PFX_EVEX 15 /* EVEX prefix */ +/* x86-64 REX2 prefix */ +#define INAT_PFX_REX2 16 /* 0xD5 */ #define INAT_LSTPFX_MAX 3 #define INAT_LGCPFX_MAX 11 @@ -50,7 +52,7 @@ /* Legacy prefix */ #define INAT_PFX_OFFS 0 -#define INAT_PFX_BITS 4 +#define INAT_PFX_BITS 5 #define INAT_PFX_MAX ((1 << INAT_PFX_BITS) - 1) #define INAT_PFX_MASK (INAT_PFX_MAX << INAT_PFX_OFFS) /* Escape opcodes */ @@ -77,6 +79,8 @@ #define INAT_VEXOK (1 << (INAT_FLAG_OFFS + 5)) #define INAT_VEXONLY (1 << (INAT_FLAG_OFFS + 6)) #define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) +#define INAT_NO_REX2 (1 << (INAT_FLAG_OFFS + 8)) +#define INAT_REX2_VARIANT (1 << (INAT_FLAG_OFFS + 9)) /* Attribute making macros for attribute 
tables */ #define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) #define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) @@ -128,6 +132,11 @@ static inline int inat_is_rex_prefix(insn_attr_t attr) return (attr & INAT_PFX_MASK) == INAT_PFX_REX; } +static inline int inat_is_rex2_prefix(insn_attr_t attr) +{ + return (attr & INAT_PFX_MASK) == INAT_PFX_REX2; +} + static inline int inat_last_prefix_id(insn_attr_t attr) { if ((attr & INAT_PFX_MASK) > INAT_LSTPFX_MAX) diff --git a/tools/arch/x86/include/asm/insn.h b/tools/arch/x86/include/asm/insn.h index 65c0d9ce1e29..1a7e8fc4d75a 100644 --- a/tools/arch/x86/include/asm/insn.h +++ b/tools/arch/x86/include/asm/insn.h @@ -112,10 +112,15 @@ struct insn { #define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3) #define X86_SIB_BASE(sib) ((sib) & 0x07) -#define X86_REX_W(rex) ((rex) & 8) -#define X86_REX_R(rex) ((rex) & 4) -#define X86_REX_X(rex) ((rex) & 2) -#define X86_REX_B(rex) ((rex) & 1) +#define X86_REX2_M(rex) ((rex) & 0x80) /* REX2 M0 */ +#define X86_REX2_R(rex) ((rex) & 0x40) /* REX2 R4 */ +#define X86_REX2_X(rex) ((rex) & 0x20) /* REX2 X4 */ +#define X86_REX2_B(rex) ((rex) & 0x10) /* REX2 B4 */ + +#define X86_REX_W(rex) ((rex) & 8) /* REX or REX2 W */ +#define X86_REX_R(rex) ((rex) & 4) /* REX or REX2 R3 */ +#define X86_REX_X(rex) ((rex) & 2) /* REX or REX2 X3 */ +#define X86_REX_B(rex) ((rex) & 1) /* REX or REX2 B3 */ /* VEX bit flags */ #define X86_VEX_W(vex) ((vex) & 0x80) /* VEX3 Byte2 */ @@ -161,6 +166,18 @@ static inline void insn_get_attribute(struct insn *insn) /* Instruction uses RIP-relative addressing */ extern int insn_rip_relative(struct insn *insn); +static inline int insn_is_rex2(struct insn *insn) +{ + if (!insn->prefixes.got) + insn_get_prefixes(insn); + return insn->rex_prefix.nbytes == 2; +} + +static inline insn_byte_t insn_rex2_m_bit(struct insn *insn) +{ + return X86_REX2_M(insn->rex_prefix.bytes[1]); +} + static inline int insn_is_avx(struct insn *insn) { if (!insn->prefixes.got) diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c index ada4b4a79dd4..f761adeb8e8c 100644 --- a/tools/arch/x86/lib/insn.c +++ b/tools/arch/x86/lib/insn.c @@ -185,6 +185,17 @@ int insn_get_prefixes(struct insn *insn) if (X86_REX_W(b)) /* REX.W overrides opnd_size */ insn->opnd_bytes = 8; + } else if (inat_is_rex2_prefix(attr)) { + insn_set_byte(&insn->rex_prefix, 0, b); + b = peek_nbyte_next(insn_byte_t, insn, 1); + insn_set_byte(&insn->rex_prefix, 1, b); + insn->rex_prefix.nbytes = 2; + insn->next_byte += 2; + if (X86_REX_W(b)) + /* REX.W overrides opnd_size */ + insn->opnd_bytes = 8; + insn->rex_prefix.got = 1; + goto vex_end; } } insn->rex_prefix.got = 1; @@ -294,6 +305,20 @@ int insn_get_opcode(struct insn *insn) goto end; } + /* Check if there is REX2 prefix or not */ + if (insn_is_rex2(insn)) { + if (insn_rex2_m_bit(insn)) { + /* map 1 is escape 0x0f */ + insn_attr_t esc_attr = inat_get_opcode_attribute(0x0f); + + pfx_id = insn_last_prefix_id(insn); + insn->attr = inat_get_escape_attribute(op, pfx_id, esc_attr); + } else { + insn->attr = inat_get_opcode_attribute(op); + } + goto end; + } + insn->attr = inat_get_opcode_attribute(op); while (inat_is_escape(insn->attr)) { /* Get escaped opcode */ diff --git a/tools/arch/x86/tools/gen-insn-attr-x86.awk b/tools/arch/x86/tools/gen-insn-attr-x86.awk index af38469afd14..3f43aa7d8fef 100644 --- a/tools/arch/x86/tools/gen-insn-attr-x86.awk +++ b/tools/arch/x86/tools/gen-insn-attr-x86.awk @@ -64,7 +64,9 @@ BEGIN { modrm_expr = "^([CDEGMNPQRSUVW/][a-z]+|NTA|T[012])" force64_expr = "\\([df]64\\)" 
- rex_expr = "^REX(\\.[XRWB]+)*" + rex_expr = "^((REX(\\.[XRWB]+)+)|(REX$))" + rex2_expr = "\\(REX2\\)" + no_rex2_expr = "\\(!REX2\\)" fpu_expr = "^ESC" # TODO lprefix1_expr = "\\((66|!F3)\\)" @@ -99,6 +101,7 @@ BEGIN { prefix_num["VEX+1byte"] = "INAT_PFX_VEX2" prefix_num["VEX+2byte"] = "INAT_PFX_VEX3" prefix_num["EVEX"] = "INAT_PFX_EVEX" + prefix_num["REX2"] = "INAT_PFX_REX2" clear_vars() } @@ -314,6 +317,10 @@ function convert_operands(count,opnd, i,j,imm,mod) if (match(ext, force64_expr)) flags = add_flags(flags, "INAT_FORCE64") + # check REX2 not allowed + if (match(ext, no_rex2_expr)) + flags = add_flags(flags, "INAT_NO_REX2") + # check REX prefix if (match(opcode, rex_expr)) flags = add_flags(flags, "INAT_MAKE_PREFIX(INAT_PFX_REX)") @@ -351,6 +358,8 @@ function convert_operands(count,opnd, i,j,imm,mod) lptable3[idx] = add_flags(lptable3[idx],flags) variant = "INAT_VARIANT" } + if (match(ext, rex2_expr)) + table[idx] = add_flags(table[idx], "INAT_REX2_VARIANT") if (!match(ext, lprefix_expr)){ table[idx] = add_flags(table[idx],flags) } -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
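To visualize the REX2 payload byte that the new X86_REX2_* and widened
X86_REX_* macros pick apart, the bit layout implied by the masks above is:

  d5 <payload>    payload bits:  M0 R4 X4 B4 W R3 X3 B3
                                 b7 b6 b5 b4 b3 b2 b1 b0

M0 selects the opcode map (0 = one-byte map, 1 = the 0x0f escape map),
R4/X4/B4 provide the extra register-number bits needed to reach r16-r31,
and W/R3/X3/B3 keep their classic REX meanings.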
* Re: [PATCH 05/10] x86/insn: Add support for REX2 prefix to the instruction decoder logic 2024-05-02 10:58 ` [PATCH 05/10] x86/insn: Add support for REX2 prefix to the instruction decoder logic Adrian Hunter @ 2024-05-02 18:10 ` Ian Rogers 2024-05-03 5:09 ` Adrian Hunter 0 siblings, 1 reply; 18+ messages in thread From: Ian Rogers @ 2024-05-02 18:10 UTC (permalink / raw) To: Adrian Hunter Cc: linux-kernel, Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, linux-perf-users On Thu, May 2, 2024 at 3:59 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named > REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31. > > The REX2 prefix is effectively an extended version of the REX prefix. > > REX2 and EVEX are also used with PUSH/POP instructions to provide a > Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to > fast-forward register data between matching PUSH and POP instructions. > > REX2 is valid only with opcodes in maps 0 and 1. Similar extension for > other maps is provided by the EVEX prefix, covered in a separate patch. > > Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used > for a new 64-bit absolute direct jump instruction JMPABS. > > Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture > Specification for details. > > Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute > flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify > opcodes (only JMPABS) that require a mandatory REX2 prefix > (INAT_REX2_VARIANT). > > Amend logic to read the REX2 prefix and get the opcode attribute for the > map number (0 or 1) encoded in the REX2 prefix. > > Amend the awk script that generates the attribute tables from the opcode > map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)" > as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT. 
> > Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> > --- > arch/x86/include/asm/inat.h | 11 +++++++++- > arch/x86/include/asm/insn.h | 25 ++++++++++++++++++---- > arch/x86/lib/insn.c | 25 ++++++++++++++++++++++ > arch/x86/tools/gen-insn-attr-x86.awk | 11 +++++++++- > tools/arch/x86/include/asm/inat.h | 11 +++++++++- > tools/arch/x86/include/asm/insn.h | 25 ++++++++++++++++++---- > tools/arch/x86/lib/insn.c | 25 ++++++++++++++++++++++ > tools/arch/x86/tools/gen-insn-attr-x86.awk | 11 +++++++++- > 8 files changed, 132 insertions(+), 12 deletions(-) > > diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h > index b56c5741581a..1331bdd39a23 100644 > --- a/arch/x86/include/asm/inat.h > +++ b/arch/x86/include/asm/inat.h > @@ -35,6 +35,8 @@ > #define INAT_PFX_VEX2 13 /* 2-bytes VEX prefix */ > #define INAT_PFX_VEX3 14 /* 3-bytes VEX prefix */ > #define INAT_PFX_EVEX 15 /* EVEX prefix */ > +/* x86-64 REX2 prefix */ > +#define INAT_PFX_REX2 16 /* 0xD5 */ > > #define INAT_LSTPFX_MAX 3 > #define INAT_LGCPFX_MAX 11 > @@ -50,7 +52,7 @@ > > /* Legacy prefix */ > #define INAT_PFX_OFFS 0 > -#define INAT_PFX_BITS 4 > +#define INAT_PFX_BITS 5 > #define INAT_PFX_MAX ((1 << INAT_PFX_BITS) - 1) > #define INAT_PFX_MASK (INAT_PFX_MAX << INAT_PFX_OFFS) > /* Escape opcodes */ > @@ -77,6 +79,8 @@ > #define INAT_VEXOK (1 << (INAT_FLAG_OFFS + 5)) > #define INAT_VEXONLY (1 << (INAT_FLAG_OFFS + 6)) > #define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) > +#define INAT_NO_REX2 (1 << (INAT_FLAG_OFFS + 8)) > +#define INAT_REX2_VARIANT (1 << (INAT_FLAG_OFFS + 9)) > /* Attribute making macros for attribute tables */ > #define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) > #define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) > @@ -128,6 +132,11 @@ static inline int inat_is_rex_prefix(insn_attr_t attr) > return (attr & INAT_PFX_MASK) == INAT_PFX_REX; > } > > +static inline int inat_is_rex2_prefix(insn_attr_t attr) > +{ > + return (attr & INAT_PFX_MASK) == INAT_PFX_REX2; > +} > + > static inline int inat_last_prefix_id(insn_attr_t attr) > { > if ((attr & INAT_PFX_MASK) > INAT_LSTPFX_MAX) > diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h > index 1b29f58f730f..95249ec1f24e 100644 > --- a/arch/x86/include/asm/insn.h > +++ b/arch/x86/include/asm/insn.h > @@ -112,10 +112,15 @@ struct insn { > #define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3) > #define X86_SIB_BASE(sib) ((sib) & 0x07) > > -#define X86_REX_W(rex) ((rex) & 8) > -#define X86_REX_R(rex) ((rex) & 4) > -#define X86_REX_X(rex) ((rex) & 2) > -#define X86_REX_B(rex) ((rex) & 1) > +#define X86_REX2_M(rex) ((rex) & 0x80) /* REX2 M0 */ > +#define X86_REX2_R(rex) ((rex) & 0x40) /* REX2 R4 */ > +#define X86_REX2_X(rex) ((rex) & 0x20) /* REX2 X4 */ > +#define X86_REX2_B(rex) ((rex) & 0x10) /* REX2 B4 */ > + > +#define X86_REX_W(rex) ((rex) & 8) /* REX or REX2 W */ > +#define X86_REX_R(rex) ((rex) & 4) /* REX or REX2 R3 */ > +#define X86_REX_X(rex) ((rex) & 2) /* REX or REX2 X3 */ > +#define X86_REX_B(rex) ((rex) & 1) /* REX or REX2 B3 */ > > /* VEX bit flags */ > #define X86_VEX_W(vex) ((vex) & 0x80) /* VEX3 Byte2 */ > @@ -161,6 +166,18 @@ static inline void insn_get_attribute(struct insn *insn) > /* Instruction uses RIP-relative addressing */ > extern int insn_rip_relative(struct insn *insn); > > +static inline int insn_is_rex2(struct insn *insn) > +{ > + if (!insn->prefixes.got) > + insn_get_prefixes(insn); > + return insn->rex_prefix.nbytes == 2; It'd be nice to capture that a rex2 prefix is by definition 2 bytes. 
Playing devil's advocate, if there were a REX and a REX2 prefix, couldn't rex_prefix.nbytes be 3? I'm wondering about other prefix combinations that may confuse this logic, maybe someone dreams up doing this for say alignment reasons like "rep ret". Thanks, Ian > +} > + > +static inline insn_byte_t insn_rex2_m_bit(struct insn *insn) > +{ > + return X86_REX2_M(insn->rex_prefix.bytes[1]); > +} > + > static inline int insn_is_avx(struct insn *insn) > { > if (!insn->prefixes.got) > diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c > index 1bb155a0955b..6126ddc6e5f5 100644 > --- a/arch/x86/lib/insn.c > +++ b/arch/x86/lib/insn.c > @@ -185,6 +185,17 @@ int insn_get_prefixes(struct insn *insn) > if (X86_REX_W(b)) > /* REX.W overrides opnd_size */ > insn->opnd_bytes = 8; > + } else if (inat_is_rex2_prefix(attr)) { > + insn_set_byte(&insn->rex_prefix, 0, b); > + b = peek_nbyte_next(insn_byte_t, insn, 1); > + insn_set_byte(&insn->rex_prefix, 1, b); > + insn->rex_prefix.nbytes = 2; > + insn->next_byte += 2; > + if (X86_REX_W(b)) > + /* REX.W overrides opnd_size */ > + insn->opnd_bytes = 8; > + insn->rex_prefix.got = 1; > + goto vex_end; > } > } > insn->rex_prefix.got = 1; > @@ -294,6 +305,20 @@ int insn_get_opcode(struct insn *insn) > goto end; > } > > + /* Check if there is REX2 prefix or not */ > + if (insn_is_rex2(insn)) { > + if (insn_rex2_m_bit(insn)) { > + /* map 1 is escape 0x0f */ > + insn_attr_t esc_attr = inat_get_opcode_attribute(0x0f); > + > + pfx_id = insn_last_prefix_id(insn); > + insn->attr = inat_get_escape_attribute(op, pfx_id, esc_attr); > + } else { > + insn->attr = inat_get_opcode_attribute(op); > + } > + goto end; > + } > + > insn->attr = inat_get_opcode_attribute(op); > while (inat_is_escape(insn->attr)) { > /* Get escaped opcode */ > diff --git a/arch/x86/tools/gen-insn-attr-x86.awk b/arch/x86/tools/gen-insn-attr-x86.awk > index af38469afd14..3f43aa7d8fef 100644 > --- a/arch/x86/tools/gen-insn-attr-x86.awk > +++ b/arch/x86/tools/gen-insn-attr-x86.awk > @@ -64,7 +64,9 @@ BEGIN { > > modrm_expr = "^([CDEGMNPQRSUVW/][a-z]+|NTA|T[012])" > force64_expr = "\\([df]64\\)" > - rex_expr = "^REX(\\.[XRWB]+)*" > + rex_expr = "^((REX(\\.[XRWB]+)+)|(REX$))" > + rex2_expr = "\\(REX2\\)" > + no_rex2_expr = "\\(!REX2\\)" > fpu_expr = "^ESC" # TODO > > lprefix1_expr = "\\((66|!F3)\\)" > @@ -99,6 +101,7 @@ BEGIN { > prefix_num["VEX+1byte"] = "INAT_PFX_VEX2" > prefix_num["VEX+2byte"] = "INAT_PFX_VEX3" > prefix_num["EVEX"] = "INAT_PFX_EVEX" > + prefix_num["REX2"] = "INAT_PFX_REX2" > > clear_vars() > } > @@ -314,6 +317,10 @@ function convert_operands(count,opnd, i,j,imm,mod) > if (match(ext, force64_expr)) > flags = add_flags(flags, "INAT_FORCE64") > > + # check REX2 not allowed > + if (match(ext, no_rex2_expr)) > + flags = add_flags(flags, "INAT_NO_REX2") > + > # check REX prefix > if (match(opcode, rex_expr)) > flags = add_flags(flags, "INAT_MAKE_PREFIX(INAT_PFX_REX)") > @@ -351,6 +358,8 @@ function convert_operands(count,opnd, i,j,imm,mod) > lptable3[idx] = add_flags(lptable3[idx],flags) > variant = "INAT_VARIANT" > } > + if (match(ext, rex2_expr)) > + table[idx] = add_flags(table[idx], "INAT_REX2_VARIANT") > if (!match(ext, lprefix_expr)){ > table[idx] = add_flags(table[idx],flags) > } > diff --git a/tools/arch/x86/include/asm/inat.h b/tools/arch/x86/include/asm/inat.h > index a61051400311..2e65312cae52 100644 > --- a/tools/arch/x86/include/asm/inat.h > +++ b/tools/arch/x86/include/asm/inat.h > @@ -35,6 +35,8 @@ > #define INAT_PFX_VEX2 13 /* 2-bytes VEX prefix */ > #define INAT_PFX_VEX3 14 
/* 3-bytes VEX prefix */ > #define INAT_PFX_EVEX 15 /* EVEX prefix */ > +/* x86-64 REX2 prefix */ > +#define INAT_PFX_REX2 16 /* 0xD5 */ > > #define INAT_LSTPFX_MAX 3 > #define INAT_LGCPFX_MAX 11 > @@ -50,7 +52,7 @@ > > /* Legacy prefix */ > #define INAT_PFX_OFFS 0 > -#define INAT_PFX_BITS 4 > +#define INAT_PFX_BITS 5 > #define INAT_PFX_MAX ((1 << INAT_PFX_BITS) - 1) > #define INAT_PFX_MASK (INAT_PFX_MAX << INAT_PFX_OFFS) > /* Escape opcodes */ > @@ -77,6 +79,8 @@ > #define INAT_VEXOK (1 << (INAT_FLAG_OFFS + 5)) > #define INAT_VEXONLY (1 << (INAT_FLAG_OFFS + 6)) > #define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) > +#define INAT_NO_REX2 (1 << (INAT_FLAG_OFFS + 8)) > +#define INAT_REX2_VARIANT (1 << (INAT_FLAG_OFFS + 9)) > /* Attribute making macros for attribute tables */ > #define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) > #define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) > @@ -128,6 +132,11 @@ static inline int inat_is_rex_prefix(insn_attr_t attr) > return (attr & INAT_PFX_MASK) == INAT_PFX_REX; > } > > +static inline int inat_is_rex2_prefix(insn_attr_t attr) > +{ > + return (attr & INAT_PFX_MASK) == INAT_PFX_REX2; > +} > + > static inline int inat_last_prefix_id(insn_attr_t attr) > { > if ((attr & INAT_PFX_MASK) > INAT_LSTPFX_MAX) > diff --git a/tools/arch/x86/include/asm/insn.h b/tools/arch/x86/include/asm/insn.h > index 65c0d9ce1e29..1a7e8fc4d75a 100644 > --- a/tools/arch/x86/include/asm/insn.h > +++ b/tools/arch/x86/include/asm/insn.h > @@ -112,10 +112,15 @@ struct insn { > #define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3) > #define X86_SIB_BASE(sib) ((sib) & 0x07) > > -#define X86_REX_W(rex) ((rex) & 8) > -#define X86_REX_R(rex) ((rex) & 4) > -#define X86_REX_X(rex) ((rex) & 2) > -#define X86_REX_B(rex) ((rex) & 1) > +#define X86_REX2_M(rex) ((rex) & 0x80) /* REX2 M0 */ > +#define X86_REX2_R(rex) ((rex) & 0x40) /* REX2 R4 */ > +#define X86_REX2_X(rex) ((rex) & 0x20) /* REX2 X4 */ > +#define X86_REX2_B(rex) ((rex) & 0x10) /* REX2 B4 */ > + > +#define X86_REX_W(rex) ((rex) & 8) /* REX or REX2 W */ > +#define X86_REX_R(rex) ((rex) & 4) /* REX or REX2 R3 */ > +#define X86_REX_X(rex) ((rex) & 2) /* REX or REX2 X3 */ > +#define X86_REX_B(rex) ((rex) & 1) /* REX or REX2 B3 */ > > /* VEX bit flags */ > #define X86_VEX_W(vex) ((vex) & 0x80) /* VEX3 Byte2 */ > @@ -161,6 +166,18 @@ static inline void insn_get_attribute(struct insn *insn) > /* Instruction uses RIP-relative addressing */ > extern int insn_rip_relative(struct insn *insn); > > +static inline int insn_is_rex2(struct insn *insn) > +{ > + if (!insn->prefixes.got) > + insn_get_prefixes(insn); > + return insn->rex_prefix.nbytes == 2; > +} > + > +static inline insn_byte_t insn_rex2_m_bit(struct insn *insn) > +{ > + return X86_REX2_M(insn->rex_prefix.bytes[1]); > +} > + > static inline int insn_is_avx(struct insn *insn) > { > if (!insn->prefixes.got) > diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c > index ada4b4a79dd4..f761adeb8e8c 100644 > --- a/tools/arch/x86/lib/insn.c > +++ b/tools/arch/x86/lib/insn.c > @@ -185,6 +185,17 @@ int insn_get_prefixes(struct insn *insn) > if (X86_REX_W(b)) > /* REX.W overrides opnd_size */ > insn->opnd_bytes = 8; > + } else if (inat_is_rex2_prefix(attr)) { > + insn_set_byte(&insn->rex_prefix, 0, b); > + b = peek_nbyte_next(insn_byte_t, insn, 1); > + insn_set_byte(&insn->rex_prefix, 1, b); > + insn->rex_prefix.nbytes = 2; > + insn->next_byte += 2; > + if (X86_REX_W(b)) > + /* REX.W overrides opnd_size */ > + insn->opnd_bytes = 8; > + insn->rex_prefix.got = 1; > + goto vex_end; > } 
> } > insn->rex_prefix.got = 1; > @@ -294,6 +305,20 @@ int insn_get_opcode(struct insn *insn) > goto end; > } > > + /* Check if there is REX2 prefix or not */ > + if (insn_is_rex2(insn)) { > + if (insn_rex2_m_bit(insn)) { > + /* map 1 is escape 0x0f */ > + insn_attr_t esc_attr = inat_get_opcode_attribute(0x0f); > + > + pfx_id = insn_last_prefix_id(insn); > + insn->attr = inat_get_escape_attribute(op, pfx_id, esc_attr); > + } else { > + insn->attr = inat_get_opcode_attribute(op); > + } > + goto end; > + } > + > insn->attr = inat_get_opcode_attribute(op); > while (inat_is_escape(insn->attr)) { > /* Get escaped opcode */ > diff --git a/tools/arch/x86/tools/gen-insn-attr-x86.awk b/tools/arch/x86/tools/gen-insn-attr-x86.awk > index af38469afd14..3f43aa7d8fef 100644 > --- a/tools/arch/x86/tools/gen-insn-attr-x86.awk > +++ b/tools/arch/x86/tools/gen-insn-attr-x86.awk > @@ -64,7 +64,9 @@ BEGIN { > > modrm_expr = "^([CDEGMNPQRSUVW/][a-z]+|NTA|T[012])" > force64_expr = "\\([df]64\\)" > - rex_expr = "^REX(\\.[XRWB]+)*" > + rex_expr = "^((REX(\\.[XRWB]+)+)|(REX$))" > + rex2_expr = "\\(REX2\\)" > + no_rex2_expr = "\\(!REX2\\)" > fpu_expr = "^ESC" # TODO > > lprefix1_expr = "\\((66|!F3)\\)" > @@ -99,6 +101,7 @@ BEGIN { > prefix_num["VEX+1byte"] = "INAT_PFX_VEX2" > prefix_num["VEX+2byte"] = "INAT_PFX_VEX3" > prefix_num["EVEX"] = "INAT_PFX_EVEX" > + prefix_num["REX2"] = "INAT_PFX_REX2" > > clear_vars() > } > @@ -314,6 +317,10 @@ function convert_operands(count,opnd, i,j,imm,mod) > if (match(ext, force64_expr)) > flags = add_flags(flags, "INAT_FORCE64") > > + # check REX2 not allowed > + if (match(ext, no_rex2_expr)) > + flags = add_flags(flags, "INAT_NO_REX2") > + > # check REX prefix > if (match(opcode, rex_expr)) > flags = add_flags(flags, "INAT_MAKE_PREFIX(INAT_PFX_REX)") > @@ -351,6 +358,8 @@ function convert_operands(count,opnd, i,j,imm,mod) > lptable3[idx] = add_flags(lptable3[idx],flags) > variant = "INAT_VARIANT" > } > + if (match(ext, rex2_expr)) > + table[idx] = add_flags(table[idx], "INAT_REX2_VARIANT") > if (!match(ext, lprefix_expr)){ > table[idx] = add_flags(table[idx],flags) > } > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 05/10] x86/insn: Add support for REX2 prefix to the instruction decoder logic 2024-05-02 18:10 ` Ian Rogers @ 2024-05-03 5:09 ` Adrian Hunter 0 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-05-03 5:09 UTC (permalink / raw) To: Ian Rogers Cc: linux-kernel, Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, linux-perf-users On 2/05/24 21:10, Ian Rogers wrote: > On Thu, May 2, 2024 at 3:59 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >> >> Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named >> REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31. >> >> The REX2 prefix is effectively an extended version of the REX prefix. >> >> REX2 and EVEX are also used with PUSH/POP instructions to provide a >> Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to >> fast-forward register data between matching PUSH and POP instructions. >> >> REX2 is valid only with opcodes in maps 0 and 1. Similar extension for >> other maps is provided by the EVEX prefix, covered in a separate patch. >> >> Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used >> for a new 64-bit absolute direct jump instruction JMPABS. >> >> Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture >> Specification for details. >> >> Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute >> flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify >> opcodes (only JMPABS) that require a mandatory REX2 prefix >> (INAT_REX2_VARIANT). >> >> Amend logic to read the REX2 prefix and get the opcode attribute for the >> map number (0 or 1) encoded in the REX2 prefix. >> >> Amend the awk script that generates the attribute tables from the opcode >> map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)" >> as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT. 
>> >> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> >> --- >> arch/x86/include/asm/inat.h | 11 +++++++++- >> arch/x86/include/asm/insn.h | 25 ++++++++++++++++++---- >> arch/x86/lib/insn.c | 25 ++++++++++++++++++++++ >> arch/x86/tools/gen-insn-attr-x86.awk | 11 +++++++++- >> tools/arch/x86/include/asm/inat.h | 11 +++++++++- >> tools/arch/x86/include/asm/insn.h | 25 ++++++++++++++++++---- >> tools/arch/x86/lib/insn.c | 25 ++++++++++++++++++++++ >> tools/arch/x86/tools/gen-insn-attr-x86.awk | 11 +++++++++- >> 8 files changed, 132 insertions(+), 12 deletions(-) >> >> diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h >> index b56c5741581a..1331bdd39a23 100644 >> --- a/arch/x86/include/asm/inat.h >> +++ b/arch/x86/include/asm/inat.h >> @@ -35,6 +35,8 @@ >> #define INAT_PFX_VEX2 13 /* 2-bytes VEX prefix */ >> #define INAT_PFX_VEX3 14 /* 3-bytes VEX prefix */ >> #define INAT_PFX_EVEX 15 /* EVEX prefix */ >> +/* x86-64 REX2 prefix */ >> +#define INAT_PFX_REX2 16 /* 0xD5 */ >> >> #define INAT_LSTPFX_MAX 3 >> #define INAT_LGCPFX_MAX 11 >> @@ -50,7 +52,7 @@ >> >> /* Legacy prefix */ >> #define INAT_PFX_OFFS 0 >> -#define INAT_PFX_BITS 4 >> +#define INAT_PFX_BITS 5 >> #define INAT_PFX_MAX ((1 << INAT_PFX_BITS) - 1) >> #define INAT_PFX_MASK (INAT_PFX_MAX << INAT_PFX_OFFS) >> /* Escape opcodes */ >> @@ -77,6 +79,8 @@ >> #define INAT_VEXOK (1 << (INAT_FLAG_OFFS + 5)) >> #define INAT_VEXONLY (1 << (INAT_FLAG_OFFS + 6)) >> #define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) >> +#define INAT_NO_REX2 (1 << (INAT_FLAG_OFFS + 8)) >> +#define INAT_REX2_VARIANT (1 << (INAT_FLAG_OFFS + 9)) >> /* Attribute making macros for attribute tables */ >> #define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) >> #define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) >> @@ -128,6 +132,11 @@ static inline int inat_is_rex_prefix(insn_attr_t attr) >> return (attr & INAT_PFX_MASK) == INAT_PFX_REX; >> } >> >> +static inline int inat_is_rex2_prefix(insn_attr_t attr) >> +{ >> + return (attr & INAT_PFX_MASK) == INAT_PFX_REX2; >> +} >> + >> static inline int inat_last_prefix_id(insn_attr_t attr) >> { >> if ((attr & INAT_PFX_MASK) > INAT_LSTPFX_MAX) >> diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h >> index 1b29f58f730f..95249ec1f24e 100644 >> --- a/arch/x86/include/asm/insn.h >> +++ b/arch/x86/include/asm/insn.h >> @@ -112,10 +112,15 @@ struct insn { >> #define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3) >> #define X86_SIB_BASE(sib) ((sib) & 0x07) >> >> -#define X86_REX_W(rex) ((rex) & 8) >> -#define X86_REX_R(rex) ((rex) & 4) >> -#define X86_REX_X(rex) ((rex) & 2) >> -#define X86_REX_B(rex) ((rex) & 1) >> +#define X86_REX2_M(rex) ((rex) & 0x80) /* REX2 M0 */ >> +#define X86_REX2_R(rex) ((rex) & 0x40) /* REX2 R4 */ >> +#define X86_REX2_X(rex) ((rex) & 0x20) /* REX2 X4 */ >> +#define X86_REX2_B(rex) ((rex) & 0x10) /* REX2 B4 */ >> + >> +#define X86_REX_W(rex) ((rex) & 8) /* REX or REX2 W */ >> +#define X86_REX_R(rex) ((rex) & 4) /* REX or REX2 R3 */ >> +#define X86_REX_X(rex) ((rex) & 2) /* REX or REX2 X3 */ >> +#define X86_REX_B(rex) ((rex) & 1) /* REX or REX2 B3 */ >> >> /* VEX bit flags */ >> #define X86_VEX_W(vex) ((vex) & 0x80) /* VEX3 Byte2 */ >> @@ -161,6 +166,18 @@ static inline void insn_get_attribute(struct insn *insn) >> /* Instruction uses RIP-relative addressing */ >> extern int insn_rip_relative(struct insn *insn); >> >> +static inline int insn_is_rex2(struct insn *insn) >> +{ >> + if (!insn->prefixes.got) >> + insn_get_prefixes(insn); >> + return insn->rex_prefix.nbytes == 2; 
> > It'd be nice to capture that a rex2 prefix is by definition 2 bytes. > Playing devil's advocate, if there were a REX and a REX2 prefix, > couldn't rex_prefix.nbytes be 3? I'm wondering about other prefix > combinations that may confuse this logic, maybe someone dreams up > doing this for say alignment reasons like "rep ret". REX with REX2 is not allowed. ^ permalink raw reply [flat|nested] 18+ messages in thread
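A minimal sketch, for illustration only and not part of the patch: because a REX prefix is not allowed together with REX2, rex_prefix.nbytes == 2 can only mean that a REX2 prefix was decoded, so the check in insn_is_rex2() is sufficient as posted. A hypothetical, stricter variant that also compares the stored prefix byte against the REX2 opcode 0xD5 (reusing the struct insn fields declared in insn.h above) could look like:

static inline int insn_is_rex2_strict(struct insn *insn)
{
	if (!insn->prefixes.got)
		insn_get_prefixes(insn);
	/* REX2 is stored as its 0xD5 prefix byte plus one payload byte */
	return insn->rex_prefix.nbytes == 2 &&
	       insn->rex_prefix.bytes[0] == 0xd5;
}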
* [PATCH 06/10] x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter ` (4 preceding siblings ...) 2024-05-02 10:58 ` [PATCH 05/10] x86/insn: Add support for REX2 prefix to the instruction decoder logic Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-02 10:58 ` [PATCH 07/10] x86/insn: Add support for APX EVEX to the instruction decoder logic Adrian Hunter ` (4 subsequent siblings) 10 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users Support for REX2 has been added to the instruction decoder logic and the awk script that generates the attribute tables from the opcode map. Add REX2 prefix byte (0xD5) to the opcode map. Add annotation (!REX2) for map 0/1 opcodes that are reserved under REX2. Add JMPABS to the opcode map and add annotation (REX2) to identify that it has a mandatory REX2 prefix. A separate opcode attribute table is not needed at this time because JMPABS has the same attribute encoding as the MOV instruction that it shares an opcode with i.e. INAT_MOFFSET. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- arch/x86/lib/x86-opcode-map.txt | 148 +++++++++++++------------- tools/arch/x86/lib/x86-opcode-map.txt | 148 +++++++++++++------------- 2 files changed, 152 insertions(+), 144 deletions(-) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 4e9f53581b58..240ef714b64f 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -33,6 +33,10 @@ # - (F2): the last prefix is 0xF2 # - (!F3) : the last prefix is not 0xF3 (including non-last prefix case) # - (66&F2): Both 0x66 and 0xF2 prefixes are specified. +# +# REX2 Prefix +# - (!REX2): REX2 is not allowed +# - (REX2): REX2 variant e.g. 
JMPABS Table: one byte opcode Referrer: @@ -157,22 +161,22 @@ AVXcode: 6e: OUTS/OUTSB DX,Xb 6f: OUTS/OUTSW/OUTSD DX,Xz # 0x70 - 0x7f -70: JO Jb -71: JNO Jb -72: JB/JNAE/JC Jb -73: JNB/JAE/JNC Jb -74: JZ/JE Jb -75: JNZ/JNE Jb -76: JBE/JNA Jb -77: JNBE/JA Jb -78: JS Jb -79: JNS Jb -7a: JP/JPE Jb -7b: JNP/JPO Jb -7c: JL/JNGE Jb -7d: JNL/JGE Jb -7e: JLE/JNG Jb -7f: JNLE/JG Jb +70: JO Jb (!REX2) +71: JNO Jb (!REX2) +72: JB/JNAE/JC Jb (!REX2) +73: JNB/JAE/JNC Jb (!REX2) +74: JZ/JE Jb (!REX2) +75: JNZ/JNE Jb (!REX2) +76: JBE/JNA Jb (!REX2) +77: JNBE/JA Jb (!REX2) +78: JS Jb (!REX2) +79: JNS Jb (!REX2) +7a: JP/JPE Jb (!REX2) +7b: JNP/JPO Jb (!REX2) +7c: JL/JNGE Jb (!REX2) +7d: JNL/JGE Jb (!REX2) +7e: JLE/JNG Jb (!REX2) +7f: JNLE/JG Jb (!REX2) # 0x80 - 0x8f 80: Grp1 Eb,Ib (1A) 81: Grp1 Ev,Iz (1A) @@ -208,24 +212,24 @@ AVXcode: 9e: SAHF 9f: LAHF # 0xa0 - 0xaf -a0: MOV AL,Ob -a1: MOV rAX,Ov -a2: MOV Ob,AL -a3: MOV Ov,rAX -a4: MOVS/B Yb,Xb -a5: MOVS/W/D/Q Yv,Xv -a6: CMPS/B Xb,Yb -a7: CMPS/W/D Xv,Yv -a8: TEST AL,Ib -a9: TEST rAX,Iz -aa: STOS/B Yb,AL -ab: STOS/W/D/Q Yv,rAX -ac: LODS/B AL,Xb -ad: LODS/W/D/Q rAX,Xv -ae: SCAS/B AL,Yb +a0: MOV AL,Ob (!REX2) +a1: MOV rAX,Ov (!REX2) | JMPABS O (REX2),(o64) +a2: MOV Ob,AL (!REX2) +a3: MOV Ov,rAX (!REX2) +a4: MOVS/B Yb,Xb (!REX2) +a5: MOVS/W/D/Q Yv,Xv (!REX2) +a6: CMPS/B Xb,Yb (!REX2) +a7: CMPS/W/D Xv,Yv (!REX2) +a8: TEST AL,Ib (!REX2) +a9: TEST rAX,Iz (!REX2) +aa: STOS/B Yb,AL (!REX2) +ab: STOS/W/D/Q Yv,rAX (!REX2) +ac: LODS/B AL,Xb (!REX2) +ad: LODS/W/D/Q rAX,Xv (!REX2) +ae: SCAS/B AL,Yb (!REX2) # Note: The May 2011 Intel manual shows Xv for the second parameter of the # next instruction but Yv is correct -af: SCAS/W/D/Q rAX,Yv +af: SCAS/W/D/Q rAX,Yv (!REX2) # 0xb0 - 0xbf b0: MOV AL/R8L,Ib b1: MOV CL/R9L,Ib @@ -266,7 +270,7 @@ d1: Grp2 Ev,1 (1A) d2: Grp2 Eb,CL (1A) d3: Grp2 Ev,CL (1A) d4: AAM Ib (i64) -d5: AAD Ib (i64) +d5: AAD Ib (i64) | REX2 (Prefix),(o64) d6: d7: XLAT/XLATB d8: ESC @@ -281,26 +285,26 @@ df: ESC # Note: "forced64" is Intel CPU behavior: they ignore 0x66 prefix # in 64-bit mode. AMD CPUs accept 0x66 prefix, it causes RIP truncation # to 16 bits. In 32-bit mode, 0x66 is accepted by both Intel and AMD. -e0: LOOPNE/LOOPNZ Jb (f64) -e1: LOOPE/LOOPZ Jb (f64) -e2: LOOP Jb (f64) -e3: JrCXZ Jb (f64) -e4: IN AL,Ib -e5: IN eAX,Ib -e6: OUT Ib,AL -e7: OUT Ib,eAX +e0: LOOPNE/LOOPNZ Jb (f64) (!REX2) +e1: LOOPE/LOOPZ Jb (f64) (!REX2) +e2: LOOP Jb (f64) (!REX2) +e3: JrCXZ Jb (f64) (!REX2) +e4: IN AL,Ib (!REX2) +e5: IN eAX,Ib (!REX2) +e6: OUT Ib,AL (!REX2) +e7: OUT Ib,eAX (!REX2) # With 0x66 prefix in 64-bit mode, for AMD CPUs immediate offset # in "near" jumps and calls is 16-bit. For CALL, # push of return address is 16-bit wide, RSP is decremented by 2 # but is not truncated to 16 bits, unlike RIP. 
-e8: CALL Jz (f64) -e9: JMP-near Jz (f64) -ea: JMP-far Ap (i64) -eb: JMP-short Jb (f64) -ec: IN AL,DX -ed: IN eAX,DX -ee: OUT DX,AL -ef: OUT DX,eAX +e8: CALL Jz (f64) (!REX2) +e9: JMP-near Jz (f64) (!REX2) +ea: JMP-far Ap (i64) (!REX2) +eb: JMP-short Jb (f64) (!REX2) +ec: IN AL,DX (!REX2) +ed: IN eAX,DX (!REX2) +ee: OUT DX,AL (!REX2) +ef: OUT DX,eAX (!REX2) # 0xf0 - 0xff f0: LOCK (Prefix) f1: @@ -386,14 +390,14 @@ AVXcode: 1 2e: vucomiss Vss,Wss (v1) | vucomisd Vsd,Wsd (66),(v1) 2f: vcomiss Vss,Wss (v1) | vcomisd Vsd,Wsd (66),(v1) # 0x0f 0x30-0x3f -30: WRMSR -31: RDTSC -32: RDMSR -33: RDPMC -34: SYSENTER -35: SYSEXIT +30: WRMSR (!REX2) +31: RDTSC (!REX2) +32: RDMSR (!REX2) +33: RDPMC (!REX2) +34: SYSENTER (!REX2) +35: SYSEXIT (!REX2) 36: -37: GETSEC +37: GETSEC (!REX2) 38: escape # 3-byte escape 1 39: 3a: escape # 3-byte escape 2 @@ -473,22 +477,22 @@ AVXcode: 1 7f: movq Qq,Pq | vmovdqa Wx,Vx (66) | vmovdqa32/64 Wx,Vx (66),(evo) | vmovdqu Wx,Vx (F3) | vmovdqu32/64 Wx,Vx (F3),(evo) | vmovdqu8/16 Wx,Vx (F2),(ev) # 0x0f 0x80-0x8f # Note: "forced64" is Intel CPU behavior (see comment about CALL insn). -80: JO Jz (f64) -81: JNO Jz (f64) -82: JB/JC/JNAE Jz (f64) -83: JAE/JNB/JNC Jz (f64) -84: JE/JZ Jz (f64) -85: JNE/JNZ Jz (f64) -86: JBE/JNA Jz (f64) -87: JA/JNBE Jz (f64) -88: JS Jz (f64) -89: JNS Jz (f64) -8a: JP/JPE Jz (f64) -8b: JNP/JPO Jz (f64) -8c: JL/JNGE Jz (f64) -8d: JNL/JGE Jz (f64) -8e: JLE/JNG Jz (f64) -8f: JNLE/JG Jz (f64) +80: JO Jz (f64) (!REX2) +81: JNO Jz (f64) (!REX2) +82: JB/JC/JNAE Jz (f64) (!REX2) +83: JAE/JNB/JNC Jz (f64) (!REX2) +84: JE/JZ Jz (f64) (!REX2) +85: JNE/JNZ Jz (f64) (!REX2) +86: JBE/JNA Jz (f64) (!REX2) +87: JA/JNBE Jz (f64) (!REX2) +88: JS Jz (f64) (!REX2) +89: JNS Jz (f64) (!REX2) +8a: JP/JPE Jz (f64) (!REX2) +8b: JNP/JPO Jz (f64) (!REX2) +8c: JL/JNGE Jz (f64) (!REX2) +8d: JNL/JGE Jz (f64) (!REX2) +8e: JLE/JNG Jz (f64) (!REX2) +8f: JNLE/JG Jz (f64) (!REX2) # 0x0f 0x90-0x9f 90: SETO Eb | kmovw/q Vk,Wk | kmovb/d Vk,Wk (66) 91: SETNO Eb | kmovw/q Mv,Vk | kmovb/d Mv,Vk (66) diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt index 4e9f53581b58..240ef714b64f 100644 --- a/tools/arch/x86/lib/x86-opcode-map.txt +++ b/tools/arch/x86/lib/x86-opcode-map.txt @@ -33,6 +33,10 @@ # - (F2): the last prefix is 0xF2 # - (!F3) : the last prefix is not 0xF3 (including non-last prefix case) # - (66&F2): Both 0x66 and 0xF2 prefixes are specified. +# +# REX2 Prefix +# - (!REX2): REX2 is not allowed +# - (REX2): REX2 variant e.g. 
JMPABS Table: one byte opcode Referrer: @@ -157,22 +161,22 @@ AVXcode: 6e: OUTS/OUTSB DX,Xb 6f: OUTS/OUTSW/OUTSD DX,Xz # 0x70 - 0x7f -70: JO Jb -71: JNO Jb -72: JB/JNAE/JC Jb -73: JNB/JAE/JNC Jb -74: JZ/JE Jb -75: JNZ/JNE Jb -76: JBE/JNA Jb -77: JNBE/JA Jb -78: JS Jb -79: JNS Jb -7a: JP/JPE Jb -7b: JNP/JPO Jb -7c: JL/JNGE Jb -7d: JNL/JGE Jb -7e: JLE/JNG Jb -7f: JNLE/JG Jb +70: JO Jb (!REX2) +71: JNO Jb (!REX2) +72: JB/JNAE/JC Jb (!REX2) +73: JNB/JAE/JNC Jb (!REX2) +74: JZ/JE Jb (!REX2) +75: JNZ/JNE Jb (!REX2) +76: JBE/JNA Jb (!REX2) +77: JNBE/JA Jb (!REX2) +78: JS Jb (!REX2) +79: JNS Jb (!REX2) +7a: JP/JPE Jb (!REX2) +7b: JNP/JPO Jb (!REX2) +7c: JL/JNGE Jb (!REX2) +7d: JNL/JGE Jb (!REX2) +7e: JLE/JNG Jb (!REX2) +7f: JNLE/JG Jb (!REX2) # 0x80 - 0x8f 80: Grp1 Eb,Ib (1A) 81: Grp1 Ev,Iz (1A) @@ -208,24 +212,24 @@ AVXcode: 9e: SAHF 9f: LAHF # 0xa0 - 0xaf -a0: MOV AL,Ob -a1: MOV rAX,Ov -a2: MOV Ob,AL -a3: MOV Ov,rAX -a4: MOVS/B Yb,Xb -a5: MOVS/W/D/Q Yv,Xv -a6: CMPS/B Xb,Yb -a7: CMPS/W/D Xv,Yv -a8: TEST AL,Ib -a9: TEST rAX,Iz -aa: STOS/B Yb,AL -ab: STOS/W/D/Q Yv,rAX -ac: LODS/B AL,Xb -ad: LODS/W/D/Q rAX,Xv -ae: SCAS/B AL,Yb +a0: MOV AL,Ob (!REX2) +a1: MOV rAX,Ov (!REX2) | JMPABS O (REX2),(o64) +a2: MOV Ob,AL (!REX2) +a3: MOV Ov,rAX (!REX2) +a4: MOVS/B Yb,Xb (!REX2) +a5: MOVS/W/D/Q Yv,Xv (!REX2) +a6: CMPS/B Xb,Yb (!REX2) +a7: CMPS/W/D Xv,Yv (!REX2) +a8: TEST AL,Ib (!REX2) +a9: TEST rAX,Iz (!REX2) +aa: STOS/B Yb,AL (!REX2) +ab: STOS/W/D/Q Yv,rAX (!REX2) +ac: LODS/B AL,Xb (!REX2) +ad: LODS/W/D/Q rAX,Xv (!REX2) +ae: SCAS/B AL,Yb (!REX2) # Note: The May 2011 Intel manual shows Xv for the second parameter of the # next instruction but Yv is correct -af: SCAS/W/D/Q rAX,Yv +af: SCAS/W/D/Q rAX,Yv (!REX2) # 0xb0 - 0xbf b0: MOV AL/R8L,Ib b1: MOV CL/R9L,Ib @@ -266,7 +270,7 @@ d1: Grp2 Ev,1 (1A) d2: Grp2 Eb,CL (1A) d3: Grp2 Ev,CL (1A) d4: AAM Ib (i64) -d5: AAD Ib (i64) +d5: AAD Ib (i64) | REX2 (Prefix),(o64) d6: d7: XLAT/XLATB d8: ESC @@ -281,26 +285,26 @@ df: ESC # Note: "forced64" is Intel CPU behavior: they ignore 0x66 prefix # in 64-bit mode. AMD CPUs accept 0x66 prefix, it causes RIP truncation # to 16 bits. In 32-bit mode, 0x66 is accepted by both Intel and AMD. -e0: LOOPNE/LOOPNZ Jb (f64) -e1: LOOPE/LOOPZ Jb (f64) -e2: LOOP Jb (f64) -e3: JrCXZ Jb (f64) -e4: IN AL,Ib -e5: IN eAX,Ib -e6: OUT Ib,AL -e7: OUT Ib,eAX +e0: LOOPNE/LOOPNZ Jb (f64) (!REX2) +e1: LOOPE/LOOPZ Jb (f64) (!REX2) +e2: LOOP Jb (f64) (!REX2) +e3: JrCXZ Jb (f64) (!REX2) +e4: IN AL,Ib (!REX2) +e5: IN eAX,Ib (!REX2) +e6: OUT Ib,AL (!REX2) +e7: OUT Ib,eAX (!REX2) # With 0x66 prefix in 64-bit mode, for AMD CPUs immediate offset # in "near" jumps and calls is 16-bit. For CALL, # push of return address is 16-bit wide, RSP is decremented by 2 # but is not truncated to 16 bits, unlike RIP. 
-e8: CALL Jz (f64) -e9: JMP-near Jz (f64) -ea: JMP-far Ap (i64) -eb: JMP-short Jb (f64) -ec: IN AL,DX -ed: IN eAX,DX -ee: OUT DX,AL -ef: OUT DX,eAX +e8: CALL Jz (f64) (!REX2) +e9: JMP-near Jz (f64) (!REX2) +ea: JMP-far Ap (i64) (!REX2) +eb: JMP-short Jb (f64) (!REX2) +ec: IN AL,DX (!REX2) +ed: IN eAX,DX (!REX2) +ee: OUT DX,AL (!REX2) +ef: OUT DX,eAX (!REX2) # 0xf0 - 0xff f0: LOCK (Prefix) f1: @@ -386,14 +390,14 @@ AVXcode: 1 2e: vucomiss Vss,Wss (v1) | vucomisd Vsd,Wsd (66),(v1) 2f: vcomiss Vss,Wss (v1) | vcomisd Vsd,Wsd (66),(v1) # 0x0f 0x30-0x3f -30: WRMSR -31: RDTSC -32: RDMSR -33: RDPMC -34: SYSENTER -35: SYSEXIT +30: WRMSR (!REX2) +31: RDTSC (!REX2) +32: RDMSR (!REX2) +33: RDPMC (!REX2) +34: SYSENTER (!REX2) +35: SYSEXIT (!REX2) 36: -37: GETSEC +37: GETSEC (!REX2) 38: escape # 3-byte escape 1 39: 3a: escape # 3-byte escape 2 @@ -473,22 +477,22 @@ AVXcode: 1 7f: movq Qq,Pq | vmovdqa Wx,Vx (66) | vmovdqa32/64 Wx,Vx (66),(evo) | vmovdqu Wx,Vx (F3) | vmovdqu32/64 Wx,Vx (F3),(evo) | vmovdqu8/16 Wx,Vx (F2),(ev) # 0x0f 0x80-0x8f # Note: "forced64" is Intel CPU behavior (see comment about CALL insn). -80: JO Jz (f64) -81: JNO Jz (f64) -82: JB/JC/JNAE Jz (f64) -83: JAE/JNB/JNC Jz (f64) -84: JE/JZ Jz (f64) -85: JNE/JNZ Jz (f64) -86: JBE/JNA Jz (f64) -87: JA/JNBE Jz (f64) -88: JS Jz (f64) -89: JNS Jz (f64) -8a: JP/JPE Jz (f64) -8b: JNP/JPO Jz (f64) -8c: JL/JNGE Jz (f64) -8d: JNL/JGE Jz (f64) -8e: JLE/JNG Jz (f64) -8f: JNLE/JG Jz (f64) +80: JO Jz (f64) (!REX2) +81: JNO Jz (f64) (!REX2) +82: JB/JC/JNAE Jz (f64) (!REX2) +83: JAE/JNB/JNC Jz (f64) (!REX2) +84: JE/JZ Jz (f64) (!REX2) +85: JNE/JNZ Jz (f64) (!REX2) +86: JBE/JNA Jz (f64) (!REX2) +87: JA/JNBE Jz (f64) (!REX2) +88: JS Jz (f64) (!REX2) +89: JNS Jz (f64) (!REX2) +8a: JP/JPE Jz (f64) (!REX2) +8b: JNP/JPO Jz (f64) (!REX2) +8c: JL/JNGE Jz (f64) (!REX2) +8d: JNL/JGE Jz (f64) (!REX2) +8e: JLE/JNG Jz (f64) (!REX2) +8f: JNLE/JG Jz (f64) (!REX2) # 0x0f 0x90-0x9f 90: SETO Eb | kmovw/q Vk,Wk | kmovb/d Vk,Wk (66) 91: SETNO Eb | kmovw/q Mv,Vk | kmovb/d Mv,Vk (66) -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
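As a worked example of the new map entries (the byte values are taken from the decoder test added in patch 10, not from this patch), a JMPABS instruction is the 0xD5 REX2 prefix row followed by the 0xa1 JMPABS row and a 64-bit absolute target:

/* d5 00 a1 ef cd ab 90 78 56 34 12	jmpabs $0x1234567890abcdef */
static const unsigned char jmpabs_example[] = {
	0xd5,			/* REX2 prefix byte (row d5 above) */
	0x00,			/* REX2 payload: M0 bit clear, i.e. opcode map 0 */
	0xa1,			/* opcode a1: JMPABS O (REX2),(o64) */
	0xef, 0xcd, 0xab, 0x90,	/* 64-bit absolute target, */
	0x78, 0x56, 0x34, 0x12,	/* little endian 0x1234567890abcdef */
};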
* [PATCH 07/10] x86/insn: Add support for APX EVEX to the instruction decoder logic 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter ` (5 preceding siblings ...) 2024-05-02 10:58 ` [PATCH 06/10] x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-02 10:58 ` [PATCH 08/10] x86/insn: Add support for APX EVEX instructions to the opcode map Adrian Hunter ` (3 subsequent siblings) 10 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users Intel Advanced Performance Extensions (APX) extends the EVEX prefix to support: - extended general purpose registers (EGPRs) i.e. r16 to r31 - Push-Pop Acceleration (PPX) hints - new data destination (NDD) register - suppress status flags writes (NF) of common instructions - new instructions Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. The extended EVEX prefix does not need amended instruction decoder logic, except in one area. Some instructions are defined as SCALABLE which means the EVEX.W bit and EVEX.pp bits are used to determine operand size. Specifically, if an instruction is SCALABLE and EVEX.W is zero, then EVEX.pp value 0 (representing no prefix NP) means default operand size, whereas EVEX.pp value 1 (representing 66 prefix) means operand size override i.e. 16 bits Add an attribute (INAT_EVEX_SCALABLE) to identify such instructions, and amend the logic appropriately. Amend the awk script that generates the attribute tables from the opcode map, to recognise "(es)" as attribute INAT_EVEX_SCALABLE. 
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- arch/x86/include/asm/inat.h | 6 ++++++ arch/x86/include/asm/insn.h | 7 +++++++ arch/x86/lib/insn.c | 4 ++++ arch/x86/tools/gen-insn-attr-x86.awk | 4 ++++ tools/arch/x86/include/asm/inat.h | 6 ++++++ tools/arch/x86/include/asm/insn.h | 7 +++++++ tools/arch/x86/lib/insn.c | 4 ++++ tools/arch/x86/tools/gen-insn-attr-x86.awk | 4 ++++ 8 files changed, 42 insertions(+) diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h index 1331bdd39a23..53e4015242b4 100644 --- a/arch/x86/include/asm/inat.h +++ b/arch/x86/include/asm/inat.h @@ -81,6 +81,7 @@ #define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) #define INAT_NO_REX2 (1 << (INAT_FLAG_OFFS + 8)) #define INAT_REX2_VARIANT (1 << (INAT_FLAG_OFFS + 9)) +#define INAT_EVEX_SCALABLE (1 << (INAT_FLAG_OFFS + 10)) /* Attribute making macros for attribute tables */ #define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) #define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) @@ -236,4 +237,9 @@ static inline int inat_must_evex(insn_attr_t attr) { return attr & INAT_EVEXONLY; } + +static inline int inat_evex_scalable(insn_attr_t attr) +{ + return attr & INAT_EVEX_SCALABLE; +} #endif diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h index 95249ec1f24e..7152ea809e6a 100644 --- a/arch/x86/include/asm/insn.h +++ b/arch/x86/include/asm/insn.h @@ -215,6 +215,13 @@ static inline insn_byte_t insn_vex_p_bits(struct insn *insn) return X86_VEX_P(insn->vex_prefix.bytes[2]); } +static inline insn_byte_t insn_vex_w_bit(struct insn *insn) +{ + if (insn->vex_prefix.nbytes < 3) + return 0; + return X86_VEX_W(insn->vex_prefix.bytes[2]); +} + /* Get the last prefix id from last prefix or VEX prefix */ static inline int insn_last_prefix_id(struct insn *insn) { diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c index 6126ddc6e5f5..5952ab41c60f 100644 --- a/arch/x86/lib/insn.c +++ b/arch/x86/lib/insn.c @@ -294,6 +294,10 @@ int insn_get_opcode(struct insn *insn) m = insn_vex_m_bits(insn); p = insn_vex_p_bits(insn); insn->attr = inat_get_avx_attribute(op, m, p); + /* SCALABLE EVEX uses p bits to encode operand size */ + if (inat_evex_scalable(insn->attr) && !insn_vex_w_bit(insn) && + p == INAT_PFX_OPNDSZ) + insn->opnd_bytes = 2; if ((inat_must_evex(insn->attr) && !insn_is_evex(insn)) || (!inat_accept_vex(insn->attr) && !inat_is_group(insn->attr))) { diff --git a/arch/x86/tools/gen-insn-attr-x86.awk b/arch/x86/tools/gen-insn-attr-x86.awk index 3f43aa7d8fef..5770c8097f32 100644 --- a/arch/x86/tools/gen-insn-attr-x86.awk +++ b/arch/x86/tools/gen-insn-attr-x86.awk @@ -83,6 +83,8 @@ BEGIN { vexonly_expr = "\\(v\\)" # All opcodes with (ev) superscript supports *only* EVEX prefix evexonly_expr = "\\(ev\\)" + # (es) is the same as (ev) but also "SCALABLE" i.e. 
W and pp determine operand size + evex_scalable_expr = "\\(es\\)" prefix_expr = "\\(Prefix\\)" prefix_num["Operand-Size"] = "INAT_PFX_OPNDSZ" @@ -332,6 +334,8 @@ function convert_operands(count,opnd, i,j,imm,mod) # check VEX codes if (match(ext, evexonly_expr)) flags = add_flags(flags, "INAT_VEXOK | INAT_EVEXONLY") + else if (match(ext, evex_scalable_expr)) + flags = add_flags(flags, "INAT_VEXOK | INAT_EVEXONLY | INAT_EVEX_SCALABLE") else if (match(ext, vexonly_expr)) flags = add_flags(flags, "INAT_VEXOK | INAT_VEXONLY") else if (match(ext, vexok_expr) || match(opcode, vexok_opcode_expr)) diff --git a/tools/arch/x86/include/asm/inat.h b/tools/arch/x86/include/asm/inat.h index 2e65312cae52..253690eb3c26 100644 --- a/tools/arch/x86/include/asm/inat.h +++ b/tools/arch/x86/include/asm/inat.h @@ -81,6 +81,7 @@ #define INAT_EVEXONLY (1 << (INAT_FLAG_OFFS + 7)) #define INAT_NO_REX2 (1 << (INAT_FLAG_OFFS + 8)) #define INAT_REX2_VARIANT (1 << (INAT_FLAG_OFFS + 9)) +#define INAT_EVEX_SCALABLE (1 << (INAT_FLAG_OFFS + 10)) /* Attribute making macros for attribute tables */ #define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS) #define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS) @@ -236,4 +237,9 @@ static inline int inat_must_evex(insn_attr_t attr) { return attr & INAT_EVEXONLY; } + +static inline int inat_evex_scalable(insn_attr_t attr) +{ + return attr & INAT_EVEX_SCALABLE; +} #endif diff --git a/tools/arch/x86/include/asm/insn.h b/tools/arch/x86/include/asm/insn.h index 1a7e8fc4d75a..0e5abd896ad4 100644 --- a/tools/arch/x86/include/asm/insn.h +++ b/tools/arch/x86/include/asm/insn.h @@ -215,6 +215,13 @@ static inline insn_byte_t insn_vex_p_bits(struct insn *insn) return X86_VEX_P(insn->vex_prefix.bytes[2]); } +static inline insn_byte_t insn_vex_w_bit(struct insn *insn) +{ + if (insn->vex_prefix.nbytes < 3) + return 0; + return X86_VEX_W(insn->vex_prefix.bytes[2]); +} + /* Get the last prefix id from last prefix or VEX prefix */ static inline int insn_last_prefix_id(struct insn *insn) { diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c index f761adeb8e8c..a43b37346a22 100644 --- a/tools/arch/x86/lib/insn.c +++ b/tools/arch/x86/lib/insn.c @@ -294,6 +294,10 @@ int insn_get_opcode(struct insn *insn) m = insn_vex_m_bits(insn); p = insn_vex_p_bits(insn); insn->attr = inat_get_avx_attribute(op, m, p); + /* SCALABLE EVEX uses p bits to encode operand size */ + if (inat_evex_scalable(insn->attr) && !insn_vex_w_bit(insn) && + p == INAT_PFX_OPNDSZ) + insn->opnd_bytes = 2; if ((inat_must_evex(insn->attr) && !insn_is_evex(insn)) || (!inat_accept_vex(insn->attr) && !inat_is_group(insn->attr))) { diff --git a/tools/arch/x86/tools/gen-insn-attr-x86.awk b/tools/arch/x86/tools/gen-insn-attr-x86.awk index 3f43aa7d8fef..5770c8097f32 100644 --- a/tools/arch/x86/tools/gen-insn-attr-x86.awk +++ b/tools/arch/x86/tools/gen-insn-attr-x86.awk @@ -83,6 +83,8 @@ BEGIN { vexonly_expr = "\\(v\\)" # All opcodes with (ev) superscript supports *only* EVEX prefix evexonly_expr = "\\(ev\\)" + # (es) is the same as (ev) but also "SCALABLE" i.e. 
W and pp determine operand size + evex_scalable_expr = "\\(es\\)" prefix_expr = "\\(Prefix\\)" prefix_num["Operand-Size"] = "INAT_PFX_OPNDSZ" @@ -332,6 +334,8 @@ function convert_operands(count,opnd, i,j,imm,mod) # check VEX codes if (match(ext, evexonly_expr)) flags = add_flags(flags, "INAT_VEXOK | INAT_EVEXONLY") + else if (match(ext, evex_scalable_expr)) + flags = add_flags(flags, "INAT_VEXOK | INAT_EVEXONLY | INAT_EVEX_SCALABLE") else if (match(ext, vexonly_expr)) flags = add_flags(flags, "INAT_VEXOK | INAT_VEXONLY") else if (match(ext, vexok_expr) || match(opcode, vexok_opcode_expr)) -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
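A minimal sketch, for illustration only (the in-tree change is the insn_get_opcode() hunk above): the SCALABLE operand-size rule from the commit message, i.e. EVEX.W = 0 with pp = 0 (NP) keeps the default operand size while EVEX.W = 0 with pp = 1 (the 66 prefix) selects 16-bit operands, expressed as a standalone helper. Here default_bytes stands for whatever operand size the decoder has already established.

static int scalable_opnd_bytes(int evex_w, int evex_pp, int default_bytes)
{
	/* pp value 1 represents the 66 (operand-size override) prefix */
	if (!evex_w && evex_pp == 1)
		return 2;		/* operand size override: 16-bit */
	return default_bytes;		/* otherwise keep the established size */
}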
* [PATCH 08/10] x86/insn: Add support for APX EVEX instructions to the opcode map 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter ` (6 preceding siblings ...) 2024-05-02 10:58 ` [PATCH 07/10] x86/insn: Add support for APX EVEX to the instruction decoder logic Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-02 10:58 ` [PATCH 09/10] perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder Adrian Hunter ` (2 subsequent siblings) 10 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users To support APX functionality, the EVEX prefix is used to: - promote legacy instructions - promote VEX instructions - add new instructions Promoted VEX instructions require no extra annotation because the opcodes do not change and the permissive nature of the instruction decoder already allows them to have an EVEX prefix. Promoted legacy instructions and new instructions are placed in map 4 which has not been used before. Create a new table for map 4 and add APX instructions. Annotate SCALABLE instructions with "(es)" - refer to patch "x86/insn: Add support for APX EVEX to the instruction decoder logic". SCALABLE instructions must be represented in both no-prefix (NP) and 66 prefix forms. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- arch/x86/lib/x86-opcode-map.txt | 93 +++++++++++++++++++++++++++ tools/arch/x86/lib/x86-opcode-map.txt | 93 +++++++++++++++++++++++++++ 2 files changed, 186 insertions(+) diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt index 240ef714b64f..caedb3ef6688 100644 --- a/arch/x86/lib/x86-opcode-map.txt +++ b/arch/x86/lib/x86-opcode-map.txt @@ -23,6 +23,7 @@ # # AVX Superscripts # (ev): this opcode requires EVEX prefix. +# (es): this opcode requires EVEX prefix and is SCALABALE. # (evo): this opcode is changed by EVEX prefix (EVEX opcode) # (v): this opcode requires VEX prefix. # (v1): this opcode only supports 128bit VEX. 
@@ -929,6 +930,98 @@ df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1) f0: RORX Gy,Ey,Ib (F2),(v) | HRESET Gv,Ib (F3),(000),(11B) EndTable +Table: EVEX map 4 +Referrer: +AVXcode: 4 +00: ADD Eb,Gb (ev) +01: ADD Ev,Gv (es) | ADD Ev,Gv (66),(es) +02: ADD Gb,Eb (ev) +03: ADD Gv,Ev (es) | ADD Gv,Ev (66),(es) +08: OR Eb,Gb (ev) +09: OR Ev,Gv (es) | OR Ev,Gv (66),(es) +0a: OR Gb,Eb (ev) +0b: OR Gv,Ev (es) | OR Gv,Ev (66),(es) +10: ADC Eb,Gb (ev) +11: ADC Ev,Gv (es) | ADC Ev,Gv (66),(es) +12: ADC Gb,Eb (ev) +13: ADC Gv,Ev (es) | ADC Gv,Ev (66),(es) +18: SBB Eb,Gb (ev) +19: SBB Ev,Gv (es) | SBB Ev,Gv (66),(es) +1a: SBB Gb,Eb (ev) +1b: SBB Gv,Ev (es) | SBB Gv,Ev (66),(es) +20: AND Eb,Gb (ev) +21: AND Ev,Gv (es) | AND Ev,Gv (66),(es) +22: AND Gb,Eb (ev) +23: AND Gv,Ev (es) | AND Gv,Ev (66),(es) +24: SHLD Ev,Gv,Ib (es) | SHLD Ev,Gv,Ib (66),(es) +28: SUB Eb,Gb (ev) +29: SUB Ev,Gv (es) | SUB Ev,Gv (66),(es) +2a: SUB Gb,Eb (ev) +2b: SUB Gv,Ev (es) | SUB Gv,Ev (66),(es) +2c: SHRD Ev,Gv,Ib (es) | SHRD Ev,Gv,Ib (66),(es) +30: XOR Eb,Gb (ev) +31: XOR Ev,Gv (es) | XOR Ev,Gv (66),(es) +32: XOR Gb,Eb (ev) +33: XOR Gv,Ev (es) | XOR Gv,Ev (66),(es) +# CCMPSCC instructions are: CCOMB, CCOMBE, CCOMF, CCOML, CCOMLE, CCOMNB, CCOMNBE, CCOMNL, CCOMNLE, +# CCOMNO, CCOMNS, CCOMNZ, CCOMO, CCOMS, CCOMT, CCOMZ +38: CCMPSCC Eb,Gb (ev) +39: CCMPSCC Ev,Gv (es) | CCMPSCC Ev,Gv (66),(es) +3a: CCMPSCC Gv,Ev (ev) +3b: CCMPSCC Gv,Ev (es) | CCMPSCC Gv,Ev (66),(es) +40: CMOVO Gv,Ev (es) | CMOVO Gv,Ev (66),(es) | CFCMOVO Ev,Ev (es) | CFCMOVO Ev,Ev (66),(es) | SETO Eb (F2),(ev) +41: CMOVNO Gv,Ev (es) | CMOVNO Gv,Ev (66),(es) | CFCMOVNO Ev,Ev (es) | CFCMOVNO Ev,Ev (66),(es) | SETNO Eb (F2),(ev) +42: CMOVB Gv,Ev (es) | CMOVB Gv,Ev (66),(es) | CFCMOVB Ev,Ev (es) | CFCMOVB Ev,Ev (66),(es) | SETB Eb (F2),(ev) +43: CMOVNB Gv,Ev (es) | CMOVNB Gv,Ev (66),(es) | CFCMOVNB Ev,Ev (es) | CFCMOVNB Ev,Ev (66),(es) | SETNB Eb (F2),(ev) +44: CMOVZ Gv,Ev (es) | CMOVZ Gv,Ev (66),(es) | CFCMOVZ Ev,Ev (es) | CFCMOVZ Ev,Ev (66),(es) | SETZ Eb (F2),(ev) +45: CMOVNZ Gv,Ev (es) | CMOVNZ Gv,Ev (66),(es) | CFCMOVNZ Ev,Ev (es) | CFCMOVNZ Ev,Ev (66),(es) | SETNZ Eb (F2),(ev) +46: CMOVBE Gv,Ev (es) | CMOVBE Gv,Ev (66),(es) | CFCMOVBE Ev,Ev (es) | CFCMOVBE Ev,Ev (66),(es) | SETBE Eb (F2),(ev) +47: CMOVNBE Gv,Ev (es) | CMOVNBE Gv,Ev (66),(es) | CFCMOVNBE Ev,Ev (es) | CFCMOVNBE Ev,Ev (66),(es) | SETNBE Eb (F2),(ev) +48: CMOVS Gv,Ev (es) | CMOVS Gv,Ev (66),(es) | CFCMOVS Ev,Ev (es) | CFCMOVS Ev,Ev (66),(es) | SETS Eb (F2),(ev) +49: CMOVNS Gv,Ev (es) | CMOVNS Gv,Ev (66),(es) | CFCMOVNS Ev,Ev (es) | CFCMOVNS Ev,Ev (66),(es) | SETNS Eb (F2),(ev) +4a: CMOVP Gv,Ev (es) | CMOVP Gv,Ev (66),(es) | CFCMOVP Ev,Ev (es) | CFCMOVP Ev,Ev (66),(es) | SETP Eb (F2),(ev) +4b: CMOVNP Gv,Ev (es) | CMOVNP Gv,Ev (66),(es) | CFCMOVNP Ev,Ev (es) | CFCMOVNP Ev,Ev (66),(es) | SETNP Eb (F2),(ev) +4c: CMOVL Gv,Ev (es) | CMOVL Gv,Ev (66),(es) | CFCMOVL Ev,Ev (es) | CFCMOVL Ev,Ev (66),(es) | SETL Eb (F2),(ev) +4d: CMOVNL Gv,Ev (es) | CMOVNL Gv,Ev (66),(es) | CFCMOVNL Ev,Ev (es) | CFCMOVNL Ev,Ev (66),(es) | SETNL Eb (F2),(ev) +4e: CMOVLE Gv,Ev (es) | CMOVLE Gv,Ev (66),(es) | CFCMOVLE Ev,Ev (es) | CFCMOVLE Ev,Ev (66),(es) | SETLE Eb (F2),(ev) +4f: CMOVNLE Gv,Ev (es) | CMOVNLE Gv,Ev (66),(es) | CFCMOVNLE Ev,Ev (es) | CFCMOVNLE Ev,Ev (66),(es) | SETNLE Eb (F2),(ev) +60: MOVBE Gv,Ev (es) | MOVBE Gv,Ev (66),(es) +61: MOVBE Ev,Gv (es) | MOVBE Ev,Gv (66),(es) +65: WRUSSD Md,Gd (66),(ev) | WRUSSQ Mq,Gq (66),(ev) +66: ADCX Gy,Ey (66),(ev) | ADOX Gy,Ey (F3),(ev) | WRSSD Md,Gd (ev) | WRSSQ Mq,Gq (66),(ev) +69: IMUL Gv,Ev,Iz 
(es) | IMUL Gv,Ev,Iz (66),(es) +6b: IMUL Gv,Ev,Ib (es) | IMUL Gv,Ev,Ib (66),(es) +80: Grp1 Eb,Ib (1A),(ev) +81: Grp1 Ev,Iz (1A),(es) +83: Grp1 Ev,Ib (1A),(es) +# CTESTSCC instructions are: CTESTB, CTESTBE, CTESTF, CTESTL, CTESTLE, CTESTNB, CTESTNBE, CTESTNL, +# CTESTNLE, CTESTNO, CTESTNS, CTESTNZ, CTESTO, CTESTS, CTESTT, CTESTZ +84: CTESTSCC (ev) +85: CTESTSCC (es) | CTESTSCC (66),(es) +88: POPCNT Gv,Ev (es) | POPCNT Gv,Ev (66),(es) +8f: POP2 Bq,Rq (000),(11B),(ev) +a5: SHLD Ev,Gv,CL (es) | SHLD Ev,Gv,CL (66),(es) +ad: SHRD Ev,Gv,CL (es) | SHRD Ev,Gv,CL (66),(es) +af: IMUL Gv,Ev (es) | IMUL Gv,Ev (66),(es) +c0: Grp2 Eb,Ib (1A),(ev) +c1: Grp2 Ev,Ib (1A),(es) +d0: Grp2 Eb,1 (1A),(ev) +d1: Grp2 Ev,1 (1A),(es) +d2: Grp2 Eb,CL (1A),(ev) +d3: Grp2 Ev,CL (1A),(es) +f0: CRC32 Gy,Eb (es) | INVEPT Gq,Mdq (F3),(ev) +f1: CRC32 Gy,Ey (es) | CRC32 Gy,Ey (66),(es) | INVVPID Gy,Mdq (F3),(ev) +f2: INVPCID Gy,Mdq (F3),(ev) +f4: TZCNT Gv,Ev (es) | TZCNT Gv,Ev (66),(es) +f5: LZCNT Gv,Ev (es) | LZCNT Gv,Ev (66),(es) +f6: Grp3_1 Eb (1A),(ev) +f7: Grp3_2 Ev (1A),(es) +f8: MOVDIR64B Gv,Mdqq (66),(ev) | ENQCMD Gv,Mdqq (F2),(ev) | ENQCMDS Gv,Mdqq (F3),(ev) | URDMSR Rq,Gq (F2),(11B),(ev) | UWRMSR Gq,Rq (F3),(11B),(ev) +f9: MOVDIRI My,Gy (ev) +fe: Grp4 (1A),(ev) +ff: Grp5 (1A),(es) | PUSH2 Bq,Rq (110),(11B),(ev) +EndTable + Table: EVEX map 5 Referrer: AVXcode: 5 diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt index 240ef714b64f..caedb3ef6688 100644 --- a/tools/arch/x86/lib/x86-opcode-map.txt +++ b/tools/arch/x86/lib/x86-opcode-map.txt @@ -23,6 +23,7 @@ # # AVX Superscripts # (ev): this opcode requires EVEX prefix. +# (es): this opcode requires EVEX prefix and is SCALABALE. # (evo): this opcode is changed by EVEX prefix (EVEX opcode) # (v): this opcode requires VEX prefix. # (v1): this opcode only supports 128bit VEX. 
@@ -929,6 +930,98 @@ df: VAESKEYGEN Vdq,Wdq,Ib (66),(v1) f0: RORX Gy,Ey,Ib (F2),(v) | HRESET Gv,Ib (F3),(000),(11B) EndTable +Table: EVEX map 4 +Referrer: +AVXcode: 4 +00: ADD Eb,Gb (ev) +01: ADD Ev,Gv (es) | ADD Ev,Gv (66),(es) +02: ADD Gb,Eb (ev) +03: ADD Gv,Ev (es) | ADD Gv,Ev (66),(es) +08: OR Eb,Gb (ev) +09: OR Ev,Gv (es) | OR Ev,Gv (66),(es) +0a: OR Gb,Eb (ev) +0b: OR Gv,Ev (es) | OR Gv,Ev (66),(es) +10: ADC Eb,Gb (ev) +11: ADC Ev,Gv (es) | ADC Ev,Gv (66),(es) +12: ADC Gb,Eb (ev) +13: ADC Gv,Ev (es) | ADC Gv,Ev (66),(es) +18: SBB Eb,Gb (ev) +19: SBB Ev,Gv (es) | SBB Ev,Gv (66),(es) +1a: SBB Gb,Eb (ev) +1b: SBB Gv,Ev (es) | SBB Gv,Ev (66),(es) +20: AND Eb,Gb (ev) +21: AND Ev,Gv (es) | AND Ev,Gv (66),(es) +22: AND Gb,Eb (ev) +23: AND Gv,Ev (es) | AND Gv,Ev (66),(es) +24: SHLD Ev,Gv,Ib (es) | SHLD Ev,Gv,Ib (66),(es) +28: SUB Eb,Gb (ev) +29: SUB Ev,Gv (es) | SUB Ev,Gv (66),(es) +2a: SUB Gb,Eb (ev) +2b: SUB Gv,Ev (es) | SUB Gv,Ev (66),(es) +2c: SHRD Ev,Gv,Ib (es) | SHRD Ev,Gv,Ib (66),(es) +30: XOR Eb,Gb (ev) +31: XOR Ev,Gv (es) | XOR Ev,Gv (66),(es) +32: XOR Gb,Eb (ev) +33: XOR Gv,Ev (es) | XOR Gv,Ev (66),(es) +# CCMPSCC instructions are: CCOMB, CCOMBE, CCOMF, CCOML, CCOMLE, CCOMNB, CCOMNBE, CCOMNL, CCOMNLE, +# CCOMNO, CCOMNS, CCOMNZ, CCOMO, CCOMS, CCOMT, CCOMZ +38: CCMPSCC Eb,Gb (ev) +39: CCMPSCC Ev,Gv (es) | CCMPSCC Ev,Gv (66),(es) +3a: CCMPSCC Gv,Ev (ev) +3b: CCMPSCC Gv,Ev (es) | CCMPSCC Gv,Ev (66),(es) +40: CMOVO Gv,Ev (es) | CMOVO Gv,Ev (66),(es) | CFCMOVO Ev,Ev (es) | CFCMOVO Ev,Ev (66),(es) | SETO Eb (F2),(ev) +41: CMOVNO Gv,Ev (es) | CMOVNO Gv,Ev (66),(es) | CFCMOVNO Ev,Ev (es) | CFCMOVNO Ev,Ev (66),(es) | SETNO Eb (F2),(ev) +42: CMOVB Gv,Ev (es) | CMOVB Gv,Ev (66),(es) | CFCMOVB Ev,Ev (es) | CFCMOVB Ev,Ev (66),(es) | SETB Eb (F2),(ev) +43: CMOVNB Gv,Ev (es) | CMOVNB Gv,Ev (66),(es) | CFCMOVNB Ev,Ev (es) | CFCMOVNB Ev,Ev (66),(es) | SETNB Eb (F2),(ev) +44: CMOVZ Gv,Ev (es) | CMOVZ Gv,Ev (66),(es) | CFCMOVZ Ev,Ev (es) | CFCMOVZ Ev,Ev (66),(es) | SETZ Eb (F2),(ev) +45: CMOVNZ Gv,Ev (es) | CMOVNZ Gv,Ev (66),(es) | CFCMOVNZ Ev,Ev (es) | CFCMOVNZ Ev,Ev (66),(es) | SETNZ Eb (F2),(ev) +46: CMOVBE Gv,Ev (es) | CMOVBE Gv,Ev (66),(es) | CFCMOVBE Ev,Ev (es) | CFCMOVBE Ev,Ev (66),(es) | SETBE Eb (F2),(ev) +47: CMOVNBE Gv,Ev (es) | CMOVNBE Gv,Ev (66),(es) | CFCMOVNBE Ev,Ev (es) | CFCMOVNBE Ev,Ev (66),(es) | SETNBE Eb (F2),(ev) +48: CMOVS Gv,Ev (es) | CMOVS Gv,Ev (66),(es) | CFCMOVS Ev,Ev (es) | CFCMOVS Ev,Ev (66),(es) | SETS Eb (F2),(ev) +49: CMOVNS Gv,Ev (es) | CMOVNS Gv,Ev (66),(es) | CFCMOVNS Ev,Ev (es) | CFCMOVNS Ev,Ev (66),(es) | SETNS Eb (F2),(ev) +4a: CMOVP Gv,Ev (es) | CMOVP Gv,Ev (66),(es) | CFCMOVP Ev,Ev (es) | CFCMOVP Ev,Ev (66),(es) | SETP Eb (F2),(ev) +4b: CMOVNP Gv,Ev (es) | CMOVNP Gv,Ev (66),(es) | CFCMOVNP Ev,Ev (es) | CFCMOVNP Ev,Ev (66),(es) | SETNP Eb (F2),(ev) +4c: CMOVL Gv,Ev (es) | CMOVL Gv,Ev (66),(es) | CFCMOVL Ev,Ev (es) | CFCMOVL Ev,Ev (66),(es) | SETL Eb (F2),(ev) +4d: CMOVNL Gv,Ev (es) | CMOVNL Gv,Ev (66),(es) | CFCMOVNL Ev,Ev (es) | CFCMOVNL Ev,Ev (66),(es) | SETNL Eb (F2),(ev) +4e: CMOVLE Gv,Ev (es) | CMOVLE Gv,Ev (66),(es) | CFCMOVLE Ev,Ev (es) | CFCMOVLE Ev,Ev (66),(es) | SETLE Eb (F2),(ev) +4f: CMOVNLE Gv,Ev (es) | CMOVNLE Gv,Ev (66),(es) | CFCMOVNLE Ev,Ev (es) | CFCMOVNLE Ev,Ev (66),(es) | SETNLE Eb (F2),(ev) +60: MOVBE Gv,Ev (es) | MOVBE Gv,Ev (66),(es) +61: MOVBE Ev,Gv (es) | MOVBE Ev,Gv (66),(es) +65: WRUSSD Md,Gd (66),(ev) | WRUSSQ Mq,Gq (66),(ev) +66: ADCX Gy,Ey (66),(ev) | ADOX Gy,Ey (F3),(ev) | WRSSD Md,Gd (ev) | WRSSQ Mq,Gq (66),(ev) +69: IMUL Gv,Ev,Iz 
(es) | IMUL Gv,Ev,Iz (66),(es) +6b: IMUL Gv,Ev,Ib (es) | IMUL Gv,Ev,Ib (66),(es) +80: Grp1 Eb,Ib (1A),(ev) +81: Grp1 Ev,Iz (1A),(es) +83: Grp1 Ev,Ib (1A),(es) +# CTESTSCC instructions are: CTESTB, CTESTBE, CTESTF, CTESTL, CTESTLE, CTESTNB, CTESTNBE, CTESTNL, +# CTESTNLE, CTESTNO, CTESTNS, CTESTNZ, CTESTO, CTESTS, CTESTT, CTESTZ +84: CTESTSCC (ev) +85: CTESTSCC (es) | CTESTSCC (66),(es) +88: POPCNT Gv,Ev (es) | POPCNT Gv,Ev (66),(es) +8f: POP2 Bq,Rq (000),(11B),(ev) +a5: SHLD Ev,Gv,CL (es) | SHLD Ev,Gv,CL (66),(es) +ad: SHRD Ev,Gv,CL (es) | SHRD Ev,Gv,CL (66),(es) +af: IMUL Gv,Ev (es) | IMUL Gv,Ev (66),(es) +c0: Grp2 Eb,Ib (1A),(ev) +c1: Grp2 Ev,Ib (1A),(es) +d0: Grp2 Eb,1 (1A),(ev) +d1: Grp2 Ev,1 (1A),(es) +d2: Grp2 Eb,CL (1A),(ev) +d3: Grp2 Ev,CL (1A),(es) +f0: CRC32 Gy,Eb (es) | INVEPT Gq,Mdq (F3),(ev) +f1: CRC32 Gy,Ey (es) | CRC32 Gy,Ey (66),(es) | INVVPID Gy,Mdq (F3),(ev) +f2: INVPCID Gy,Mdq (F3),(ev) +f4: TZCNT Gv,Ev (es) | TZCNT Gv,Ev (66),(es) +f5: LZCNT Gv,Ev (es) | LZCNT Gv,Ev (66),(es) +f6: Grp3_1 Eb (1A),(ev) +f7: Grp3_2 Ev (1A),(es) +f8: MOVDIR64B Gv,Mdqq (66),(ev) | ENQCMD Gv,Mdqq (F2),(ev) | ENQCMDS Gv,Mdqq (F3),(ev) | URDMSR Rq,Gq (F2),(11B),(ev) | UWRMSR Gq,Rq (F3),(11B),(ev) +f9: MOVDIRI My,Gy (ev) +fe: Grp4 (1A),(ev) +ff: Grp5 (1A),(es) | PUSH2 Bq,Rq (110),(11B),(ev) +EndTable + Table: EVEX map 5 Referrer: AVXcode: 5 -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 09/10] perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter ` (7 preceding siblings ...) 2024-05-02 10:58 ` [PATCH 08/10] x86/insn: Add support for APX EVEX instructions to the opcode map Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-05-28 18:14 ` Adrian Hunter 2024-05-02 10:58 ` [PATCH 10/10] perf tests: Add APX and other new instructions to x86 instruction decoder test Adrian Hunter 2024-06-26 3:59 ` (subset) [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Namhyung Kim 10 siblings, 1 reply; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users JMPABS is 64-bit absolute direct jump instruction, encoded with a mandatory REX2 prefix. JMPABS is designed to be used in the procedure linkage table (PLT) to replace indirect jumps, because it has better performance. In that case the jump target will be amended at run time. To enable Intel PT to follow the code, a TIP packet is always emitted when JMPABS is traced under Intel PT. Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. Decode JMPABS as an indirect jump, because it has an associated TIP packet the same as an indirect jump and the control flow should follow the TIP packet payload, and not assume it is the same as the on-file object code JMPABS target address. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c index c5d57027ec23..4407130d91f8 100644 --- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c @@ -92,6 +92,15 @@ static void intel_pt_insn_decoder(struct insn *insn, op = INTEL_PT_OP_JCC; branch = INTEL_PT_BR_CONDITIONAL; break; + case 0xa1: + if (insn_is_rex2(insn)) { /* jmpabs */ + intel_pt_insn->op = INTEL_PT_OP_JMP; + /* jmpabs causes a TIP packet like an indirect branch */ + intel_pt_insn->branch = INTEL_PT_BR_INDIRECT; + intel_pt_insn->length = insn->length; + return; + } + break; case 0xc2: /* near ret */ case 0xc3: /* near ret */ case 0xca: /* far ret */ -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH 09/10] perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder 2024-05-02 10:58 ` [PATCH 09/10] perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder Adrian Hunter @ 2024-05-28 18:14 ` Adrian Hunter 2024-06-18 8:05 ` Adrian Hunter 0 siblings, 1 reply; 18+ messages in thread From: Adrian Hunter @ 2024-05-28 18:14 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users On 2/05/24 13:58, Adrian Hunter wrote: > JMPABS is 64-bit absolute direct jump instruction, encoded with a mandatory > REX2 prefix. JMPABS is designed to be used in the procedure linkage table > (PLT) to replace indirect jumps, because it has better performance. In that > case the jump target will be amended at run time. To enable Intel PT to > follow the code, a TIP packet is always emitted when JMPABS is traced under > Intel PT. > > Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture > Specification for details. > > Decode JMPABS as an indirect jump, because it has an associated TIP packet > the same as an indirect jump and the control flow should follow the TIP > packet payload, and not assume it is the same as the on-file object code > JMPABS target address. > > Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Patches 1-8 are in perf-tools-next now, so this and patch 10 could be applied. > --- > tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c > index c5d57027ec23..4407130d91f8 100644 > --- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c > +++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c > @@ -92,6 +92,15 @@ static void intel_pt_insn_decoder(struct insn *insn, > op = INTEL_PT_OP_JCC; > branch = INTEL_PT_BR_CONDITIONAL; > break; > + case 0xa1: > + if (insn_is_rex2(insn)) { /* jmpabs */ > + intel_pt_insn->op = INTEL_PT_OP_JMP; > + /* jmpabs causes a TIP packet like an indirect branch */ > + intel_pt_insn->branch = INTEL_PT_BR_INDIRECT; > + intel_pt_insn->length = insn->length; > + return; > + } > + break; > case 0xc2: /* near ret */ > case 0xc3: /* near ret */ > case 0xca: /* far ret */ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 09/10] perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder 2024-05-28 18:14 ` Adrian Hunter @ 2024-06-18 8:05 ` Adrian Hunter 0 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-06-18 8:05 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users On 28/05/24 21:14, Adrian Hunter wrote: > On 2/05/24 13:58, Adrian Hunter wrote: >> JMPABS is 64-bit absolute direct jump instruction, encoded with a mandatory >> REX2 prefix. JMPABS is designed to be used in the procedure linkage table >> (PLT) to replace indirect jumps, because it has better performance. In that >> case the jump target will be amended at run time. To enable Intel PT to >> follow the code, a TIP packet is always emitted when JMPABS is traced under >> Intel PT. >> >> Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture >> Specification for details. >> >> Decode JMPABS as an indirect jump, because it has an associated TIP packet >> the same as an indirect jump and the control flow should follow the TIP >> packet payload, and not assume it is the same as the on-file object code >> JMPABS target address. >> >> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> > > Patches 1-8 are in perf-tools-next now, so this and patch 10 > could be applied. Still apply > >> --- >> tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c >> index c5d57027ec23..4407130d91f8 100644 >> --- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c >> +++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c >> @@ -92,6 +92,15 @@ static void intel_pt_insn_decoder(struct insn *insn, >> op = INTEL_PT_OP_JCC; >> branch = INTEL_PT_BR_CONDITIONAL; >> break; >> + case 0xa1: >> + if (insn_is_rex2(insn)) { /* jmpabs */ >> + intel_pt_insn->op = INTEL_PT_OP_JMP; >> + /* jmpabs causes a TIP packet like an indirect branch */ >> + intel_pt_insn->branch = INTEL_PT_BR_INDIRECT; >> + intel_pt_insn->length = insn->length; >> + return; >> + } >> + break; >> case 0xc2: /* near ret */ >> case 0xc3: /* near ret */ >> case 0xca: /* far ret */ > ^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH 10/10] perf tests: Add APX and other new instructions to x86 instruction decoder test 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter ` (8 preceding siblings ...) 2024-05-02 10:58 ` [PATCH 09/10] perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder Adrian Hunter @ 2024-05-02 10:58 ` Adrian Hunter 2024-06-26 3:59 ` (subset) [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Namhyung Kim 10 siblings, 0 replies; 18+ messages in thread From: Adrian Hunter @ 2024-05-02 10:58 UTC (permalink / raw) To: linux-kernel Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-perf-users Add samples of APX and other new instructions to the 'x86 instruction decoder - new instructions' test. Note the test is only available if the perf tool has been built with EXTRA_TESTS=1. Example: $ make EXTRA_TESTS=1 -C tools/perf $ tools/perf/perf test -F -v 'new ins' |& grep -i 'jmpabs\|popp\|pushp' Decoded ok: d5 00 a1 ef cd ab 90 78 56 34 12 jmpabs $0x1234567890abcdef Decoded ok: d5 08 53 pushp %rbx Decoded ok: d5 18 50 pushp %r16 Decoded ok: d5 19 57 pushp %r31 Decoded ok: d5 19 5f popp %r31 Decoded ok: d5 18 58 popp %r16 Decoded ok: d5 08 5b popp %rbx Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> --- tools/perf/arch/x86/tests/insn-x86-dat-32.c | 116 ++ tools/perf/arch/x86/tests/insn-x86-dat-64.c | 1026 ++++++++++++++++++ tools/perf/arch/x86/tests/insn-x86-dat-src.c | 597 ++++++++++ 3 files changed, 1739 insertions(+) diff --git a/tools/perf/arch/x86/tests/insn-x86-dat-32.c b/tools/perf/arch/x86/tests/insn-x86-dat-32.c index ba429cadb18f..ce9645edaf68 100644 --- a/tools/perf/arch/x86/tests/insn-x86-dat-32.c +++ b/tools/perf/arch/x86/tests/insn-x86-dat-32.c @@ -3107,6 +3107,122 @@ "62 f5 7c 08 2e ca \tvucomish %xmm2,%xmm1",}, {{0x62, 0xf5, 0x7c, 0x08, 0x2e, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", "62 f5 7c 08 2e 8c c8 78 56 34 12 \tvucomish 0x12345678(%eax,%ecx,8),%xmm1",}, +{{0xf3, 0x0f, 0x38, 0xdc, 0xd1, }, 5, 0, "", "", +"f3 0f 38 dc d1 \tloadiwkey %xmm1,%xmm2",}, +{{0xf3, 0x0f, 0x38, 0xfa, 0xd0, }, 5, 0, "", "", +"f3 0f 38 fa d0 \tencodekey128 %eax,%edx",}, +{{0xf3, 0x0f, 0x38, 0xfb, 0xd0, }, 5, 0, "", "", +"f3 0f 38 fb d0 \tencodekey256 %eax,%edx",}, +{{0xf3, 0x0f, 0x38, 0xdc, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 dc 5a 77 \taesenc128kl 0x77(%edx),%xmm3",}, +{{0xf3, 0x0f, 0x38, 0xde, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 de 5a 77 \taesenc256kl 0x77(%edx),%xmm3",}, +{{0xf3, 0x0f, 0x38, 0xdd, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 dd 5a 77 \taesdec128kl 0x77(%edx),%xmm3",}, +{{0xf3, 0x0f, 0x38, 0xdf, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 df 5a 77 \taesdec256kl 0x77(%edx),%xmm3",}, +{{0xf3, 0x0f, 0x38, 0xd8, 0x42, 0x77, }, 6, 0, "", "", +"f3 0f 38 d8 42 77 \taesencwide128kl 0x77(%edx)",}, +{{0xf3, 0x0f, 0x38, 0xd8, 0x52, 0x77, }, 6, 0, "", "", +"f3 0f 38 d8 52 77 \taesencwide256kl 0x77(%edx)",}, +{{0xf3, 0x0f, 0x38, 0xd8, 0x4a, 0x77, }, 6, 0, "", "", +"f3 0f 38 d8 4a 77 \taesdecwide128kl 0x77(%edx)",}, +{{0xf3, 0x0f, 0x38, 0xd8, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 d8 5a 77 \taesdecwide256kl 0x77(%edx)",}, +{{0x0f, 0x38, 0xfc, 0x08, }, 4, 0, "", "", +"0f 38 fc 08 \taadd %ecx,(%eax)",}, +{{0x0f, 0x38, 0xfc, 0x15, 0x78, 0x56, 0x34, 0x12, }, 8, 0, "", "", +"0f 38 fc 15 78 56 34 12 \taadd 
%edx,0x12345678",}, +{{0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "", +"0f 38 fc 94 c8 78 56 34 12 \taadd %edx,0x12345678(%eax,%ecx,8)",}, +{{0x66, 0x0f, 0x38, 0xfc, 0x08, }, 5, 0, "", "", +"66 0f 38 fc 08 \taand %ecx,(%eax)",}, +{{0x66, 0x0f, 0x38, 0xfc, 0x15, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "", +"66 0f 38 fc 15 78 56 34 12 \taand %edx,0x12345678",}, +{{0x66, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"66 0f 38 fc 94 c8 78 56 34 12 \taand %edx,0x12345678(%eax,%ecx,8)",}, +{{0xf2, 0x0f, 0x38, 0xfc, 0x08, }, 5, 0, "", "", +"f2 0f 38 fc 08 \taor %ecx,(%eax)",}, +{{0xf2, 0x0f, 0x38, 0xfc, 0x15, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "", +"f2 0f 38 fc 15 78 56 34 12 \taor %edx,0x12345678",}, +{{0xf2, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"f2 0f 38 fc 94 c8 78 56 34 12 \taor %edx,0x12345678(%eax,%ecx,8)",}, +{{0xf3, 0x0f, 0x38, 0xfc, 0x08, }, 5, 0, "", "", +"f3 0f 38 fc 08 \taxor %ecx,(%eax)",}, +{{0xf3, 0x0f, 0x38, 0xfc, 0x15, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "", +"f3 0f 38 fc 15 78 56 34 12 \taxor %edx,0x12345678",}, +{{0xf3, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"f3 0f 38 fc 94 c8 78 56 34 12 \taxor %edx,0x12345678(%eax,%ecx,8)",}, +{{0xc4, 0xe2, 0x7a, 0xb1, 0x31, }, 5, 0, "", "", +"c4 e2 7a b1 31 \tvbcstnebf162ps (%ecx),%xmm6",}, +{{0xc4, 0xe2, 0x79, 0xb1, 0x31, }, 5, 0, "", "", +"c4 e2 79 b1 31 \tvbcstnesh2ps (%ecx),%xmm6",}, +{{0xc4, 0xe2, 0x7a, 0xb0, 0x31, }, 5, 0, "", "", +"c4 e2 7a b0 31 \tvcvtneebf162ps (%ecx),%xmm6",}, +{{0xc4, 0xe2, 0x79, 0xb0, 0x31, }, 5, 0, "", "", +"c4 e2 79 b0 31 \tvcvtneeph2ps (%ecx),%xmm6",}, +{{0xc4, 0xe2, 0x7b, 0xb0, 0x31, }, 5, 0, "", "", +"c4 e2 7b b0 31 \tvcvtneobf162ps (%ecx),%xmm6",}, +{{0xc4, 0xe2, 0x78, 0xb0, 0x31, }, 5, 0, "", "", +"c4 e2 78 b0 31 \tvcvtneoph2ps (%ecx),%xmm6",}, +{{0x62, 0xf2, 0x7e, 0x08, 0x72, 0xf1, }, 6, 0, "", "", +"62 f2 7e 08 72 f1 \tvcvtneps2bf16 %xmm1,%xmm6",}, +{{0xc4, 0xe2, 0x6b, 0x50, 0xd9, }, 5, 0, "", "", +"c4 e2 6b 50 d9 \tvpdpbssd %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6b, 0x51, 0xd9, }, 5, 0, "", "", +"c4 e2 6b 51 d9 \tvpdpbssds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0x50, 0xd9, }, 5, 0, "", "", +"c4 e2 6a 50 d9 \tvpdpbsud %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0x51, 0xd9, }, 5, 0, "", "", +"c4 e2 6a 51 d9 \tvpdpbsuds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x68, 0x50, 0xd9, }, 5, 0, "", "", +"c4 e2 68 50 d9 \tvpdpbuud %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x68, 0x51, 0xd9, }, 5, 0, "", "", +"c4 e2 68 51 d9 \tvpdpbuuds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0xd2, 0xd9, }, 5, 0, "", "", +"c4 e2 6a d2 d9 \tvpdpwsud %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0xd3, 0xd9, }, 5, 0, "", "", +"c4 e2 6a d3 d9 \tvpdpwsuds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x69, 0xd2, 0xd9, }, 5, 0, "", "", +"c4 e2 69 d2 d9 \tvpdpwusd %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x69, 0xd3, 0xd9, }, 5, 0, "", "", +"c4 e2 69 d3 d9 \tvpdpwusds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x68, 0xd2, 0xd9, }, 5, 0, "", "", +"c4 e2 68 d2 d9 \tvpdpwuud %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x68, 0xd3, 0xd9, }, 5, 0, "", "", +"c4 e2 68 d3 d9 \tvpdpwuuds %xmm1,%xmm2,%xmm3",}, +{{0x62, 0xf2, 0xed, 0x08, 0xb5, 0xd9, }, 6, 0, "", "", +"62 f2 ed 08 b5 d9 \tvpmadd52huq %xmm1,%xmm2,%xmm3",}, +{{0x62, 0xf2, 0xed, 0x08, 0xb4, 0xd9, }, 6, 0, "", "", +"62 f2 ed 08 b4 d9 \tvpmadd52luq %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x7f, 0xcc, 0xd1, }, 5, 0, "", "", +"c4 e2 7f cc d1 \tvsha512msg1 %xmm1,%ymm2",}, +{{0xc4, 0xe2, 0x7f, 0xcd, 0xd1, }, 5, 
0, "", "", +"c4 e2 7f cd d1 \tvsha512msg2 %ymm1,%ymm2",}, +{{0xc4, 0xe2, 0x6f, 0xcb, 0xd9, }, 5, 0, "", "", +"c4 e2 6f cb d9 \tvsha512rnds2 %xmm1,%ymm2,%ymm3",}, +{{0xc4, 0xe2, 0x68, 0xda, 0xd9, }, 5, 0, "", "", +"c4 e2 68 da d9 \tvsm3msg1 %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x69, 0xda, 0xd9, }, 5, 0, "", "", +"c4 e2 69 da d9 \tvsm3msg2 %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe3, 0x69, 0xde, 0xd9, 0xa1, }, 6, 0, "", "", +"c4 e3 69 de d9 a1 \tvsm3rnds2 $0xa1,%xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0xda, 0xd9, }, 5, 0, "", "", +"c4 e2 6a da d9 \tvsm4key4 %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6b, 0xda, 0xd9, }, 5, 0, "", "", +"c4 e2 6b da d9 \tvsm4rnds4 %xmm1,%xmm2,%xmm3",}, +{{0x0f, 0x0d, 0x00, }, 3, 0, "", "", +"0f 0d 00 \tprefetch (%eax)",}, +{{0x0f, 0x18, 0x08, }, 3, 0, "", "", +"0f 18 08 \tprefetcht0 (%eax)",}, +{{0x0f, 0x18, 0x10, }, 3, 0, "", "", +"0f 18 10 \tprefetcht1 (%eax)",}, +{{0x0f, 0x18, 0x18, }, 3, 0, "", "", +"0f 18 18 \tprefetcht2 (%eax)",}, +{{0x0f, 0x18, 0x00, }, 3, 0, "", "", +"0f 18 00 \tprefetchnta (%eax)",}, +{{0x0f, 0x01, 0xc6, }, 3, 0, "", "", +"0f 01 c6 \twrmsrns",}, {{0xf3, 0x0f, 0x3a, 0xf0, 0xc0, 0x00, }, 6, 0, "", "", "f3 0f 3a f0 c0 00 \threset $0x0",}, {{0x0f, 0x01, 0xe8, }, 3, 0, "", "", diff --git a/tools/perf/arch/x86/tests/insn-x86-dat-64.c b/tools/perf/arch/x86/tests/insn-x86-dat-64.c index 3a47e98fec33..3881fe89df8b 100644 --- a/tools/perf/arch/x86/tests/insn-x86-dat-64.c +++ b/tools/perf/arch/x86/tests/insn-x86-dat-64.c @@ -3877,6 +3877,1032 @@ "62 f5 7c 08 2e 8c c8 78 56 34 12 \tvucomish 0x12345678(%rax,%rcx,8),%xmm1",}, {{0x67, 0x62, 0xf5, 0x7c, 0x08, 0x2e, 0x8c, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 12, 0, "", "", "67 62 f5 7c 08 2e 8c c8 78 56 34 12 \tvucomish 0x12345678(%eax,%ecx,8),%xmm1",}, +{{0xf3, 0x0f, 0x38, 0xdc, 0xd1, }, 5, 0, "", "", +"f3 0f 38 dc d1 \tloadiwkey %xmm1,%xmm2",}, +{{0xf3, 0x0f, 0x38, 0xfa, 0xd0, }, 5, 0, "", "", +"f3 0f 38 fa d0 \tencodekey128 %eax,%edx",}, +{{0xf3, 0x0f, 0x38, 0xfb, 0xd0, }, 5, 0, "", "", +"f3 0f 38 fb d0 \tencodekey256 %eax,%edx",}, +{{0xf3, 0x0f, 0x38, 0xdc, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 dc 5a 77 \taesenc128kl 0x77(%rdx),%xmm3",}, +{{0xf3, 0x0f, 0x38, 0xde, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 de 5a 77 \taesenc256kl 0x77(%rdx),%xmm3",}, +{{0xf3, 0x0f, 0x38, 0xdd, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 dd 5a 77 \taesdec128kl 0x77(%rdx),%xmm3",}, +{{0xf3, 0x0f, 0x38, 0xdf, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 df 5a 77 \taesdec256kl 0x77(%rdx),%xmm3",}, +{{0xf3, 0x0f, 0x38, 0xd8, 0x42, 0x77, }, 6, 0, "", "", +"f3 0f 38 d8 42 77 \taesencwide128kl 0x77(%rdx)",}, +{{0xf3, 0x0f, 0x38, 0xd8, 0x52, 0x77, }, 6, 0, "", "", +"f3 0f 38 d8 52 77 \taesencwide256kl 0x77(%rdx)",}, +{{0xf3, 0x0f, 0x38, 0xd8, 0x4a, 0x77, }, 6, 0, "", "", +"f3 0f 38 d8 4a 77 \taesdecwide128kl 0x77(%rdx)",}, +{{0xf3, 0x0f, 0x38, 0xd8, 0x5a, 0x77, }, 6, 0, "", "", +"f3 0f 38 d8 5a 77 \taesdecwide256kl 0x77(%rdx)",}, +{{0x0f, 0x38, 0xfc, 0x08, }, 4, 0, "", "", +"0f 38 fc 08 \taadd %ecx,(%rax)",}, +{{0x41, 0x0f, 0x38, 0xfc, 0x10, }, 5, 0, "", "", +"41 0f 38 fc 10 \taadd %edx,(%r8)",}, +{{0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "", +"0f 38 fc 94 c8 78 56 34 12 \taadd %edx,0x12345678(%rax,%rcx,8)",}, +{{0x41, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"41 0f 38 fc 94 c8 78 56 34 12 \taadd %edx,0x12345678(%r8,%rcx,8)",}, +{{0x48, 0x0f, 0x38, 0xfc, 0x08, }, 5, 0, "", "", +"48 0f 38 fc 08 \taadd %rcx,(%rax)",}, +{{0x49, 0x0f, 0x38, 0xfc, 0x10, }, 5, 0, "", "", +"49 0f 38 fc 10 \taadd 
%rdx,(%r8)",}, +{{0x48, 0x0f, 0x38, 0xfc, 0x14, 0x25, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"48 0f 38 fc 14 25 78 56 34 12 \taadd %rdx,0x12345678",}, +{{0x48, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"48 0f 38 fc 94 c8 78 56 34 12 \taadd %rdx,0x12345678(%rax,%rcx,8)",}, +{{0x49, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"49 0f 38 fc 94 c8 78 56 34 12 \taadd %rdx,0x12345678(%r8,%rcx,8)",}, +{{0x66, 0x0f, 0x38, 0xfc, 0x08, }, 5, 0, "", "", +"66 0f 38 fc 08 \taand %ecx,(%rax)",}, +{{0x66, 0x41, 0x0f, 0x38, 0xfc, 0x10, }, 6, 0, "", "", +"66 41 0f 38 fc 10 \taand %edx,(%r8)",}, +{{0x66, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"66 0f 38 fc 94 c8 78 56 34 12 \taand %edx,0x12345678(%rax,%rcx,8)",}, +{{0x66, 0x41, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"66 41 0f 38 fc 94 c8 78 56 34 12 \taand %edx,0x12345678(%r8,%rcx,8)",}, +{{0x66, 0x48, 0x0f, 0x38, 0xfc, 0x08, }, 6, 0, "", "", +"66 48 0f 38 fc 08 \taand %rcx,(%rax)",}, +{{0x66, 0x49, 0x0f, 0x38, 0xfc, 0x10, }, 6, 0, "", "", +"66 49 0f 38 fc 10 \taand %rdx,(%r8)",}, +{{0x66, 0x48, 0x0f, 0x38, 0xfc, 0x14, 0x25, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"66 48 0f 38 fc 14 25 78 56 34 12 \taand %rdx,0x12345678",}, +{{0x66, 0x48, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"66 48 0f 38 fc 94 c8 78 56 34 12 \taand %rdx,0x12345678(%rax,%rcx,8)",}, +{{0x66, 0x49, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"66 49 0f 38 fc 94 c8 78 56 34 12 \taand %rdx,0x12345678(%r8,%rcx,8)",}, +{{0xf2, 0x0f, 0x38, 0xfc, 0x08, }, 5, 0, "", "", +"f2 0f 38 fc 08 \taor %ecx,(%rax)",}, +{{0xf2, 0x41, 0x0f, 0x38, 0xfc, 0x10, }, 6, 0, "", "", +"f2 41 0f 38 fc 10 \taor %edx,(%r8)",}, +{{0xf2, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"f2 0f 38 fc 94 c8 78 56 34 12 \taor %edx,0x12345678(%rax,%rcx,8)",}, +{{0xf2, 0x41, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"f2 41 0f 38 fc 94 c8 78 56 34 12 \taor %edx,0x12345678(%r8,%rcx,8)",}, +{{0xf2, 0x48, 0x0f, 0x38, 0xfc, 0x08, }, 6, 0, "", "", +"f2 48 0f 38 fc 08 \taor %rcx,(%rax)",}, +{{0xf2, 0x49, 0x0f, 0x38, 0xfc, 0x10, }, 6, 0, "", "", +"f2 49 0f 38 fc 10 \taor %rdx,(%r8)",}, +{{0xf2, 0x48, 0x0f, 0x38, 0xfc, 0x14, 0x25, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"f2 48 0f 38 fc 14 25 78 56 34 12 \taor %rdx,0x12345678",}, +{{0xf2, 0x48, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"f2 48 0f 38 fc 94 c8 78 56 34 12 \taor %rdx,0x12345678(%rax,%rcx,8)",}, +{{0xf2, 0x49, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"f2 49 0f 38 fc 94 c8 78 56 34 12 \taor %rdx,0x12345678(%r8,%rcx,8)",}, +{{0xf3, 0x0f, 0x38, 0xfc, 0x08, }, 5, 0, "", "", +"f3 0f 38 fc 08 \taxor %ecx,(%rax)",}, +{{0xf3, 0x41, 0x0f, 0x38, 0xfc, 0x10, }, 6, 0, "", "", +"f3 41 0f 38 fc 10 \taxor %edx,(%r8)",}, +{{0xf3, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"f3 0f 38 fc 94 c8 78 56 34 12 \taxor %edx,0x12345678(%rax,%rcx,8)",}, +{{0xf3, 0x41, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"f3 41 0f 38 fc 94 c8 78 56 34 12 \taxor %edx,0x12345678(%r8,%rcx,8)",}, +{{0xf3, 0x48, 0x0f, 0x38, 0xfc, 0x08, }, 6, 0, "", "", +"f3 48 0f 38 fc 08 \taxor %rcx,(%rax)",}, +{{0xf3, 0x49, 0x0f, 0x38, 0xfc, 0x10, }, 6, 0, "", "", +"f3 49 0f 38 fc 10 \taxor %rdx,(%r8)",}, +{{0xf3, 0x48, 0x0f, 0x38, 0xfc, 0x14, 0x25, 0x78, 0x56, 0x34, 
0x12, }, 11, 0, "", "", +"f3 48 0f 38 fc 14 25 78 56 34 12 \taxor %rdx,0x12345678",}, +{{0xf3, 0x48, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"f3 48 0f 38 fc 94 c8 78 56 34 12 \taxor %rdx,0x12345678(%rax,%rcx,8)",}, +{{0xf3, 0x49, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"f3 49 0f 38 fc 94 c8 78 56 34 12 \taxor %rdx,0x12345678(%r8,%rcx,8)",}, +{{0xc4, 0xc2, 0x61, 0xe6, 0x09, }, 5, 0, "", "", +"c4 c2 61 e6 09 \tcmpbexadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe2, 0x09, }, 5, 0, "", "", +"c4 c2 61 e2 09 \tcmpbxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xee, 0x09, }, 5, 0, "", "", +"c4 c2 61 ee 09 \tcmplexadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xec, 0x09, }, 5, 0, "", "", +"c4 c2 61 ec 09 \tcmplxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe7, 0x09, }, 5, 0, "", "", +"c4 c2 61 e7 09 \tcmpnbexadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe3, 0x09, }, 5, 0, "", "", +"c4 c2 61 e3 09 \tcmpnbxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xef, 0x09, }, 5, 0, "", "", +"c4 c2 61 ef 09 \tcmpnlexadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xed, 0x09, }, 5, 0, "", "", +"c4 c2 61 ed 09 \tcmpnlxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe1, 0x09, }, 5, 0, "", "", +"c4 c2 61 e1 09 \tcmpnoxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xeb, 0x09, }, 5, 0, "", "", +"c4 c2 61 eb 09 \tcmpnpxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe9, 0x09, }, 5, 0, "", "", +"c4 c2 61 e9 09 \tcmpnsxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe5, 0x09, }, 5, 0, "", "", +"c4 c2 61 e5 09 \tcmpnzxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe0, 0x09, }, 5, 0, "", "", +"c4 c2 61 e0 09 \tcmpoxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xea, 0x09, }, 5, 0, "", "", +"c4 c2 61 ea 09 \tcmppxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe8, 0x09, }, 5, 0, "", "", +"c4 c2 61 e8 09 \tcmpsxadd %ebx,%ecx,(%r9)",}, +{{0xc4, 0xc2, 0x61, 0xe4, 0x09, }, 5, 0, "", "", +"c4 c2 61 e4 09 \tcmpzxadd %ebx,%ecx,(%r9)",}, +{{0x0f, 0x0d, 0x00, }, 3, 0, "", "", +"0f 0d 00 \tprefetch (%rax)",}, +{{0x0f, 0x18, 0x08, }, 3, 0, "", "", +"0f 18 08 \tprefetcht0 (%rax)",}, +{{0x0f, 0x18, 0x10, }, 3, 0, "", "", +"0f 18 10 \tprefetcht1 (%rax)",}, +{{0x0f, 0x18, 0x18, }, 3, 0, "", "", +"0f 18 18 \tprefetcht2 (%rax)",}, +{{0x0f, 0x18, 0x00, }, 3, 0, "", "", +"0f 18 00 \tprefetchnta (%rax)",}, +{{0x0f, 0x18, 0x3d, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "", +"0f 18 3d 78 56 34 12 \tprefetchit0 0x12345678(%rip) # 1234924e <main+0x1234924e>",}, +{{0x0f, 0x18, 0x35, 0x78, 0x56, 0x34, 0x12, }, 7, 0, "", "", +"0f 18 35 78 56 34 12 \tprefetchit1 0x12345678(%rip) # 12349255 <main+0x12349255>",}, +{{0xf2, 0x0f, 0x01, 0xc6, }, 4, 0, "", "", +"f2 0f 01 c6 \trdmsrlist",}, +{{0xf3, 0x0f, 0x01, 0xc6, }, 4, 0, "", "", +"f3 0f 01 c6 \twrmsrlist",}, +{{0xf2, 0x0f, 0x38, 0xf8, 0xd0, }, 5, 0, "", "", +"f2 0f 38 f8 d0 \turdmsr %rdx,%rax",}, +{{0x62, 0xfc, 0x7f, 0x08, 0xf8, 0xd6, }, 6, 0, "", "", +"62 fc 7f 08 f8 d6 \turdmsr %rdx,%r22",}, +{{0xc4, 0xc7, 0x7b, 0xf8, 0xc4, 0x7f, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"c4 c7 7b f8 c4 7f 00 00 00 \turdmsr $0x7f,%r12",}, +{{0xf3, 0x0f, 0x38, 0xf8, 0xd0, }, 5, 0, "", "", +"f3 0f 38 f8 d0 \tuwrmsr %rax,%rdx",}, +{{0x62, 0xfc, 0x7e, 0x08, 0xf8, 0xd6, }, 6, 0, "", "", +"62 fc 7e 08 f8 d6 \tuwrmsr %r22,%rdx",}, +{{0xc4, 0xc7, 0x7a, 0xf8, 0xc4, 0x7f, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"c4 c7 7a f8 c4 7f 00 00 00 \tuwrmsr %r12,$0x7f",}, +{{0xc4, 0xe2, 0x7a, 0xb1, 0x31, }, 5, 0, "", "", +"c4 e2 7a b1 31 \tvbcstnebf162ps (%rcx),%xmm6",}, +{{0xc4, 0xe2, 0x79, 
0xb1, 0x31, }, 5, 0, "", "", +"c4 e2 79 b1 31 \tvbcstnesh2ps (%rcx),%xmm6",}, +{{0xc4, 0xe2, 0x7a, 0xb0, 0x31, }, 5, 0, "", "", +"c4 e2 7a b0 31 \tvcvtneebf162ps (%rcx),%xmm6",}, +{{0xc4, 0xe2, 0x79, 0xb0, 0x31, }, 5, 0, "", "", +"c4 e2 79 b0 31 \tvcvtneeph2ps (%rcx),%xmm6",}, +{{0xc4, 0xe2, 0x7b, 0xb0, 0x31, }, 5, 0, "", "", +"c4 e2 7b b0 31 \tvcvtneobf162ps (%rcx),%xmm6",}, +{{0xc4, 0xe2, 0x78, 0xb0, 0x31, }, 5, 0, "", "", +"c4 e2 78 b0 31 \tvcvtneoph2ps (%rcx),%xmm6",}, +{{0x62, 0xf2, 0x7e, 0x08, 0x72, 0xf1, }, 6, 0, "", "", +"62 f2 7e 08 72 f1 \tvcvtneps2bf16 %xmm1,%xmm6",}, +{{0xf2, 0x0f, 0x01, 0xca, }, 4, 0, "erets", "indirect", +"f2 0f 01 ca \terets",}, +{{0xf3, 0x0f, 0x01, 0xca, }, 4, 0, "eretu", "indirect", +"f3 0f 01 ca \teretu",}, +{{0xc4, 0xe2, 0x71, 0x6c, 0xda, }, 5, 0, "", "", +"c4 e2 71 6c da \ttcmmimfp16ps %tmm1,%tmm2,%tmm3",}, +{{0xc4, 0xe2, 0x70, 0x6c, 0xda, }, 5, 0, "", "", +"c4 e2 70 6c da \ttcmmrlfp16ps %tmm1,%tmm2,%tmm3",}, +{{0xc4, 0xe2, 0x73, 0x5c, 0xda, }, 5, 0, "", "", +"c4 e2 73 5c da \ttdpfp16ps %tmm1,%tmm2,%tmm3",}, +{{0xd5, 0x10, 0xf6, 0xc2, 0x05, }, 5, 0, "", "", +"d5 10 f6 c2 05 \ttest $0x5,%r18b",}, +{{0xd5, 0x10, 0xf7, 0xc2, 0x05, 0x00, 0x00, 0x00, }, 8, 0, "", "", +"d5 10 f7 c2 05 00 00 00 \ttest $0x5,%r18d",}, +{{0xd5, 0x18, 0xf7, 0xc2, 0x05, 0x00, 0x00, 0x00, }, 8, 0, "", "", +"d5 18 f7 c2 05 00 00 00 \ttest $0x5,%r18",}, +{{0x66, 0xd5, 0x10, 0xf7, 0xc2, 0x05, 0x00, }, 7, 0, "", "", +"66 d5 10 f7 c2 05 00 \ttest $0x5,%r18w",}, +{{0x44, 0x0f, 0xaf, 0xf0, }, 4, 0, "", "", +"44 0f af f0 \timul %eax,%r14d",}, +{{0xd5, 0xc0, 0xaf, 0xc8, }, 4, 0, "", "", +"d5 c0 af c8 \timul %eax,%r17d",}, +{{0xd5, 0x90, 0x62, 0x12, }, 4, 0, "", "", +"d5 90 62 12 \tpunpckldq %mm2,(%r18)",}, +{{0xd5, 0x40, 0x8d, 0x00, }, 4, 0, "", "", +"d5 40 8d 00 \tlea (%rax),%r16d",}, +{{0xd5, 0x44, 0x8d, 0x38, }, 4, 0, "", "", +"d5 44 8d 38 \tlea (%rax),%r31d",}, +{{0xd5, 0x20, 0x8d, 0x04, 0x05, 0x00, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"d5 20 8d 04 05 00 00 00 00 \tlea 0x0(,%r16,1),%eax",}, +{{0xd5, 0x22, 0x8d, 0x04, 0x3d, 0x00, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"d5 22 8d 04 3d 00 00 00 00 \tlea 0x0(,%r31,1),%eax",}, +{{0xd5, 0x10, 0x8d, 0x00, }, 4, 0, "", "", +"d5 10 8d 00 \tlea (%r16),%eax",}, +{{0xd5, 0x11, 0x8d, 0x07, }, 4, 0, "", "", +"d5 11 8d 07 \tlea (%r31),%eax",}, +{{0x4c, 0x8d, 0x38, }, 3, 0, "", "", +"4c 8d 38 \tlea (%rax),%r15",}, +{{0xd5, 0x48, 0x8d, 0x00, }, 4, 0, "", "", +"d5 48 8d 00 \tlea (%rax),%r16",}, +{{0x49, 0x8d, 0x07, }, 3, 0, "", "", +"49 8d 07 \tlea (%r15),%rax",}, +{{0xd5, 0x18, 0x8d, 0x00, }, 4, 0, "", "", +"d5 18 8d 00 \tlea (%r16),%rax",}, +{{0x4a, 0x8d, 0x04, 0x3d, 0x00, 0x00, 0x00, 0x00, }, 8, 0, "", "", +"4a 8d 04 3d 00 00 00 00 \tlea 0x0(,%r15,1),%rax",}, +{{0xd5, 0x28, 0x8d, 0x04, 0x05, 0x00, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"d5 28 8d 04 05 00 00 00 00 \tlea 0x0(,%r16,1),%rax",}, +{{0xd5, 0x1c, 0x03, 0x00, }, 4, 0, "", "", +"d5 1c 03 00 \tadd (%r16),%r8",}, +{{0xd5, 0x1c, 0x03, 0x38, }, 4, 0, "", "", +"d5 1c 03 38 \tadd (%r16),%r15",}, +{{0xd5, 0x4a, 0x8b, 0x04, 0x0d, 0x00, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"d5 4a 8b 04 0d 00 00 00 00 \tmov 0x0(,%r9,1),%r16",}, +{{0xd5, 0x4a, 0x8b, 0x04, 0x35, 0x00, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"d5 4a 8b 04 35 00 00 00 00 \tmov 0x0(,%r14,1),%r16",}, +{{0xd5, 0x4d, 0x2b, 0x3a, }, 4, 0, "", "", +"d5 4d 2b 3a \tsub (%r10),%r31",}, +{{0xd5, 0x4d, 0x2b, 0x7d, 0x00, }, 5, 0, "", "", +"d5 4d 2b 7d 00 \tsub 0x0(%r13),%r31",}, +{{0xd5, 0x30, 0x8d, 0x44, 0x28, 0x01, }, 6, 0, "", "", +"d5 30 8d 44 28 01 \tlea 
0x1(%r16,%r21,1),%eax",}, +{{0xd5, 0x76, 0x8d, 0x7c, 0x10, 0x01, }, 6, 0, "", "", +"d5 76 8d 7c 10 01 \tlea 0x1(%r16,%r26,1),%r31d",}, +{{0xd5, 0x12, 0x8d, 0x84, 0x0d, 0x81, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"d5 12 8d 84 0d 81 00 00 00 \tlea 0x81(%r21,%r9,1),%eax",}, +{{0xd5, 0x57, 0x8d, 0xbc, 0x0a, 0x81, 0x00, 0x00, 0x00, }, 9, 0, "", "", +"d5 57 8d bc 0a 81 00 00 00 \tlea 0x81(%r26,%r9,1),%r31d",}, +{{0xd5, 0x00, 0xa1, 0xef, 0xcd, 0xab, 0x90, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "jmp", "indirect", +"d5 00 a1 ef cd ab 90 78 56 34 12 \tjmpabs $0x1234567890abcdef",}, +{{0xd5, 0x08, 0x53, }, 3, 0, "", "", +"d5 08 53 \tpushp %rbx",}, +{{0xd5, 0x18, 0x50, }, 3, 0, "", "", +"d5 18 50 \tpushp %r16",}, +{{0xd5, 0x19, 0x57, }, 3, 0, "", "", +"d5 19 57 \tpushp %r31",}, +{{0xd5, 0x19, 0x5f, }, 3, 0, "", "", +"d5 19 5f \tpopp %r31",}, +{{0xd5, 0x18, 0x58, }, 3, 0, "", "", +"d5 18 58 \tpopp %r16",}, +{{0xd5, 0x08, 0x5b, }, 3, 0, "", "", +"d5 08 5b \tpopp %rbx",}, +{{0x62, 0x72, 0x34, 0x00, 0xf7, 0xd2, }, 6, 0, "", "", +"62 72 34 00 f7 d2 \tbextr %r25d,%edx,%r10d",}, +{{0x62, 0xda, 0x34, 0x00, 0xf7, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 34 00 f7 94 87 23 01 00 00 \tbextr %r25d,0x123(%r31,%rax,4),%edx",}, +{{0x62, 0x52, 0x84, 0x00, 0xf7, 0xdf, }, 6, 0, "", "", +"62 52 84 00 f7 df \tbextr %r31,%r15,%r11",}, +{{0x62, 0x5a, 0x84, 0x00, 0xf7, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 84 00 f7 bc 87 23 01 00 00 \tbextr %r31,0x123(%r31,%rax,4),%r15",}, +{{0x62, 0xda, 0x6c, 0x08, 0xf3, 0xd9, }, 6, 0, "", "", +"62 da 6c 08 f3 d9 \tblsi %r25d,%edx",}, +{{0x62, 0xda, 0x84, 0x08, 0xf3, 0xdf, }, 6, 0, "", "", +"62 da 84 08 f3 df \tblsi %r31,%r15",}, +{{0x62, 0xda, 0x34, 0x00, 0xf3, 0x9c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 34 00 f3 9c 87 23 01 00 00 \tblsi 0x123(%r31,%rax,4),%r25d",}, +{{0x62, 0xda, 0x84, 0x00, 0xf3, 0x9c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 84 00 f3 9c 87 23 01 00 00 \tblsi 0x123(%r31,%rax,4),%r31",}, +{{0x62, 0xda, 0x6c, 0x08, 0xf3, 0xd1, }, 6, 0, "", "", +"62 da 6c 08 f3 d1 \tblsmsk %r25d,%edx",}, +{{0x62, 0xda, 0x84, 0x08, 0xf3, 0xd7, }, 6, 0, "", "", +"62 da 84 08 f3 d7 \tblsmsk %r31,%r15",}, +{{0x62, 0xda, 0x34, 0x00, 0xf3, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 34 00 f3 94 87 23 01 00 00 \tblsmsk 0x123(%r31,%rax,4),%r25d",}, +{{0x62, 0xda, 0x84, 0x00, 0xf3, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 84 00 f3 94 87 23 01 00 00 \tblsmsk 0x123(%r31,%rax,4),%r31",}, +{{0x62, 0xda, 0x6c, 0x08, 0xf3, 0xc9, }, 6, 0, "", "", +"62 da 6c 08 f3 c9 \tblsr %r25d,%edx",}, +{{0x62, 0xda, 0x84, 0x08, 0xf3, 0xcf, }, 6, 0, "", "", +"62 da 84 08 f3 cf \tblsr %r31,%r15",}, +{{0x62, 0xda, 0x34, 0x00, 0xf3, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 34 00 f3 8c 87 23 01 00 00 \tblsr 0x123(%r31,%rax,4),%r25d",}, +{{0x62, 0xda, 0x84, 0x00, 0xf3, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 84 00 f3 8c 87 23 01 00 00 \tblsr 0x123(%r31,%rax,4),%r31",}, +{{0x62, 0x72, 0x34, 0x00, 0xf5, 0xd2, }, 6, 0, "", "", +"62 72 34 00 f5 d2 \tbzhi %r25d,%edx,%r10d",}, +{{0x62, 0xda, 0x34, 0x00, 0xf5, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 34 00 f5 94 87 23 01 00 00 \tbzhi %r25d,0x123(%r31,%rax,4),%edx",}, +{{0x62, 0x52, 0x84, 0x00, 0xf5, 0xdf, }, 6, 0, "", "", +"62 52 84 00 f5 df \tbzhi %r31,%r15,%r11",}, +{{0x62, 0x5a, 0x84, 0x00, 0xf5, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 84 00 f5 bc 87 23 01 00 00 \tbzhi 
%r31,0x123(%r31,%rax,4),%r15",}, +{{0x62, 0xda, 0x35, 0x00, 0xe6, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e6 94 87 23 01 00 00 \tcmpbexadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe6, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e6 bc 87 23 01 00 00 \tcmpbexadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe2, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e2 94 87 23 01 00 00 \tcmpbxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe2, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e2 bc 87 23 01 00 00 \tcmpbxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xec, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 ec 94 87 23 01 00 00 \tcmplxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xec, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 ec bc 87 23 01 00 00 \tcmplxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe7, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e7 94 87 23 01 00 00 \tcmpnbexadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe7, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e7 bc 87 23 01 00 00 \tcmpnbexadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe3, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e3 94 87 23 01 00 00 \tcmpnbxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe3, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e3 bc 87 23 01 00 00 \tcmpnbxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xef, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 ef 94 87 23 01 00 00 \tcmpnlexadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xef, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 ef bc 87 23 01 00 00 \tcmpnlexadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xed, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 ed 94 87 23 01 00 00 \tcmpnlxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xed, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 ed bc 87 23 01 00 00 \tcmpnlxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe1, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e1 94 87 23 01 00 00 \tcmpnoxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe1, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e1 bc 87 23 01 00 00 \tcmpnoxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xeb, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 eb 94 87 23 01 00 00 \tcmpnpxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xeb, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 eb bc 87 23 01 00 00 \tcmpnpxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe9, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e9 94 87 23 01 00 00 \tcmpnsxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe9, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e9 bc 87 23 01 00 00 \tcmpnsxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe5, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e5 94 87 23 01 00 00 \tcmpnzxadd %r25d,%edx,0x123(%r31,%rax,4)",}, 
+{{0x62, 0x5a, 0x85, 0x00, 0xe5, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e5 bc 87 23 01 00 00 \tcmpnzxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe0, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e0 94 87 23 01 00 00 \tcmpoxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe0, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e0 bc 87 23 01 00 00 \tcmpoxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xea, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 ea 94 87 23 01 00 00 \tcmppxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xea, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 ea bc 87 23 01 00 00 \tcmppxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe8, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e8 94 87 23 01 00 00 \tcmpsxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe8, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e8 bc 87 23 01 00 00 \tcmpsxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x35, 0x00, 0xe4, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 e4 94 87 23 01 00 00 \tcmpzxadd %r25d,%edx,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x85, 0x00, 0xe4, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 e4 bc 87 23 01 00 00 \tcmpzxadd %r31,%r15,0x123(%r31,%rax,4)",}, +{{0x62, 0xcc, 0xfc, 0x08, 0xf1, 0xf7, }, 6, 0, "", "", +"62 cc fc 08 f1 f7 \tcrc32 %r31,%r22",}, +{{0x62, 0xcc, 0xfc, 0x08, 0xf1, 0x37, }, 6, 0, "", "", +"62 cc fc 08 f1 37 \tcrc32q (%r31),%r22",}, +{{0x62, 0xec, 0xfc, 0x08, 0xf0, 0xcb, }, 6, 0, "", "", +"62 ec fc 08 f0 cb \tcrc32 %r19b,%r17",}, +{{0x62, 0xec, 0x7c, 0x08, 0xf0, 0xeb, }, 6, 0, "", "", +"62 ec 7c 08 f0 eb \tcrc32 %r19b,%r21d",}, +{{0x62, 0xfc, 0x7c, 0x08, 0xf0, 0x1b, }, 6, 0, "", "", +"62 fc 7c 08 f0 1b \tcrc32b (%r19),%ebx",}, +{{0x62, 0xcc, 0x7c, 0x08, 0xf1, 0xff, }, 6, 0, "", "", +"62 cc 7c 08 f1 ff \tcrc32 %r31d,%r23d",}, +{{0x62, 0xcc, 0x7c, 0x08, 0xf1, 0x3f, }, 6, 0, "", "", +"62 cc 7c 08 f1 3f \tcrc32l (%r31),%r23d",}, +{{0x62, 0xcc, 0x7d, 0x08, 0xf1, 0xef, }, 6, 0, "", "", +"62 cc 7d 08 f1 ef \tcrc32 %r31w,%r21d",}, +{{0x62, 0xcc, 0x7d, 0x08, 0xf1, 0x2f, }, 6, 0, "", "", +"62 cc 7d 08 f1 2f \tcrc32w (%r31),%r21d",}, +{{0x62, 0xe4, 0xfc, 0x08, 0xf1, 0xd0, }, 6, 0, "", "", +"62 e4 fc 08 f1 d0 \tcrc32 %rax,%r18",}, +{{0x67, 0x62, 0x4c, 0x7f, 0x08, 0xf8, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 12, 0, "", "", +"67 62 4c 7f 08 f8 8c 87 23 01 00 00 \tenqcmd 0x123(%r31d,%eax,4),%r25d",}, +{{0x62, 0x4c, 0x7f, 0x08, 0xf8, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7f 08 f8 bc 87 23 01 00 00 \tenqcmd 0x123(%r31,%rax,4),%r31",}, +{{0x67, 0x62, 0x4c, 0x7e, 0x08, 0xf8, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 12, 0, "", "", +"67 62 4c 7e 08 f8 8c 87 23 01 00 00 \tenqcmds 0x123(%r31d,%eax,4),%r25d",}, +{{0x62, 0x4c, 0x7e, 0x08, 0xf8, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7e 08 f8 bc 87 23 01 00 00 \tenqcmds 0x123(%r31,%rax,4),%r31",}, +{{0x62, 0x4c, 0x7e, 0x08, 0xf0, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7e 08 f0 bc 87 23 01 00 00 \tinvept 0x123(%r31,%rax,4),%r31",}, +{{0x62, 0x4c, 0x7e, 0x08, 0xf2, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7e 08 f2 bc 87 23 01 00 00 \tinvpcid 0x123(%r31,%rax,4),%r31",}, +{{0x62, 0x4c, 0x7e, 0x08, 0xf1, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 
0, "", "", +"62 4c 7e 08 f1 bc 87 23 01 00 00 \tinvvpid 0x123(%r31,%rax,4),%r31",}, +{{0x62, 0x61, 0x7d, 0x08, 0x93, 0xcd, }, 6, 0, "", "", +"62 61 7d 08 93 cd \tkmovb %k5,%r25d",}, +{{0x62, 0xd9, 0x7d, 0x08, 0x91, 0xac, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d9 7d 08 91 ac 87 23 01 00 00 \tkmovb %k5,0x123(%r31,%rax,4)",}, +{{0x62, 0xd9, 0x7d, 0x08, 0x92, 0xe9, }, 6, 0, "", "", +"62 d9 7d 08 92 e9 \tkmovb %r25d,%k5",}, +{{0x62, 0xd9, 0x7d, 0x08, 0x90, 0xac, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d9 7d 08 90 ac 87 23 01 00 00 \tkmovb 0x123(%r31,%rax,4),%k5",}, +{{0x62, 0x61, 0x7f, 0x08, 0x93, 0xcd, }, 6, 0, "", "", +"62 61 7f 08 93 cd \tkmovd %k5,%r25d",}, +{{0x62, 0xd9, 0xfd, 0x08, 0x91, 0xac, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d9 fd 08 91 ac 87 23 01 00 00 \tkmovd %k5,0x123(%r31,%rax,4)",}, +{{0x62, 0xd9, 0x7f, 0x08, 0x92, 0xe9, }, 6, 0, "", "", +"62 d9 7f 08 92 e9 \tkmovd %r25d,%k5",}, +{{0x62, 0xd9, 0xfd, 0x08, 0x90, 0xac, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d9 fd 08 90 ac 87 23 01 00 00 \tkmovd 0x123(%r31,%rax,4),%k5",}, +{{0x62, 0x61, 0xff, 0x08, 0x93, 0xfd, }, 6, 0, "", "", +"62 61 ff 08 93 fd \tkmovq %k5,%r31",}, +{{0x62, 0xd9, 0xfc, 0x08, 0x91, 0xac, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d9 fc 08 91 ac 87 23 01 00 00 \tkmovq %k5,0x123(%r31,%rax,4)",}, +{{0x62, 0xd9, 0xff, 0x08, 0x92, 0xef, }, 6, 0, "", "", +"62 d9 ff 08 92 ef \tkmovq %r31,%k5",}, +{{0x62, 0xd9, 0xfc, 0x08, 0x90, 0xac, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d9 fc 08 90 ac 87 23 01 00 00 \tkmovq 0x123(%r31,%rax,4),%k5",}, +{{0x62, 0x61, 0x7c, 0x08, 0x93, 0xcd, }, 6, 0, "", "", +"62 61 7c 08 93 cd \tkmovw %k5,%r25d",}, +{{0x62, 0xd9, 0x7c, 0x08, 0x91, 0xac, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d9 7c 08 91 ac 87 23 01 00 00 \tkmovw %k5,0x123(%r31,%rax,4)",}, +{{0x62, 0xd9, 0x7c, 0x08, 0x92, 0xe9, }, 6, 0, "", "", +"62 d9 7c 08 92 e9 \tkmovw %r25d,%k5",}, +{{0x62, 0xd9, 0x7c, 0x08, 0x90, 0xac, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d9 7c 08 90 ac 87 23 01 00 00 \tkmovw 0x123(%r31,%rax,4),%k5",}, +{{0x62, 0xda, 0x7c, 0x08, 0x49, 0x84, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 7c 08 49 84 87 23 01 00 00 \tldtilecfg 0x123(%r31,%rax,4)",}, +{{0x62, 0xfc, 0x7d, 0x08, 0x60, 0xc2, }, 6, 0, "", "", +"62 fc 7d 08 60 c2 \tmovbe %r18w,%ax",}, +{{0x62, 0xd4, 0x7d, 0x08, 0x60, 0xc7, }, 6, 0, "", "", +"62 d4 7d 08 60 c7 \tmovbe %r15w,%ax",}, +{{0x62, 0xec, 0x7d, 0x08, 0x61, 0x94, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 ec 7d 08 61 94 80 23 01 00 00 \tmovbe %r18w,0x123(%r16,%rax,4)",}, +{{0x62, 0xcc, 0x7d, 0x08, 0x61, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 cc 7d 08 61 94 87 23 01 00 00 \tmovbe %r18w,0x123(%r31,%rax,4)",}, +{{0x62, 0xdc, 0x7c, 0x08, 0x60, 0xd1, }, 6, 0, "", "", +"62 dc 7c 08 60 d1 \tmovbe %r25d,%edx",}, +{{0x62, 0xd4, 0x7c, 0x08, 0x60, 0xd7, }, 6, 0, "", "", +"62 d4 7c 08 60 d7 \tmovbe %r15d,%edx",}, +{{0x62, 0x6c, 0x7c, 0x08, 0x61, 0x8c, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 6c 7c 08 61 8c 80 23 01 00 00 \tmovbe %r25d,0x123(%r16,%rax,4)",}, +{{0x62, 0x5c, 0xfc, 0x08, 0x60, 0xff, }, 6, 0, "", "", +"62 5c fc 08 60 ff \tmovbe %r31,%r15",}, +{{0x62, 0x54, 0xfc, 0x08, 0x60, 0xf8, }, 6, 0, "", "", +"62 54 fc 08 60 f8 \tmovbe %r8,%r15",}, +{{0x62, 0x6c, 0xfc, 0x08, 0x61, 0xbc, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 6c fc 08 61 bc 80 23 01 00 00 \tmovbe %r31,0x123(%r16,%rax,4)",}, +{{0x62, 0x4c, 0xfc, 0x08, 0x61, 0xbc, 0x87, 0x23, 
0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c fc 08 61 bc 87 23 01 00 00 \tmovbe %r31,0x123(%r31,%rax,4)",}, +{{0x62, 0x6c, 0xfc, 0x08, 0x60, 0xbc, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 6c fc 08 60 bc 80 23 01 00 00 \tmovbe 0x123(%r16,%rax,4),%r31",}, +{{0x62, 0xcc, 0x7d, 0x08, 0x60, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 cc 7d 08 60 94 87 23 01 00 00 \tmovbe 0x123(%r31,%rax,4),%r18w",}, +{{0x62, 0x4c, 0x7c, 0x08, 0x60, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7c 08 60 8c 87 23 01 00 00 \tmovbe 0x123(%r31,%rax,4),%r25d",}, +{{0x67, 0x62, 0x4c, 0x7d, 0x08, 0xf8, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 12, 0, "", "", +"67 62 4c 7d 08 f8 8c 87 23 01 00 00 \tmovdir64b 0x123(%r31d,%eax,4),%r25d",}, +{{0x62, 0x4c, 0x7d, 0x08, 0xf8, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7d 08 f8 bc 87 23 01 00 00 \tmovdir64b 0x123(%r31,%rax,4),%r31",}, +{{0x62, 0x4c, 0x7c, 0x08, 0xf9, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7c 08 f9 8c 87 23 01 00 00 \tmovdiri %r25d,0x123(%r31,%rax,4)",}, +{{0x62, 0x4c, 0xfc, 0x08, 0xf9, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c fc 08 f9 bc 87 23 01 00 00 \tmovdiri %r31,0x123(%r31,%rax,4)",}, +{{0x62, 0x5a, 0x6f, 0x08, 0xf5, 0xd1, }, 6, 0, "", "", +"62 5a 6f 08 f5 d1 \tpdep %r25d,%edx,%r10d",}, +{{0x62, 0x5a, 0x87, 0x08, 0xf5, 0xdf, }, 6, 0, "", "", +"62 5a 87 08 f5 df \tpdep %r31,%r15,%r11",}, +{{0x62, 0xda, 0x37, 0x00, 0xf5, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 37 00 f5 94 87 23 01 00 00 \tpdep 0x123(%r31,%rax,4),%r25d,%edx",}, +{{0x62, 0x5a, 0x87, 0x00, 0xf5, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 87 00 f5 bc 87 23 01 00 00 \tpdep 0x123(%r31,%rax,4),%r31,%r15",}, +{{0x62, 0x5a, 0x6e, 0x08, 0xf5, 0xd1, }, 6, 0, "", "", +"62 5a 6e 08 f5 d1 \tpext %r25d,%edx,%r10d",}, +{{0x62, 0x5a, 0x86, 0x08, 0xf5, 0xdf, }, 6, 0, "", "", +"62 5a 86 08 f5 df \tpext %r31,%r15,%r11",}, +{{0x62, 0xda, 0x36, 0x00, 0xf5, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 36 00 f5 94 87 23 01 00 00 \tpext 0x123(%r31,%rax,4),%r25d,%edx",}, +{{0x62, 0x5a, 0x86, 0x00, 0xf5, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 86 00 f5 bc 87 23 01 00 00 \tpext 0x123(%r31,%rax,4),%r31,%r15",}, +{{0x62, 0x72, 0x35, 0x00, 0xf7, 0xd2, }, 6, 0, "", "", +"62 72 35 00 f7 d2 \tshlx %r25d,%edx,%r10d",}, +{{0x62, 0xda, 0x35, 0x00, 0xf7, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 35 00 f7 94 87 23 01 00 00 \tshlx %r25d,0x123(%r31,%rax,4),%edx",}, +{{0x62, 0x52, 0x85, 0x00, 0xf7, 0xdf, }, 6, 0, "", "", +"62 52 85 00 f7 df \tshlx %r31,%r15,%r11",}, +{{0x62, 0x5a, 0x85, 0x00, 0xf7, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 85 00 f7 bc 87 23 01 00 00 \tshlx %r31,0x123(%r31,%rax,4),%r15",}, +{{0x62, 0x72, 0x37, 0x00, 0xf7, 0xd2, }, 6, 0, "", "", +"62 72 37 00 f7 d2 \tshrx %r25d,%edx,%r10d",}, +{{0x62, 0xda, 0x37, 0x00, 0xf7, 0x94, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 37 00 f7 94 87 23 01 00 00 \tshrx %r25d,0x123(%r31,%rax,4),%edx",}, +{{0x62, 0x52, 0x87, 0x00, 0xf7, 0xdf, }, 6, 0, "", "", +"62 52 87 00 f7 df \tshrx %r31,%r15,%r11",}, +{{0x62, 0x5a, 0x87, 0x00, 0xf7, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 5a 87 00 f7 bc 87 23 01 00 00 \tshrx %r31,0x123(%r31,%rax,4),%r15",}, +{{0x62, 0xda, 0x7d, 0x08, 0x49, 0x84, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 7d 08 49 84 87 23 01 00 00 \tsttilecfg 0x123(%r31,%rax,4)",}, +{{0x62, 0xda, 0x7f, 0x08, 0x4b, 
0xb4, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 7f 08 4b b4 87 23 01 00 00 \ttileloadd 0x123(%r31,%rax,4),%tmm6",}, +{{0x62, 0xda, 0x7d, 0x08, 0x4b, 0xb4, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 7d 08 4b b4 87 23 01 00 00 \ttileloaddt1 0x123(%r31,%rax,4),%tmm6",}, +{{0x62, 0xda, 0x7e, 0x08, 0x4b, 0xb4, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 da 7e 08 4b b4 87 23 01 00 00 \ttilestored %tmm6,0x123(%r31,%rax,4)",}, +{{0x62, 0xfa, 0x7d, 0x28, 0x1a, 0x18, }, 6, 0, "", "", +"62 fa 7d 28 1a 18 \tvbroadcastf32x4 (%r16),%ymm3",}, +{{0x62, 0xfa, 0x7d, 0x28, 0x5a, 0x18, }, 6, 0, "", "", +"62 fa 7d 28 5a 18 \tvbroadcasti32x4 (%r16),%ymm3",}, +{{0x62, 0xfb, 0x7d, 0x28, 0x19, 0x18, 0x01, }, 7, 0, "", "", +"62 fb 7d 28 19 18 01 \tvextractf32x4 $0x1,%ymm3,(%r16)",}, +{{0x62, 0xfb, 0x7d, 0x28, 0x39, 0x18, 0x01, }, 7, 0, "", "", +"62 fb 7d 28 39 18 01 \tvextracti32x4 $0x1,%ymm3,(%r16)",}, +{{0x62, 0x7b, 0x65, 0x28, 0x18, 0x00, 0x01, }, 7, 0, "", "", +"62 7b 65 28 18 00 01 \tvinsertf32x4 $0x1,(%r16),%ymm3,%ymm8",}, +{{0x62, 0x7b, 0x65, 0x28, 0x38, 0x00, 0x01, }, 7, 0, "", "", +"62 7b 65 28 38 00 01 \tvinserti32x4 $0x1,(%r16),%ymm3,%ymm8",}, +{{0x62, 0xdb, 0xfd, 0x08, 0x09, 0x30, 0x01, }, 7, 0, "", "", +"62 db fd 08 09 30 01 \tvrndscalepd $0x1,(%r24),%xmm6",}, +{{0x62, 0xdb, 0x7d, 0x08, 0x08, 0x30, 0x02, }, 7, 0, "", "", +"62 db 7d 08 08 30 02 \tvrndscaleps $0x2,(%r24),%xmm6",}, +{{0x62, 0xdb, 0xcd, 0x08, 0x0b, 0x18, 0x03, }, 7, 0, "", "", +"62 db cd 08 0b 18 03 \tvrndscalesd $0x3,(%r24),%xmm6,%xmm3",}, +{{0x62, 0xdb, 0x4d, 0x08, 0x0a, 0x18, 0x04, }, 7, 0, "", "", +"62 db 4d 08 0a 18 04 \tvrndscaless $0x4,(%r24),%xmm6,%xmm3",}, +{{0x62, 0x4c, 0x7c, 0x08, 0x66, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7c 08 66 8c 87 23 01 00 00 \twrssd %r25d,0x123(%r31,%rax,4)",}, +{{0x62, 0x4c, 0xfc, 0x08, 0x66, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c fc 08 66 bc 87 23 01 00 00 \twrssq %r31,0x123(%r31,%rax,4)",}, +{{0x62, 0x4c, 0x7d, 0x08, 0x65, 0x8c, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c 7d 08 65 8c 87 23 01 00 00 \twrussd %r25d,0x123(%r31,%rax,4)",}, +{{0x62, 0x4c, 0xfd, 0x08, 0x65, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 4c fd 08 65 bc 87 23 01 00 00 \twrussq %r31,0x123(%r31,%rax,4)",}, +{{0x62, 0xf4, 0x0d, 0x10, 0x81, 0xd0, 0x34, 0x12, }, 8, 0, "", "", +"62 f4 0d 10 81 d0 34 12 \tadc $0x1234,%ax,%r30w",}, +{{0x62, 0x7c, 0x6c, 0x10, 0x10, 0xf9, }, 6, 0, "", "", +"62 7c 6c 10 10 f9 \tadc %r15b,%r17b,%r18b",}, +{{0x62, 0x54, 0x6c, 0x10, 0x11, 0x38, }, 6, 0, "", "", +"62 54 6c 10 11 38 \tadc %r15d,(%r8),%r18d",}, +{{0x62, 0xc4, 0x3c, 0x18, 0x12, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3c 18 12 04 07 \tadc (%r15,%rax,1),%r16b,%r8b",}, +{{0x62, 0xc4, 0x3d, 0x18, 0x13, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3d 18 13 04 07 \tadc (%r15,%rax,1),%r16w,%r8w",}, +{{0x62, 0xfc, 0x5c, 0x10, 0x83, 0x14, 0x83, 0x11, }, 8, 0, "", "", +"62 fc 5c 10 83 14 83 11 \tadc $0x11,(%r19,%rax,4),%r20d",}, +{{0x62, 0x54, 0x6d, 0x10, 0x66, 0xc7, }, 6, 0, "", "", +"62 54 6d 10 66 c7 \tadcx %r15d,%r8d,%r18d",}, +{{0x62, 0x14, 0xf9, 0x08, 0x66, 0x04, 0x3f, }, 7, 0, "", "", +"62 14 f9 08 66 04 3f \tadcx (%r15,%r31,1),%r8",}, +{{0x62, 0x14, 0x69, 0x10, 0x66, 0x04, 0x3f, }, 7, 0, "", "", +"62 14 69 10 66 04 3f \tadcx (%r15,%r31,1),%r8d,%r18d",}, +{{0x62, 0xf4, 0x0d, 0x10, 0x81, 0xc0, 0x34, 0x12, }, 8, 0, "", "", +"62 f4 0d 10 81 c0 34 12 \tadd $0x1234,%ax,%r30w",}, +{{0x62, 0xd4, 0xfc, 0x10, 0x81, 0xc7, 0x33, 0x44, 0x34, 0x12, }, 10, 0, "", 
"", +"62 d4 fc 10 81 c7 33 44 34 12 \tadd $0x12344433,%r15,%r16",}, +{{0x62, 0xd4, 0x74, 0x10, 0x80, 0xc5, 0x34, }, 7, 0, "", "", +"62 d4 74 10 80 c5 34 \tadd $0x34,%r13b,%r17b",}, +{{0x62, 0xf4, 0xbc, 0x18, 0x81, 0xc0, 0x11, 0x22, 0x33, 0xf4, }, 10, 0, "", "", +"62 f4 bc 18 81 c0 11 22 33 f4 \tadd $0xfffffffff4332211,%rax,%r8",}, +{{0x62, 0x44, 0xfc, 0x10, 0x01, 0xf8, }, 6, 0, "", "", +"62 44 fc 10 01 f8 \tadd %r31,%r8,%r16",}, +{{0x62, 0x44, 0xfc, 0x10, 0x01, 0x38, }, 6, 0, "", "", +"62 44 fc 10 01 38 \tadd %r31,(%r8),%r16",}, +{{0x62, 0x44, 0xf8, 0x10, 0x01, 0x3c, 0xc0, }, 7, 0, "", "", +"62 44 f8 10 01 3c c0 \tadd %r31,(%r8,%r16,8),%r16",}, +{{0x62, 0x44, 0x7c, 0x10, 0x00, 0xf8, }, 6, 0, "", "", +"62 44 7c 10 00 f8 \tadd %r31b,%r8b,%r16b",}, +{{0x62, 0x44, 0x7c, 0x10, 0x01, 0xf8, }, 6, 0, "", "", +"62 44 7c 10 01 f8 \tadd %r31d,%r8d,%r16d",}, +{{0x62, 0x44, 0x7d, 0x10, 0x01, 0xf8, }, 6, 0, "", "", +"62 44 7d 10 01 f8 \tadd %r31w,%r8w,%r16w",}, +{{0x62, 0x5c, 0xfc, 0x10, 0x03, 0x07, }, 6, 0, "", "", +"62 5c fc 10 03 07 \tadd (%r31),%r8,%r16",}, +{{0x62, 0x5c, 0xf8, 0x10, 0x03, 0x84, 0x07, 0x90, 0x90, 0x00, 0x00, }, 11, 0, "", "", +"62 5c f8 10 03 84 07 90 90 00 00 \tadd 0x9090(%r31,%r16,1),%r8,%r16",}, +{{0x62, 0x44, 0x7c, 0x10, 0x00, 0xf8, }, 6, 0, "", "", +"62 44 7c 10 00 f8 \tadd %r31b,%r8b,%r16b",}, +{{0x62, 0x44, 0x7c, 0x10, 0x01, 0xf8, }, 6, 0, "", "", +"62 44 7c 10 01 f8 \tadd %r31d,%r8d,%r16d",}, +{{0x62, 0xfc, 0x5c, 0x10, 0x83, 0x04, 0x83, 0x11, }, 8, 0, "", "", +"62 fc 5c 10 83 04 83 11 \tadd $0x11,(%r19,%rax,4),%r20d",}, +{{0x62, 0x44, 0xfc, 0x10, 0x01, 0xf8, }, 6, 0, "", "", +"62 44 fc 10 01 f8 \tadd %r31,%r8,%r16",}, +{{0x62, 0xd4, 0xfc, 0x10, 0x81, 0x04, 0x8f, 0x33, 0x44, 0x34, 0x12, }, 11, 0, "", "", +"62 d4 fc 10 81 04 8f 33 44 34 12 \tadd $0x12344433,(%r15,%rcx,4),%r16",}, +{{0x62, 0x44, 0x7d, 0x10, 0x01, 0xf8, }, 6, 0, "", "", +"62 44 7d 10 01 f8 \tadd %r31w,%r8w,%r16w",}, +{{0x62, 0x54, 0x6e, 0x10, 0x66, 0xc7, }, 6, 0, "", "", +"62 54 6e 10 66 c7 \tadox %r15d,%r8d,%r18d",}, +{{0x62, 0x5c, 0xfc, 0x10, 0x03, 0xc7, }, 6, 0, "", "", +"62 5c fc 10 03 c7 \tadd %r31,%r8,%r16",}, +{{0x62, 0x44, 0xfc, 0x10, 0x01, 0xf8, }, 6, 0, "", "", +"62 44 fc 10 01 f8 \tadd %r31,%r8,%r16",}, +{{0x62, 0x14, 0xfa, 0x08, 0x66, 0x04, 0x3f, }, 7, 0, "", "", +"62 14 fa 08 66 04 3f \tadox (%r15,%r31,1),%r8",}, +{{0x62, 0x14, 0x6a, 0x10, 0x66, 0x04, 0x3f, }, 7, 0, "", "", +"62 14 6a 10 66 04 3f \tadox (%r15,%r31,1),%r8d,%r18d",}, +{{0x62, 0xf4, 0x0d, 0x10, 0x81, 0xe0, 0x34, 0x12, }, 8, 0, "", "", +"62 f4 0d 10 81 e0 34 12 \tand $0x1234,%ax,%r30w",}, +{{0x62, 0x7c, 0x6c, 0x10, 0x20, 0xf9, }, 6, 0, "", "", +"62 7c 6c 10 20 f9 \tand %r15b,%r17b,%r18b",}, +{{0x62, 0x54, 0x6c, 0x10, 0x21, 0x38, }, 6, 0, "", "", +"62 54 6c 10 21 38 \tand %r15d,(%r8),%r18d",}, +{{0x62, 0xc4, 0x3c, 0x18, 0x22, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3c 18 22 04 07 \tand (%r15,%rax,1),%r16b,%r8b",}, +{{0x62, 0xc4, 0x3d, 0x18, 0x23, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3d 18 23 04 07 \tand (%r15,%rax,1),%r16w,%r8w",}, +{{0x62, 0xfc, 0x5c, 0x10, 0x83, 0x24, 0x83, 0x11, }, 8, 0, "", "", +"62 fc 5c 10 83 24 83 11 \tand $0x11,(%r19,%rax,4),%r20d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x47, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 47 90 90 90 90 90 \tcmova -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x43, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 43 90 90 90 90 90 \tcmovae -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x42, 0x90, 0x90, 0x90, 0x90, 
0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 42 90 90 90 90 90 \tcmovb -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x46, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 46 90 90 90 90 90 \tcmovbe -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x44, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 44 90 90 90 90 90 \tcmove -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x4f, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 4f 90 90 90 90 90 \tcmovg -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x4d, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 4d 90 90 90 90 90 \tcmovge -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x4c, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 4c 90 90 90 90 90 \tcmovl -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x4e, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 4e 90 90 90 90 90 \tcmovle -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x45, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 45 90 90 90 90 90 \tcmovne -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x41, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 41 90 90 90 90 90 \tcmovno -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x4b, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 4b 90 90 90 90 90 \tcmovnp -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x49, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 49 90 90 90 90 90 \tcmovns -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x40, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 40 90 90 90 90 90 \tcmovo -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x4a, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 4a 90 90 90 90 90 \tcmovp -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0x48, 0x90, 0x90, 0x90, 0x90, 0x90, }, 11, 0, "", "", +"67 62 f4 3c 18 48 90 90 90 90 90 \tcmovs -0x6f6f6f70(%eax),%edx,%r8d",}, +{{0x62, 0xf4, 0xf4, 0x10, 0xff, 0xc8, }, 6, 0, "", "", +"62 f4 f4 10 ff c8 \tdec %rax,%r17",}, +{{0x62, 0x9c, 0x3c, 0x18, 0xfe, 0x0c, 0x27, }, 7, 0, "", "", +"62 9c 3c 18 fe 0c 27 \tdec (%r31,%r12,1),%r8b",}, +{{0x62, 0xb4, 0xb0, 0x10, 0xaf, 0x94, 0xf8, 0x09, 0x09, 0x00, 0x00, }, 11, 0, "", "", +"62 b4 b0 10 af 94 f8 09 09 00 00 \timul 0x909(%rax,%r31,8),%rdx,%r25",}, +{{0x67, 0x62, 0xf4, 0x3c, 0x18, 0xaf, 0x90, 0x09, 0x09, 0x09, 0x00, }, 11, 0, "", "", +"67 62 f4 3c 18 af 90 09 09 09 00 \timul 0x90909(%eax),%edx,%r8d",}, +{{0x62, 0xdc, 0xfc, 0x10, 0xff, 0xc7, }, 6, 0, "", "", +"62 dc fc 10 ff c7 \tinc %r31,%r16",}, +{{0x62, 0xdc, 0xbc, 0x18, 0xff, 0xc7, }, 6, 0, "", "", +"62 dc bc 18 ff c7 \tinc %r31,%r8",}, +{{0x62, 0xf4, 0xe4, 0x18, 0xff, 0xc0, }, 6, 0, "", "", +"62 f4 e4 18 ff c0 \tinc %rax,%rbx",}, +{{0x62, 0xf4, 0xf4, 0x10, 0xf7, 0xd8, }, 6, 0, "", "", +"62 f4 f4 10 f7 d8 \tneg %rax,%r17",}, +{{0x62, 0x9c, 0x3c, 0x18, 0xf6, 0x1c, 0x27, }, 7, 0, "", "", +"62 9c 3c 18 f6 1c 27 \tneg (%r31,%r12,1),%r8b",}, +{{0x62, 0xf4, 0xf4, 0x10, 0xf7, 0xd0, }, 6, 0, "", "", +"62 f4 f4 10 f7 d0 \tnot %rax,%r17",}, +{{0x62, 0x9c, 0x3c, 0x18, 0xf6, 0x14, 0x27, }, 7, 0, "", "", +"62 9c 3c 18 f6 14 27 \tnot (%r31,%r12,1),%r8b",}, +{{0x62, 0xf4, 0x0d, 0x10, 0x81, 0xc8, 0x34, 0x12, }, 8, 0, "", "", +"62 f4 0d 10 81 c8 34 12 \tor 
$0x1234,%ax,%r30w",}, +{{0x62, 0x7c, 0x6c, 0x10, 0x08, 0xf9, }, 6, 0, "", "", +"62 7c 6c 10 08 f9 \tor %r15b,%r17b,%r18b",}, +{{0x62, 0x54, 0x6c, 0x10, 0x09, 0x38, }, 6, 0, "", "", +"62 54 6c 10 09 38 \tor %r15d,(%r8),%r18d",}, +{{0x62, 0xc4, 0x3c, 0x18, 0x0a, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3c 18 0a 04 07 \tor (%r15,%rax,1),%r16b,%r8b",}, +{{0x62, 0xc4, 0x3d, 0x18, 0x0b, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3d 18 0b 04 07 \tor (%r15,%rax,1),%r16w,%r8w",}, +{{0x62, 0xfc, 0x5c, 0x10, 0x83, 0x0c, 0x83, 0x11, }, 8, 0, "", "", +"62 fc 5c 10 83 0c 83 11 \tor $0x11,(%r19,%rax,4),%r20d",}, +{{0x62, 0xd4, 0x04, 0x10, 0xc0, 0xd4, 0x02, }, 7, 0, "", "", +"62 d4 04 10 c0 d4 02 \trcl $0x2,%r12b,%r31b",}, +{{0x62, 0xfc, 0x3c, 0x18, 0xd2, 0xd0, }, 6, 0, "", "", +"62 fc 3c 18 d2 d0 \trcl %cl,%r16b,%r8b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xd0, 0x10, }, 6, 0, "", "", +"62 f4 04 10 d0 10 \trcl $1,(%rax),%r31b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xc1, 0x10, 0x02, }, 7, 0, "", "", +"62 f4 04 10 c1 10 02 \trcl $0x2,(%rax),%r31d",}, +{{0x62, 0xf4, 0x05, 0x10, 0xd1, 0x10, }, 6, 0, "", "", +"62 f4 05 10 d1 10 \trcl $1,(%rax),%r31w",}, +{{0x62, 0xfc, 0x05, 0x10, 0xd3, 0x14, 0x83, }, 7, 0, "", "", +"62 fc 05 10 d3 14 83 \trcl %cl,(%r19,%rax,4),%r31w",}, +{{0x62, 0xd4, 0x04, 0x10, 0xc0, 0xdc, 0x02, }, 7, 0, "", "", +"62 d4 04 10 c0 dc 02 \trcr $0x2,%r12b,%r31b",}, +{{0x62, 0xfc, 0x3c, 0x18, 0xd2, 0xd8, }, 6, 0, "", "", +"62 fc 3c 18 d2 d8 \trcr %cl,%r16b,%r8b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xd0, 0x18, }, 6, 0, "", "", +"62 f4 04 10 d0 18 \trcr $1,(%rax),%r31b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xc1, 0x18, 0x02, }, 7, 0, "", "", +"62 f4 04 10 c1 18 02 \trcr $0x2,(%rax),%r31d",}, +{{0x62, 0xf4, 0x05, 0x10, 0xd1, 0x18, }, 6, 0, "", "", +"62 f4 05 10 d1 18 \trcr $1,(%rax),%r31w",}, +{{0x62, 0xfc, 0x05, 0x10, 0xd3, 0x1c, 0x83, }, 7, 0, "", "", +"62 fc 05 10 d3 1c 83 \trcr %cl,(%r19,%rax,4),%r31w",}, +{{0x62, 0xd4, 0x04, 0x10, 0xc0, 0xc4, 0x02, }, 7, 0, "", "", +"62 d4 04 10 c0 c4 02 \trol $0x2,%r12b,%r31b",}, +{{0x62, 0xfc, 0x3c, 0x18, 0xd2, 0xc0, }, 6, 0, "", "", +"62 fc 3c 18 d2 c0 \trol %cl,%r16b,%r8b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xd0, 0x00, }, 6, 0, "", "", +"62 f4 04 10 d0 00 \trol $1,(%rax),%r31b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xc1, 0x00, 0x02, }, 7, 0, "", "", +"62 f4 04 10 c1 00 02 \trol $0x2,(%rax),%r31d",}, +{{0x62, 0xf4, 0x05, 0x10, 0xd1, 0x00, }, 6, 0, "", "", +"62 f4 05 10 d1 00 \trol $1,(%rax),%r31w",}, +{{0x62, 0xfc, 0x05, 0x10, 0xd3, 0x04, 0x83, }, 7, 0, "", "", +"62 fc 05 10 d3 04 83 \trol %cl,(%r19,%rax,4),%r31w",}, +{{0x62, 0xd4, 0x04, 0x10, 0xc0, 0xcc, 0x02, }, 7, 0, "", "", +"62 d4 04 10 c0 cc 02 \tror $0x2,%r12b,%r31b",}, +{{0x62, 0xfc, 0x3c, 0x18, 0xd2, 0xc8, }, 6, 0, "", "", +"62 fc 3c 18 d2 c8 \tror %cl,%r16b,%r8b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xd0, 0x08, }, 6, 0, "", "", +"62 f4 04 10 d0 08 \tror $1,(%rax),%r31b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xc1, 0x08, 0x02, }, 7, 0, "", "", +"62 f4 04 10 c1 08 02 \tror $0x2,(%rax),%r31d",}, +{{0x62, 0xf4, 0x05, 0x10, 0xd1, 0x08, }, 6, 0, "", "", +"62 f4 05 10 d1 08 \tror $1,(%rax),%r31w",}, +{{0x62, 0xfc, 0x05, 0x10, 0xd3, 0x0c, 0x83, }, 7, 0, "", "", +"62 fc 05 10 d3 0c 83 \tror %cl,(%r19,%rax,4),%r31w",}, +{{0x62, 0xd4, 0x04, 0x10, 0xc0, 0xfc, 0x02, }, 7, 0, "", "", +"62 d4 04 10 c0 fc 02 \tsar $0x2,%r12b,%r31b",}, +{{0x62, 0xfc, 0x3c, 0x18, 0xd2, 0xf8, }, 6, 0, "", "", +"62 fc 3c 18 d2 f8 \tsar %cl,%r16b,%r8b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xd0, 0x38, }, 6, 0, "", "", +"62 f4 04 10 d0 38 \tsar $1,(%rax),%r31b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xc1, 0x38, 0x02, }, 
7, 0, "", "", +"62 f4 04 10 c1 38 02 \tsar $0x2,(%rax),%r31d",}, +{{0x62, 0xf4, 0x05, 0x10, 0xd1, 0x38, }, 6, 0, "", "", +"62 f4 05 10 d1 38 \tsar $1,(%rax),%r31w",}, +{{0x62, 0xfc, 0x05, 0x10, 0xd3, 0x3c, 0x83, }, 7, 0, "", "", +"62 fc 05 10 d3 3c 83 \tsar %cl,(%r19,%rax,4),%r31w",}, +{{0x62, 0xf4, 0x0d, 0x10, 0x81, 0xd8, 0x34, 0x12, }, 8, 0, "", "", +"62 f4 0d 10 81 d8 34 12 \tsbb $0x1234,%ax,%r30w",}, +{{0x62, 0x7c, 0x6c, 0x10, 0x18, 0xf9, }, 6, 0, "", "", +"62 7c 6c 10 18 f9 \tsbb %r15b,%r17b,%r18b",}, +{{0x62, 0x54, 0x6c, 0x10, 0x19, 0x38, }, 6, 0, "", "", +"62 54 6c 10 19 38 \tsbb %r15d,(%r8),%r18d",}, +{{0x62, 0xc4, 0x3c, 0x18, 0x1a, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3c 18 1a 04 07 \tsbb (%r15,%rax,1),%r16b,%r8b",}, +{{0x62, 0xc4, 0x3d, 0x18, 0x1b, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3d 18 1b 04 07 \tsbb (%r15,%rax,1),%r16w,%r8w",}, +{{0x62, 0xfc, 0x5c, 0x10, 0x83, 0x1c, 0x83, 0x11, }, 8, 0, "", "", +"62 fc 5c 10 83 1c 83 11 \tsbb $0x11,(%r19,%rax,4),%r20d",}, +{{0x62, 0xd4, 0x04, 0x10, 0xc0, 0xe4, 0x02, }, 7, 0, "", "", +"62 d4 04 10 c0 e4 02 \tshl $0x2,%r12b,%r31b",}, +{{0x62, 0xd4, 0x04, 0x10, 0xc0, 0xe4, 0x02, }, 7, 0, "", "", +"62 d4 04 10 c0 e4 02 \tshl $0x2,%r12b,%r31b",}, +{{0x62, 0xfc, 0x3c, 0x18, 0xd2, 0xe0, }, 6, 0, "", "", +"62 fc 3c 18 d2 e0 \tshl %cl,%r16b,%r8b",}, +{{0x62, 0xfc, 0x3c, 0x18, 0xd2, 0xe0, }, 6, 0, "", "", +"62 fc 3c 18 d2 e0 \tshl %cl,%r16b,%r8b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xd0, 0x20, }, 6, 0, "", "", +"62 f4 04 10 d0 20 \tshl $1,(%rax),%r31b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xd0, 0x20, }, 6, 0, "", "", +"62 f4 04 10 d0 20 \tshl $1,(%rax),%r31b",}, +{{0x62, 0x74, 0x84, 0x10, 0x24, 0x20, 0x01, }, 7, 0, "", "", +"62 74 84 10 24 20 01 \tshld $0x1,%r12,(%rax),%r31",}, +{{0x62, 0x74, 0x04, 0x10, 0x24, 0x38, 0x02, }, 7, 0, "", "", +"62 74 04 10 24 38 02 \tshld $0x2,%r15d,(%rax),%r31d",}, +{{0x62, 0x54, 0x05, 0x10, 0x24, 0xc4, 0x02, }, 7, 0, "", "", +"62 54 05 10 24 c4 02 \tshld $0x2,%r8w,%r12w,%r31w",}, +{{0x62, 0x7c, 0xbc, 0x18, 0xa5, 0xe0, }, 6, 0, "", "", +"62 7c bc 18 a5 e0 \tshld %cl,%r12,%r16,%r8",}, +{{0x62, 0x7c, 0x05, 0x10, 0xa5, 0x2c, 0x83, }, 7, 0, "", "", +"62 7c 05 10 a5 2c 83 \tshld %cl,%r13w,(%r19,%rax,4),%r31w",}, +{{0x62, 0x74, 0x05, 0x10, 0xa5, 0x08, }, 6, 0, "", "", +"62 74 05 10 a5 08 \tshld %cl,%r9w,(%rax),%r31w",}, +{{0x62, 0xf4, 0x04, 0x10, 0xc1, 0x20, 0x02, }, 7, 0, "", "", +"62 f4 04 10 c1 20 02 \tshl $0x2,(%rax),%r31d",}, +{{0x62, 0xf4, 0x04, 0x10, 0xc1, 0x20, 0x02, }, 7, 0, "", "", +"62 f4 04 10 c1 20 02 \tshl $0x2,(%rax),%r31d",}, +{{0x62, 0xf4, 0x05, 0x10, 0xd1, 0x20, }, 6, 0, "", "", +"62 f4 05 10 d1 20 \tshl $1,(%rax),%r31w",}, +{{0x62, 0xf4, 0x05, 0x10, 0xd1, 0x20, }, 6, 0, "", "", +"62 f4 05 10 d1 20 \tshl $1,(%rax),%r31w",}, +{{0x62, 0xfc, 0x05, 0x10, 0xd3, 0x24, 0x83, }, 7, 0, "", "", +"62 fc 05 10 d3 24 83 \tshl %cl,(%r19,%rax,4),%r31w",}, +{{0x62, 0xfc, 0x05, 0x10, 0xd3, 0x24, 0x83, }, 7, 0, "", "", +"62 fc 05 10 d3 24 83 \tshl %cl,(%r19,%rax,4),%r31w",}, +{{0x62, 0xd4, 0x04, 0x10, 0xc0, 0xec, 0x02, }, 7, 0, "", "", +"62 d4 04 10 c0 ec 02 \tshr $0x2,%r12b,%r31b",}, +{{0x62, 0xfc, 0x3c, 0x18, 0xd2, 0xe8, }, 6, 0, "", "", +"62 fc 3c 18 d2 e8 \tshr %cl,%r16b,%r8b",}, +{{0x62, 0xf4, 0x04, 0x10, 0xd0, 0x28, }, 6, 0, "", "", +"62 f4 04 10 d0 28 \tshr $1,(%rax),%r31b",}, +{{0x62, 0x74, 0x84, 0x10, 0x2c, 0x20, 0x01, }, 7, 0, "", "", +"62 74 84 10 2c 20 01 \tshrd $0x1,%r12,(%rax),%r31",}, +{{0x62, 0x74, 0x04, 0x10, 0x2c, 0x38, 0x02, }, 7, 0, "", "", +"62 74 04 10 2c 38 02 \tshrd $0x2,%r15d,(%rax),%r31d",}, +{{0x62, 0x54, 0x05, 
0x10, 0x2c, 0xc4, 0x02, }, 7, 0, "", "", +"62 54 05 10 2c c4 02 \tshrd $0x2,%r8w,%r12w,%r31w",}, +{{0x62, 0x7c, 0xbc, 0x18, 0xad, 0xe0, }, 6, 0, "", "", +"62 7c bc 18 ad e0 \tshrd %cl,%r12,%r16,%r8",}, +{{0x62, 0x7c, 0x05, 0x10, 0xad, 0x2c, 0x83, }, 7, 0, "", "", +"62 7c 05 10 ad 2c 83 \tshrd %cl,%r13w,(%r19,%rax,4),%r31w",}, +{{0x62, 0x74, 0x05, 0x10, 0xad, 0x08, }, 6, 0, "", "", +"62 74 05 10 ad 08 \tshrd %cl,%r9w,(%rax),%r31w",}, +{{0x62, 0xf4, 0x04, 0x10, 0xc1, 0x28, 0x02, }, 7, 0, "", "", +"62 f4 04 10 c1 28 02 \tshr $0x2,(%rax),%r31d",}, +{{0x62, 0xf4, 0x05, 0x10, 0xd1, 0x28, }, 6, 0, "", "", +"62 f4 05 10 d1 28 \tshr $1,(%rax),%r31w",}, +{{0x62, 0xfc, 0x05, 0x10, 0xd3, 0x2c, 0x83, }, 7, 0, "", "", +"62 fc 05 10 d3 2c 83 \tshr %cl,(%r19,%rax,4),%r31w",}, +{{0x62, 0xf4, 0x0d, 0x10, 0x81, 0xe8, 0x34, 0x12, }, 8, 0, "", "", +"62 f4 0d 10 81 e8 34 12 \tsub $0x1234,%ax,%r30w",}, +{{0x62, 0x7c, 0x6c, 0x10, 0x28, 0xf9, }, 6, 0, "", "", +"62 7c 6c 10 28 f9 \tsub %r15b,%r17b,%r18b",}, +{{0x62, 0x54, 0x6c, 0x10, 0x29, 0x38, }, 6, 0, "", "", +"62 54 6c 10 29 38 \tsub %r15d,(%r8),%r18d",}, +{{0x62, 0xc4, 0x3c, 0x18, 0x2a, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3c 18 2a 04 07 \tsub (%r15,%rax,1),%r16b,%r8b",}, +{{0x62, 0xc4, 0x3d, 0x18, 0x2b, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3d 18 2b 04 07 \tsub (%r15,%rax,1),%r16w,%r8w",}, +{{0x62, 0xfc, 0x5c, 0x10, 0x83, 0x2c, 0x83, 0x11, }, 8, 0, "", "", +"62 fc 5c 10 83 2c 83 11 \tsub $0x11,(%r19,%rax,4),%r20d",}, +{{0x62, 0xf4, 0x0d, 0x10, 0x81, 0xf0, 0x34, 0x12, }, 8, 0, "", "", +"62 f4 0d 10 81 f0 34 12 \txor $0x1234,%ax,%r30w",}, +{{0x62, 0x7c, 0x6c, 0x10, 0x30, 0xf9, }, 6, 0, "", "", +"62 7c 6c 10 30 f9 \txor %r15b,%r17b,%r18b",}, +{{0x62, 0x54, 0x6c, 0x10, 0x31, 0x38, }, 6, 0, "", "", +"62 54 6c 10 31 38 \txor %r15d,(%r8),%r18d",}, +{{0x62, 0xc4, 0x3c, 0x18, 0x32, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3c 18 32 04 07 \txor (%r15,%rax,1),%r16b,%r8b",}, +{{0x62, 0xc4, 0x3d, 0x18, 0x33, 0x04, 0x07, }, 7, 0, "", "", +"62 c4 3d 18 33 04 07 \txor (%r15,%rax,1),%r16w,%r8w",}, +{{0x62, 0xfc, 0x5c, 0x10, 0x83, 0x34, 0x83, 0x11, }, 8, 0, "", "", +"62 fc 5c 10 83 34 83 11 \txor $0x11,(%r19,%rax,4),%r20d",}, +{{0x62, 0xf4, 0x3c, 0x1c, 0x00, 0xda, }, 6, 0, "", "", +"62 f4 3c 1c 00 da \t{nf} add %bl,%dl,%r8b",}, +{{0x62, 0xf4, 0x35, 0x1c, 0x01, 0xd0, }, 6, 0, "", "", +"62 f4 35 1c 01 d0 \t{nf} add %dx,%ax,%r9w",}, +{{0x62, 0xd4, 0x6c, 0x1c, 0x02, 0x9c, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 6c 1c 02 9c 80 23 01 00 00 \t{nf} add 0x123(%r8,%rax,4),%bl,%dl",}, +{{0x62, 0xd4, 0x7d, 0x1c, 0x03, 0x94, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 7d 1c 03 94 80 23 01 00 00 \t{nf} add 0x123(%r8,%rax,4),%dx,%ax",}, +{{0x62, 0xf4, 0x3c, 0x1c, 0x08, 0xda, }, 6, 0, "", "", +"62 f4 3c 1c 08 da \t{nf} or %bl,%dl,%r8b",}, +{{0x62, 0xf4, 0x35, 0x1c, 0x09, 0xd0, }, 6, 0, "", "", +"62 f4 35 1c 09 d0 \t{nf} or %dx,%ax,%r9w",}, +{{0x62, 0xd4, 0x6c, 0x1c, 0x0a, 0x9c, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 6c 1c 0a 9c 80 23 01 00 00 \t{nf} or 0x123(%r8,%rax,4),%bl,%dl",}, +{{0x62, 0xd4, 0x7d, 0x1c, 0x0b, 0x94, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 7d 1c 0b 94 80 23 01 00 00 \t{nf} or 0x123(%r8,%rax,4),%dx,%ax",}, +{{0x62, 0xf4, 0x3c, 0x1c, 0x20, 0xda, }, 6, 0, "", "", +"62 f4 3c 1c 20 da \t{nf} and %bl,%dl,%r8b",}, +{{0x62, 0xf4, 0x35, 0x1c, 0x21, 0xd0, }, 6, 0, "", "", +"62 f4 35 1c 21 d0 \t{nf} and %dx,%ax,%r9w",}, +{{0x62, 0xd4, 0x6c, 0x1c, 0x22, 0x9c, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 6c 1c 22 9c 80 
23 01 00 00 \t{nf} and 0x123(%r8,%rax,4),%bl,%dl",}, +{{0x62, 0xd4, 0x7d, 0x1c, 0x23, 0x94, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 7d 1c 23 94 80 23 01 00 00 \t{nf} and 0x123(%r8,%rax,4),%dx,%ax",}, +{{0x62, 0xf4, 0x35, 0x1c, 0x24, 0xd0, 0x7b, }, 7, 0, "", "", +"62 f4 35 1c 24 d0 7b \t{nf} shld $0x7b,%dx,%ax,%r9w",}, +{{0x62, 0xf4, 0x3c, 0x1c, 0x28, 0xda, }, 6, 0, "", "", +"62 f4 3c 1c 28 da \t{nf} sub %bl,%dl,%r8b",}, +{{0x62, 0xf4, 0x35, 0x1c, 0x29, 0xd0, }, 6, 0, "", "", +"62 f4 35 1c 29 d0 \t{nf} sub %dx,%ax,%r9w",}, +{{0x62, 0xd4, 0x6c, 0x1c, 0x2a, 0x9c, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 6c 1c 2a 9c 80 23 01 00 00 \t{nf} sub 0x123(%r8,%rax,4),%bl,%dl",}, +{{0x62, 0xd4, 0x7d, 0x1c, 0x2b, 0x94, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 7d 1c 2b 94 80 23 01 00 00 \t{nf} sub 0x123(%r8,%rax,4),%dx,%ax",}, +{{0x62, 0xf4, 0x35, 0x1c, 0x2c, 0xd0, 0x7b, }, 7, 0, "", "", +"62 f4 35 1c 2c d0 7b \t{nf} shrd $0x7b,%dx,%ax,%r9w",}, +{{0x62, 0xf4, 0x3c, 0x1c, 0x30, 0xda, }, 6, 0, "", "", +"62 f4 3c 1c 30 da \t{nf} xor %bl,%dl,%r8b",}, +{{0x62, 0x4c, 0xfc, 0x0c, 0x31, 0xff, }, 6, 0, "", "", +"62 4c fc 0c 31 ff \t{nf} xor %r31,%r31",}, +{{0x62, 0xd4, 0x6c, 0x1c, 0x32, 0x9c, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 6c 1c 32 9c 80 23 01 00 00 \t{nf} xor 0x123(%r8,%rax,4),%bl,%dl",}, +{{0x62, 0xd4, 0x7d, 0x1c, 0x33, 0x94, 0x80, 0x23, 0x01, 0x00, 0x00, }, 11, 0, "", "", +"62 d4 7d 1c 33 94 80 23 01 00 00 \t{nf} xor 0x123(%r8,%rax,4),%dx,%ax",}, +{{0x62, 0x54, 0xfc, 0x0c, 0x69, 0xf9, 0x90, 0xff, 0x00, 0x00, }, 10, 0, "", "", +"62 54 fc 0c 69 f9 90 ff 00 00 \t{nf} imul $0xff90,%r9,%r15",}, +{{0x62, 0x54, 0xfc, 0x0c, 0x6b, 0xf9, 0x7b, }, 7, 0, "", "", +"62 54 fc 0c 6b f9 7b \t{nf} imul $0x7b,%r9,%r15",}, +{{0x62, 0xf4, 0x6c, 0x1c, 0x80, 0xf3, 0x7b, }, 7, 0, "", "", +"62 f4 6c 1c 80 f3 7b \t{nf} xor $0x7b,%bl,%dl",}, +{{0x62, 0xf4, 0x7d, 0x1c, 0x83, 0xf2, 0x7b, }, 7, 0, "", "", +"62 f4 7d 1c 83 f2 7b \t{nf} xor $0x7b,%dx,%ax",}, +{{0x62, 0x44, 0xfc, 0x0c, 0x88, 0xf9, }, 6, 0, "", "", +"62 44 fc 0c 88 f9 \t{nf} popcnt %r9,%r31",}, +{{0x62, 0xf4, 0x35, 0x1c, 0xa5, 0xd0, }, 6, 0, "", "", +"62 f4 35 1c a5 d0 \t{nf} shld %cl,%dx,%ax,%r9w",}, +{{0x62, 0xf4, 0x35, 0x1c, 0xad, 0xd0, }, 6, 0, "", "", +"62 f4 35 1c ad d0 \t{nf} shrd %cl,%dx,%ax,%r9w",}, +{{0x62, 0x44, 0xa4, 0x1c, 0xaf, 0xf9, }, 6, 0, "", "", +"62 44 a4 1c af f9 \t{nf} imul %r9,%r31,%r11",}, +{{0x62, 0xf4, 0x6c, 0x1c, 0xc0, 0xfb, 0x7b, }, 7, 0, "", "", +"62 f4 6c 1c c0 fb 7b \t{nf} sar $0x7b,%bl,%dl",}, +{{0x62, 0xf4, 0x7d, 0x1c, 0xc1, 0xfa, 0x7b, }, 7, 0, "", "", +"62 f4 7d 1c c1 fa 7b \t{nf} sar $0x7b,%dx,%ax",}, +{{0x62, 0xf4, 0x6c, 0x1c, 0xd0, 0xfb, }, 6, 0, "", "", +"62 f4 6c 1c d0 fb \t{nf} sar $1,%bl,%dl",}, +{{0x62, 0xf4, 0x7d, 0x1c, 0xd1, 0xfa, }, 6, 0, "", "", +"62 f4 7d 1c d1 fa \t{nf} sar $1,%dx,%ax",}, +{{0x62, 0xf4, 0x6c, 0x1c, 0xd2, 0xfb, }, 6, 0, "", "", +"62 f4 6c 1c d2 fb \t{nf} sar %cl,%bl,%dl",}, +{{0x62, 0xf4, 0x7d, 0x1c, 0xd3, 0xfa, }, 6, 0, "", "", +"62 f4 7d 1c d3 fa \t{nf} sar %cl,%dx,%ax",}, +{{0x62, 0x52, 0x84, 0x04, 0xf2, 0xd9, }, 6, 0, "", "", +"62 52 84 04 f2 d9 \t{nf} andn %r9,%r31,%r11",}, +{{0x62, 0xd2, 0x84, 0x04, 0xf3, 0xd9, }, 6, 0, "", "", +"62 d2 84 04 f3 d9 \t{nf} blsi %r9,%r31",}, +{{0x62, 0x44, 0xfc, 0x0c, 0xf4, 0xf9, }, 6, 0, "", "", +"62 44 fc 0c f4 f9 \t{nf} tzcnt %r9,%r31",}, +{{0x62, 0x44, 0xfc, 0x0c, 0xf5, 0xf9, }, 6, 0, "", "", +"62 44 fc 0c f5 f9 \t{nf} lzcnt %r9,%r31",}, +{{0x62, 0xf4, 0x7c, 0x0c, 0xf6, 0xfb, }, 6, 0, "", "", +"62 f4 7c 
0c f6 fb \t{nf} idiv %bl",}, +{{0x62, 0xf4, 0x7d, 0x0c, 0xf7, 0xfa, }, 6, 0, "", "", +"62 f4 7d 0c f7 fa \t{nf} idiv %dx",}, +{{0x62, 0xf4, 0x6c, 0x1c, 0xfe, 0xcb, }, 6, 0, "", "", +"62 f4 6c 1c fe cb \t{nf} dec %bl,%dl",}, +{{0x62, 0xf4, 0x7d, 0x1c, 0xff, 0xca, }, 6, 0, "", "", +"62 f4 7d 1c ff ca \t{nf} dec %dx,%ax",}, +{{0xf3, 0x0f, 0x38, 0xdc, 0xd1, }, 5, 0, "", "", +"f3 0f 38 dc d1 \tloadiwkey %xmm1,%xmm2",}, +{{0xf3, 0x0f, 0x38, 0xfa, 0xd0, }, 5, 0, "", "", +"f3 0f 38 fa d0 \tencodekey128 %eax,%edx",}, +{{0xf3, 0x0f, 0x38, 0xfb, 0xd0, }, 5, 0, "", "", +"f3 0f 38 fb d0 \tencodekey256 %eax,%edx",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xdc, 0x5a, 0x77, }, 7, 0, "", "", +"67 f3 0f 38 dc 5a 77 \taesenc128kl 0x77(%edx),%xmm3",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xde, 0x5a, 0x77, }, 7, 0, "", "", +"67 f3 0f 38 de 5a 77 \taesenc256kl 0x77(%edx),%xmm3",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xdd, 0x5a, 0x77, }, 7, 0, "", "", +"67 f3 0f 38 dd 5a 77 \taesdec128kl 0x77(%edx),%xmm3",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xdf, 0x5a, 0x77, }, 7, 0, "", "", +"67 f3 0f 38 df 5a 77 \taesdec256kl 0x77(%edx),%xmm3",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xd8, 0x42, 0x77, }, 7, 0, "", "", +"67 f3 0f 38 d8 42 77 \taesencwide128kl 0x77(%edx)",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xd8, 0x52, 0x77, }, 7, 0, "", "", +"67 f3 0f 38 d8 52 77 \taesencwide256kl 0x77(%edx)",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xd8, 0x4a, 0x77, }, 7, 0, "", "", +"67 f3 0f 38 d8 4a 77 \taesdecwide128kl 0x77(%edx)",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xd8, 0x5a, 0x77, }, 7, 0, "", "", +"67 f3 0f 38 d8 5a 77 \taesdecwide256kl 0x77(%edx)",}, +{{0x67, 0x0f, 0x38, 0xfc, 0x08, }, 5, 0, "", "", +"67 0f 38 fc 08 \taadd %ecx,(%eax)",}, +{{0x0f, 0x38, 0xfc, 0x14, 0x25, 0x78, 0x56, 0x34, 0x12, }, 9, 0, "", "", +"0f 38 fc 14 25 78 56 34 12 \taadd %edx,0x12345678",}, +{{0x67, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"67 0f 38 fc 94 c8 78 56 34 12 \taadd %edx,0x12345678(%eax,%ecx,8)",}, +{{0x67, 0x66, 0x0f, 0x38, 0xfc, 0x08, }, 6, 0, "", "", +"67 66 0f 38 fc 08 \taand %ecx,(%eax)",}, +{{0x66, 0x0f, 0x38, 0xfc, 0x14, 0x25, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"66 0f 38 fc 14 25 78 56 34 12 \taand %edx,0x12345678",}, +{{0x67, 0x66, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"67 66 0f 38 fc 94 c8 78 56 34 12 \taand %edx,0x12345678(%eax,%ecx,8)",}, +{{0x67, 0xf2, 0x0f, 0x38, 0xfc, 0x08, }, 6, 0, "", "", +"67 f2 0f 38 fc 08 \taor %ecx,(%eax)",}, +{{0xf2, 0x0f, 0x38, 0xfc, 0x14, 0x25, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"f2 0f 38 fc 14 25 78 56 34 12 \taor %edx,0x12345678",}, +{{0x67, 0xf2, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"67 f2 0f 38 fc 94 c8 78 56 34 12 \taor %edx,0x12345678(%eax,%ecx,8)",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xfc, 0x08, }, 6, 0, "", "", +"67 f3 0f 38 fc 08 \taxor %ecx,(%eax)",}, +{{0xf3, 0x0f, 0x38, 0xfc, 0x14, 0x25, 0x78, 0x56, 0x34, 0x12, }, 10, 0, "", "", +"f3 0f 38 fc 14 25 78 56 34 12 \taxor %edx,0x12345678",}, +{{0x67, 0xf3, 0x0f, 0x38, 0xfc, 0x94, 0xc8, 0x78, 0x56, 0x34, 0x12, }, 11, 0, "", "", +"67 f3 0f 38 fc 94 c8 78 56 34 12 \taxor %edx,0x12345678(%eax,%ecx,8)",}, +{{0x67, 0xc4, 0xe2, 0x7a, 0xb1, 0x31, }, 6, 0, "", "", +"67 c4 e2 7a b1 31 \tvbcstnebf162ps (%ecx),%xmm6",}, +{{0x67, 0xc4, 0xe2, 0x79, 0xb1, 0x31, }, 6, 0, "", "", +"67 c4 e2 79 b1 31 \tvbcstnesh2ps (%ecx),%xmm6",}, +{{0x67, 0xc4, 0xe2, 0x7a, 0xb0, 0x31, }, 6, 0, "", "", +"67 c4 e2 7a b0 31 \tvcvtneebf162ps (%ecx),%xmm6",}, +{{0x67, 0xc4, 0xe2, 0x79, 0xb0, 0x31, }, 6, 0, "", "", +"67 c4 e2 79 b0 31 \tvcvtneeph2ps 
(%ecx),%xmm6",}, +{{0x67, 0xc4, 0xe2, 0x7b, 0xb0, 0x31, }, 6, 0, "", "", +"67 c4 e2 7b b0 31 \tvcvtneobf162ps (%ecx),%xmm6",}, +{{0x67, 0xc4, 0xe2, 0x78, 0xb0, 0x31, }, 6, 0, "", "", +"67 c4 e2 78 b0 31 \tvcvtneoph2ps (%ecx),%xmm6",}, +{{0x62, 0xf2, 0x7e, 0x08, 0x72, 0xf1, }, 6, 0, "", "", +"62 f2 7e 08 72 f1 \tvcvtneps2bf16 %xmm1,%xmm6",}, +{{0xc4, 0xe2, 0x6b, 0x50, 0xd9, }, 5, 0, "", "", +"c4 e2 6b 50 d9 \tvpdpbssd %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6b, 0x51, 0xd9, }, 5, 0, "", "", +"c4 e2 6b 51 d9 \tvpdpbssds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0x50, 0xd9, }, 5, 0, "", "", +"c4 e2 6a 50 d9 \tvpdpbsud %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0x51, 0xd9, }, 5, 0, "", "", +"c4 e2 6a 51 d9 \tvpdpbsuds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x68, 0x50, 0xd9, }, 5, 0, "", "", +"c4 e2 68 50 d9 \tvpdpbuud %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x68, 0x51, 0xd9, }, 5, 0, "", "", +"c4 e2 68 51 d9 \tvpdpbuuds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0xd2, 0xd9, }, 5, 0, "", "", +"c4 e2 6a d2 d9 \tvpdpwsud %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0xd3, 0xd9, }, 5, 0, "", "", +"c4 e2 6a d3 d9 \tvpdpwsuds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x69, 0xd2, 0xd9, }, 5, 0, "", "", +"c4 e2 69 d2 d9 \tvpdpwusd %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x69, 0xd3, 0xd9, }, 5, 0, "", "", +"c4 e2 69 d3 d9 \tvpdpwusds %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x68, 0xd2, 0xd9, }, 5, 0, "", "", +"c4 e2 68 d2 d9 \tvpdpwuud %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x68, 0xd3, 0xd9, }, 5, 0, "", "", +"c4 e2 68 d3 d9 \tvpdpwuuds %xmm1,%xmm2,%xmm3",}, +{{0x62, 0xf2, 0xed, 0x08, 0xb5, 0xd9, }, 6, 0, "", "", +"62 f2 ed 08 b5 d9 \tvpmadd52huq %xmm1,%xmm2,%xmm3",}, +{{0x62, 0xf2, 0xed, 0x08, 0xb4, 0xd9, }, 6, 0, "", "", +"62 f2 ed 08 b4 d9 \tvpmadd52luq %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x7f, 0xcc, 0xd1, }, 5, 0, "", "", +"c4 e2 7f cc d1 \tvsha512msg1 %xmm1,%ymm2",}, +{{0xc4, 0xe2, 0x7f, 0xcd, 0xd1, }, 5, 0, "", "", +"c4 e2 7f cd d1 \tvsha512msg2 %ymm1,%ymm2",}, +{{0xc4, 0xe2, 0x6f, 0xcb, 0xd9, }, 5, 0, "", "", +"c4 e2 6f cb d9 \tvsha512rnds2 %xmm1,%ymm2,%ymm3",}, +{{0xc4, 0xe2, 0x68, 0xda, 0xd9, }, 5, 0, "", "", +"c4 e2 68 da d9 \tvsm3msg1 %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x69, 0xda, 0xd9, }, 5, 0, "", "", +"c4 e2 69 da d9 \tvsm3msg2 %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe3, 0x69, 0xde, 0xd9, 0xa1, }, 6, 0, "", "", +"c4 e3 69 de d9 a1 \tvsm3rnds2 $0xa1,%xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6a, 0xda, 0xd9, }, 5, 0, "", "", +"c4 e2 6a da d9 \tvsm4key4 %xmm1,%xmm2,%xmm3",}, +{{0xc4, 0xe2, 0x6b, 0xda, 0xd9, }, 5, 0, "", "", +"c4 e2 6b da d9 \tvsm4rnds4 %xmm1,%xmm2,%xmm3",}, +{{0x67, 0x0f, 0x0d, 0x00, }, 4, 0, "", "", +"67 0f 0d 00 \tprefetch (%eax)",}, +{{0x67, 0x0f, 0x18, 0x08, }, 4, 0, "", "", +"67 0f 18 08 \tprefetcht0 (%eax)",}, +{{0x67, 0x0f, 0x18, 0x10, }, 4, 0, "", "", +"67 0f 18 10 \tprefetcht1 (%eax)",}, +{{0x67, 0x0f, 0x18, 0x18, }, 4, 0, "", "", +"67 0f 18 18 \tprefetcht2 (%eax)",}, +{{0x67, 0x0f, 0x18, 0x00, }, 4, 0, "", "", +"67 0f 18 00 \tprefetchnta (%eax)",}, +{{0x0f, 0x01, 0xc6, }, 3, 0, "", "", +"0f 01 c6 \twrmsrns",}, {{0xf3, 0x0f, 0x3a, 0xf0, 0xc0, 0x00, }, 6, 0, "", "", "f3 0f 3a f0 c0 00 \threset $0x0",}, {{0x0f, 0x01, 0xe8, }, 3, 0, "", "", diff --git a/tools/perf/arch/x86/tests/insn-x86-dat-src.c b/tools/perf/arch/x86/tests/insn-x86-dat-src.c index a391464c8dee..f55505c75d51 100644 --- a/tools/perf/arch/x86/tests/insn-x86-dat-src.c +++ b/tools/perf/arch/x86/tests/insn-x86-dat-src.c @@ -2628,6 +2628,512 @@ int main(void) asm volatile("vucomish 0x12345678(%rax,%rcx,8), %xmm1"); asm volatile("vucomish 
0x12345678(%eax,%ecx,8), %xmm1"); + /* Key Locker */ + + asm volatile("loadiwkey %xmm1, %xmm2"); + asm volatile("encodekey128 %eax, %edx"); + asm volatile("encodekey256 %eax, %edx"); + asm volatile("aesenc128kl 0x77(%rdx), %xmm3"); + asm volatile("aesenc256kl 0x77(%rdx), %xmm3"); + asm volatile("aesdec128kl 0x77(%rdx), %xmm3"); + asm volatile("aesdec256kl 0x77(%rdx), %xmm3"); + asm volatile("aesencwide128kl 0x77(%rdx)"); + asm volatile("aesencwide256kl 0x77(%rdx)"); + asm volatile("aesdecwide128kl 0x77(%rdx)"); + asm volatile("aesdecwide256kl 0x77(%rdx)"); + + /* Remote Atomic Operations */ + + asm volatile("aadd %ecx,(%rax)"); + asm volatile("aadd %edx,(%r8)"); + asm volatile("aadd %edx,0x12345678(%rax,%rcx,8)"); + asm volatile("aadd %edx,0x12345678(%r8,%rcx,8)"); + asm volatile("aadd %rcx,(%rax)"); + asm volatile("aadd %rdx,(%r8)"); + asm volatile("aadd %rdx,(0x12345678)"); + asm volatile("aadd %rdx,0x12345678(%rax,%rcx,8)"); + asm volatile("aadd %rdx,0x12345678(%r8,%rcx,8)"); + + asm volatile("aand %ecx,(%rax)"); + asm volatile("aand %edx,(%r8)"); + asm volatile("aand %edx,0x12345678(%rax,%rcx,8)"); + asm volatile("aand %edx,0x12345678(%r8,%rcx,8)"); + asm volatile("aand %rcx,(%rax)"); + asm volatile("aand %rdx,(%r8)"); + asm volatile("aand %rdx,(0x12345678)"); + asm volatile("aand %rdx,0x12345678(%rax,%rcx,8)"); + asm volatile("aand %rdx,0x12345678(%r8,%rcx,8)"); + + asm volatile("aor %ecx,(%rax)"); + asm volatile("aor %edx,(%r8)"); + asm volatile("aor %edx,0x12345678(%rax,%rcx,8)"); + asm volatile("aor %edx,0x12345678(%r8,%rcx,8)"); + asm volatile("aor %rcx,(%rax)"); + asm volatile("aor %rdx,(%r8)"); + asm volatile("aor %rdx,(0x12345678)"); + asm volatile("aor %rdx,0x12345678(%rax,%rcx,8)"); + asm volatile("aor %rdx,0x12345678(%r8,%rcx,8)"); + + asm volatile("axor %ecx,(%rax)"); + asm volatile("axor %edx,(%r8)"); + asm volatile("axor %edx,0x12345678(%rax,%rcx,8)"); + asm volatile("axor %edx,0x12345678(%r8,%rcx,8)"); + asm volatile("axor %rcx,(%rax)"); + asm volatile("axor %rdx,(%r8)"); + asm volatile("axor %rdx,(0x12345678)"); + asm volatile("axor %rdx,0x12345678(%rax,%rcx,8)"); + asm volatile("axor %rdx,0x12345678(%r8,%rcx,8)"); + + /* VEX CMPxxXADD */ + + asm volatile("cmpbexadd %ebx,%ecx,(%r9)"); + asm volatile("cmpbxadd %ebx,%ecx,(%r9)"); + asm volatile("cmplexadd %ebx,%ecx,(%r9)"); + asm volatile("cmplxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpnbexadd %ebx,%ecx,(%r9)"); + asm volatile("cmpnbxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpnlexadd %ebx,%ecx,(%r9)"); + asm volatile("cmpnlxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpnoxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpnpxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpnsxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpnzxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpoxadd %ebx,%ecx,(%r9)"); + asm volatile("cmppxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpsxadd %ebx,%ecx,(%r9)"); + asm volatile("cmpzxadd %ebx,%ecx,(%r9)"); + + /* Pre-fetch */ + + asm volatile("prefetch (%rax)"); + asm volatile("prefetcht0 (%rax)"); + asm volatile("prefetcht1 (%rax)"); + asm volatile("prefetcht2 (%rax)"); + asm volatile("prefetchnta (%rax)"); + asm volatile("prefetchit0 0x12345678(%rip)"); + asm volatile("prefetchit1 0x12345678(%rip)"); + + /* MSR List */ + + asm volatile("rdmsrlist"); + asm volatile("wrmsrlist"); + + /* User Read/Write MSR */ + + asm volatile("urdmsr %rdx,%rax"); + asm volatile("urdmsr %rdx,%r22"); + asm volatile("urdmsr $0x7f,%r12"); + asm volatile("uwrmsr %rax,%rdx"); + asm volatile("uwrmsr %r22,%rdx"); + asm volatile("uwrmsr %r12,$0x7f"); + + /* 
AVX NE Convert */ + + asm volatile("vbcstnebf162ps (%rcx),%xmm6"); + asm volatile("vbcstnesh2ps (%rcx),%xmm6"); + asm volatile("vcvtneebf162ps (%rcx),%xmm6"); + asm volatile("vcvtneeph2ps (%rcx),%xmm6"); + asm volatile("vcvtneobf162ps (%rcx),%xmm6"); + asm volatile("vcvtneoph2ps (%rcx),%xmm6"); + asm volatile("vcvtneps2bf16 %xmm1,%xmm6"); + + /* FRED */ + + asm volatile("erets"); /* Expecting: erets indirect 0 */ + asm volatile("eretu"); /* Expecting: eretu indirect 0 */ + + /* AMX Complex */ + + asm volatile("tcmmimfp16ps %tmm1,%tmm2,%tmm3"); + asm volatile("tcmmrlfp16ps %tmm1,%tmm2,%tmm3"); + + /* AMX FP16 */ + + asm volatile("tdpfp16ps %tmm1,%tmm2,%tmm3"); + + /* REX2 */ + + asm volatile("test $0x5, %r18b"); + asm volatile("test $0x5, %r18d"); + asm volatile("test $0x5, %r18"); + asm volatile("test $0x5, %r18w"); + asm volatile("imull %eax, %r14d"); + asm volatile("imull %eax, %r17d"); + asm volatile("punpckldq (%r18), %mm2"); + asm volatile("leal (%rax), %r16d"); + asm volatile("leal (%rax), %r31d"); + asm volatile("leal (,%r16), %eax"); + asm volatile("leal (,%r31), %eax"); + asm volatile("leal (%r16), %eax"); + asm volatile("leal (%r31), %eax"); + asm volatile("leaq (%rax), %r15"); + asm volatile("leaq (%rax), %r16"); + asm volatile("leaq (%r15), %rax"); + asm volatile("leaq (%r16), %rax"); + asm volatile("leaq (,%r15), %rax"); + asm volatile("leaq (,%r16), %rax"); + asm volatile("add (%r16), %r8"); + asm volatile("add (%r16), %r15"); + asm volatile("mov (,%r9), %r16"); + asm volatile("mov (,%r14), %r16"); + asm volatile("sub (%r10), %r31"); + asm volatile("sub (%r13), %r31"); + asm volatile("leal 1(%r16, %r21), %eax"); + asm volatile("leal 1(%r16, %r26), %r31d"); + asm volatile("leal 129(%r21, %r9), %eax"); + asm volatile("leal 129(%r26, %r9), %r31d"); + /* + * Have to use .byte for jmpabs because gas does not support the + * mnemonic for some reason, but then it also gets the source line wrong + * with .byte, so the following is a workaround. 
+ */ + asm volatile(""); /* Expecting: jmp indirect 0 */ + asm volatile(".byte 0xd5, 0x00, 0xa1, 0xef, 0xcd, 0xab, 0x90, 0x78, 0x56, 0x34, 0x12"); + asm volatile("pushp %rbx"); + asm volatile("pushp %r16"); + asm volatile("pushp %r31"); + asm volatile("popp %r31"); + asm volatile("popp %r16"); + asm volatile("popp %rbx"); + + /* APX */ + + asm volatile("bextr %r25d,%edx,%r10d"); + asm volatile("bextr %r25d,0x123(%r31,%rax,4),%edx"); + asm volatile("bextr %r31,%r15,%r11"); + asm volatile("bextr %r31,0x123(%r31,%rax,4),%r15"); + asm volatile("blsi %r25d,%edx"); + asm volatile("blsi %r31,%r15"); + asm volatile("blsi 0x123(%r31,%rax,4),%r25d"); + asm volatile("blsi 0x123(%r31,%rax,4),%r31"); + asm volatile("blsmsk %r25d,%edx"); + asm volatile("blsmsk %r31,%r15"); + asm volatile("blsmsk 0x123(%r31,%rax,4),%r25d"); + asm volatile("blsmsk 0x123(%r31,%rax,4),%r31"); + asm volatile("blsr %r25d,%edx"); + asm volatile("blsr %r31,%r15"); + asm volatile("blsr 0x123(%r31,%rax,4),%r25d"); + asm volatile("blsr 0x123(%r31,%rax,4),%r31"); + asm volatile("bzhi %r25d,%edx,%r10d"); + asm volatile("bzhi %r25d,0x123(%r31,%rax,4),%edx"); + asm volatile("bzhi %r31,%r15,%r11"); + asm volatile("bzhi %r31,0x123(%r31,%rax,4),%r15"); + asm volatile("cmpbexadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpbexadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpbxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpbxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmplxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmplxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpnbexadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpnbexadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpnbxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpnbxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpnlexadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpnlexadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpnlxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpnlxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpnoxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpnoxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpnpxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpnpxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpnsxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpnsxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpnzxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpnzxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpoxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpoxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmppxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmppxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpsxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpsxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("cmpzxadd %r25d,%edx,0x123(%r31,%rax,4)"); + asm volatile("cmpzxadd %r31,%r15,0x123(%r31,%rax,4)"); + asm volatile("crc32q %r31, %r22"); + asm volatile("crc32q (%r31), %r22"); + asm volatile("crc32b %r19b, %r17"); + asm volatile("crc32b %r19b, %r21d"); + asm volatile("crc32b (%r19),%ebx"); + asm volatile("crc32l %r31d, %r23d"); + asm volatile("crc32l (%r31), %r23d"); + asm volatile("crc32w %r31w, %r21d"); + asm volatile("crc32w (%r31),%r21d"); + asm volatile("crc32 %rax, %r18"); + asm volatile("enqcmd 0x123(%r31d,%eax,4),%r25d"); + asm volatile("enqcmd 0x123(%r31,%rax,4),%r31"); + asm volatile("enqcmds 0x123(%r31d,%eax,4),%r25d"); + asm volatile("enqcmds 
0x123(%r31,%rax,4),%r31"); + asm volatile("invept 0x123(%r31,%rax,4),%r31"); + asm volatile("invpcid 0x123(%r31,%rax,4),%r31"); + asm volatile("invvpid 0x123(%r31,%rax,4),%r31"); + asm volatile("kmovb %k5,%r25d"); + asm volatile("kmovb %k5,0x123(%r31,%rax,4)"); + asm volatile("kmovb %r25d,%k5"); + asm volatile("kmovb 0x123(%r31,%rax,4),%k5"); + asm volatile("kmovd %k5,%r25d"); + asm volatile("kmovd %k5,0x123(%r31,%rax,4)"); + asm volatile("kmovd %r25d,%k5"); + asm volatile("kmovd 0x123(%r31,%rax,4),%k5"); + asm volatile("kmovq %k5,%r31"); + asm volatile("kmovq %k5,0x123(%r31,%rax,4)"); + asm volatile("kmovq %r31,%k5"); + asm volatile("kmovq 0x123(%r31,%rax,4),%k5"); + asm volatile("kmovw %k5,%r25d"); + asm volatile("kmovw %k5,0x123(%r31,%rax,4)"); + asm volatile("kmovw %r25d,%k5"); + asm volatile("kmovw 0x123(%r31,%rax,4),%k5"); + asm volatile("ldtilecfg 0x123(%r31,%rax,4)"); + asm volatile("movbe %r18w,%ax"); + asm volatile("movbe %r15w,%ax"); + asm volatile("movbe %r18w,0x123(%r16,%rax,4)"); + asm volatile("movbe %r18w,0x123(%r31,%rax,4)"); + asm volatile("movbe %r25d,%edx"); + asm volatile("movbe %r15d,%edx"); + asm volatile("movbe %r25d,0x123(%r16,%rax,4)"); + asm volatile("movbe %r31,%r15"); + asm volatile("movbe %r8,%r15"); + asm volatile("movbe %r31,0x123(%r16,%rax,4)"); + asm volatile("movbe %r31,0x123(%r31,%rax,4)"); + asm volatile("movbe 0x123(%r16,%rax,4),%r31"); + asm volatile("movbe 0x123(%r31,%rax,4),%r18w"); + asm volatile("movbe 0x123(%r31,%rax,4),%r25d"); + asm volatile("movdir64b 0x123(%r31d,%eax,4),%r25d"); + asm volatile("movdir64b 0x123(%r31,%rax,4),%r31"); + asm volatile("movdiri %r25d,0x123(%r31,%rax,4)"); + asm volatile("movdiri %r31,0x123(%r31,%rax,4)"); + asm volatile("pdep %r25d,%edx,%r10d"); + asm volatile("pdep %r31,%r15,%r11"); + asm volatile("pdep 0x123(%r31,%rax,4),%r25d,%edx"); + asm volatile("pdep 0x123(%r31,%rax,4),%r31,%r15"); + asm volatile("pext %r25d,%edx,%r10d"); + asm volatile("pext %r31,%r15,%r11"); + asm volatile("pext 0x123(%r31,%rax,4),%r25d,%edx"); + asm volatile("pext 0x123(%r31,%rax,4),%r31,%r15"); + asm volatile("shlx %r25d,%edx,%r10d"); + asm volatile("shlx %r25d,0x123(%r31,%rax,4),%edx"); + asm volatile("shlx %r31,%r15,%r11"); + asm volatile("shlx %r31,0x123(%r31,%rax,4),%r15"); + asm volatile("shrx %r25d,%edx,%r10d"); + asm volatile("shrx %r25d,0x123(%r31,%rax,4),%edx"); + asm volatile("shrx %r31,%r15,%r11"); + asm volatile("shrx %r31,0x123(%r31,%rax,4),%r15"); + asm volatile("sttilecfg 0x123(%r31,%rax,4)"); + asm volatile("tileloadd 0x123(%r31,%rax,4),%tmm6"); + asm volatile("tileloaddt1 0x123(%r31,%rax,4),%tmm6"); + asm volatile("tilestored %tmm6,0x123(%r31,%rax,4)"); + asm volatile("vbroadcastf128 (%r16),%ymm3"); + asm volatile("vbroadcasti128 (%r16),%ymm3"); + asm volatile("vextractf128 $1,%ymm3,(%r16)"); + asm volatile("vextracti128 $1,%ymm3,(%r16)"); + asm volatile("vinsertf128 $1,(%r16),%ymm3,%ymm8"); + asm volatile("vinserti128 $1,(%r16),%ymm3,%ymm8"); + asm volatile("vroundpd $1,(%r24),%xmm6"); + asm volatile("vroundps $2,(%r24),%xmm6"); + asm volatile("vroundsd $3,(%r24),%xmm6,%xmm3"); + asm volatile("vroundss $4,(%r24),%xmm6,%xmm3"); + asm volatile("wrssd %r25d,0x123(%r31,%rax,4)"); + asm volatile("wrssq %r31,0x123(%r31,%rax,4)"); + asm volatile("wrussd %r25d,0x123(%r31,%rax,4)"); + asm volatile("wrussq %r31,0x123(%r31,%rax,4)"); + + /* APX new data destination */ + + asm volatile("adc $0x1234,%ax,%r30w"); + asm volatile("adc %r15b,%r17b,%r18b"); + asm volatile("adc %r15d,(%r8),%r18d"); + asm volatile("adc 
(%r15,%rax,1),%r16b,%r8b"); + asm volatile("adc (%r15,%rax,1),%r16w,%r8w"); + asm volatile("adcl $0x11,(%r19,%rax,4),%r20d"); + asm volatile("adcx %r15d,%r8d,%r18d"); + asm volatile("adcx (%r15,%r31,1),%r8"); + asm volatile("adcx (%r15,%r31,1),%r8d,%r18d"); + asm volatile("add $0x1234,%ax,%r30w"); + asm volatile("add $0x12344433,%r15,%r16"); + asm volatile("add $0x34,%r13b,%r17b"); + asm volatile("add $0xfffffffff4332211,%rax,%r8"); + asm volatile("add %r31,%r8,%r16"); + asm volatile("add %r31,(%r8),%r16"); + asm volatile("add %r31,(%r8,%r16,8),%r16"); + asm volatile("add %r31b,%r8b,%r16b"); + asm volatile("add %r31d,%r8d,%r16d"); + asm volatile("add %r31w,%r8w,%r16w"); + asm volatile("add (%r31),%r8,%r16"); + asm volatile("add 0x9090(%r31,%r16,1),%r8,%r16"); + asm volatile("addb %r31b,%r8b,%r16b"); + asm volatile("addl %r31d,%r8d,%r16d"); + asm volatile("addl $0x11,(%r19,%rax,4),%r20d"); + asm volatile("addq %r31,%r8,%r16"); + asm volatile("addq $0x12344433,(%r15,%rcx,4),%r16"); + asm volatile("addw %r31w,%r8w,%r16w"); + asm volatile("adox %r15d,%r8d,%r18d"); + asm volatile("{load} add %r31,%r8,%r16"); + asm volatile("{store} add %r31,%r8,%r16"); + asm volatile("adox (%r15,%r31,1),%r8"); + asm volatile("adox (%r15,%r31,1),%r8d,%r18d"); + asm volatile("and $0x1234,%ax,%r30w"); + asm volatile("and %r15b,%r17b,%r18b"); + asm volatile("and %r15d,(%r8),%r18d"); + asm volatile("and (%r15,%rax,1),%r16b,%r8b"); + asm volatile("and (%r15,%rax,1),%r16w,%r8w"); + asm volatile("andl $0x11,(%r19,%rax,4),%r20d"); + asm volatile("cmova 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovae 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovb 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovbe 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmove 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovg 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovge 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovl 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovle 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovne 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovno 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovnp 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovns 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovo 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovp 0x90909090(%eax),%edx,%r8d"); + asm volatile("cmovs 0x90909090(%eax),%edx,%r8d"); + asm volatile("dec %rax,%r17"); + asm volatile("decb (%r31,%r12,1),%r8b"); + asm volatile("imul 0x909(%rax,%r31,8),%rdx,%r25"); + asm volatile("imul 0x90909(%eax),%edx,%r8d"); + asm volatile("inc %r31,%r16"); + asm volatile("inc %r31,%r8"); + asm volatile("inc %rax,%rbx"); + asm volatile("neg %rax,%r17"); + asm volatile("negb (%r31,%r12,1),%r8b"); + asm volatile("not %rax,%r17"); + asm volatile("notb (%r31,%r12,1),%r8b"); + asm volatile("or $0x1234,%ax,%r30w"); + asm volatile("or %r15b,%r17b,%r18b"); + asm volatile("or %r15d,(%r8),%r18d"); + asm volatile("or (%r15,%rax,1),%r16b,%r8b"); + asm volatile("or (%r15,%rax,1),%r16w,%r8w"); + asm volatile("orl $0x11,(%r19,%rax,4),%r20d"); + asm volatile("rcl $0x2,%r12b,%r31b"); + asm volatile("rcl %cl,%r16b,%r8b"); + asm volatile("rclb $0x1,(%rax),%r31b"); + asm volatile("rcll $0x2,(%rax),%r31d"); + asm volatile("rclw $0x1,(%rax),%r31w"); + asm volatile("rclw %cl,(%r19,%rax,4),%r31w"); + asm volatile("rcr $0x2,%r12b,%r31b"); + asm volatile("rcr %cl,%r16b,%r8b"); + asm volatile("rcrb $0x1,(%rax),%r31b"); + asm volatile("rcrl $0x2,(%rax),%r31d"); + asm volatile("rcrw $0x1,(%rax),%r31w"); + asm volatile("rcrw 
%cl,(%r19,%rax,4),%r31w"); + asm volatile("rol $0x2,%r12b,%r31b"); + asm volatile("rol %cl,%r16b,%r8b"); + asm volatile("rolb $0x1,(%rax),%r31b"); + asm volatile("roll $0x2,(%rax),%r31d"); + asm volatile("rolw $0x1,(%rax),%r31w"); + asm volatile("rolw %cl,(%r19,%rax,4),%r31w"); + asm volatile("ror $0x2,%r12b,%r31b"); + asm volatile("ror %cl,%r16b,%r8b"); + asm volatile("rorb $0x1,(%rax),%r31b"); + asm volatile("rorl $0x2,(%rax),%r31d"); + asm volatile("rorw $0x1,(%rax),%r31w"); + asm volatile("rorw %cl,(%r19,%rax,4),%r31w"); + asm volatile("sar $0x2,%r12b,%r31b"); + asm volatile("sar %cl,%r16b,%r8b"); + asm volatile("sarb $0x1,(%rax),%r31b"); + asm volatile("sarl $0x2,(%rax),%r31d"); + asm volatile("sarw $0x1,(%rax),%r31w"); + asm volatile("sarw %cl,(%r19,%rax,4),%r31w"); + asm volatile("sbb $0x1234,%ax,%r30w"); + asm volatile("sbb %r15b,%r17b,%r18b"); + asm volatile("sbb %r15d,(%r8),%r18d"); + asm volatile("sbb (%r15,%rax,1),%r16b,%r8b"); + asm volatile("sbb (%r15,%rax,1),%r16w,%r8w"); + asm volatile("sbbl $0x11,(%r19,%rax,4),%r20d"); + asm volatile("shl $0x2,%r12b,%r31b"); + asm volatile("shl $0x2,%r12b,%r31b"); + asm volatile("shl %cl,%r16b,%r8b"); + asm volatile("shl %cl,%r16b,%r8b"); + asm volatile("shlb $0x1,(%rax),%r31b"); + asm volatile("shlb $0x1,(%rax),%r31b"); + asm volatile("shld $0x1,%r12,(%rax),%r31"); + asm volatile("shld $0x2,%r15d,(%rax),%r31d"); + asm volatile("shld $0x2,%r8w,%r12w,%r31w"); + asm volatile("shld %cl,%r12,%r16,%r8"); + asm volatile("shld %cl,%r13w,(%r19,%rax,4),%r31w"); + asm volatile("shld %cl,%r9w,(%rax),%r31w"); + asm volatile("shll $0x2,(%rax),%r31d"); + asm volatile("shll $0x2,(%rax),%r31d"); + asm volatile("shlw $0x1,(%rax),%r31w"); + asm volatile("shlw $0x1,(%rax),%r31w"); + asm volatile("shlw %cl,(%r19,%rax,4),%r31w"); + asm volatile("shlw %cl,(%r19,%rax,4),%r31w"); + asm volatile("shr $0x2,%r12b,%r31b"); + asm volatile("shr %cl,%r16b,%r8b"); + asm volatile("shrb $0x1,(%rax),%r31b"); + asm volatile("shrd $0x1,%r12,(%rax),%r31"); + asm volatile("shrd $0x2,%r15d,(%rax),%r31d"); + asm volatile("shrd $0x2,%r8w,%r12w,%r31w"); + asm volatile("shrd %cl,%r12,%r16,%r8"); + asm volatile("shrd %cl,%r13w,(%r19,%rax,4),%r31w"); + asm volatile("shrd %cl,%r9w,(%rax),%r31w"); + asm volatile("shrl $0x2,(%rax),%r31d"); + asm volatile("shrw $0x1,(%rax),%r31w"); + asm volatile("shrw %cl,(%r19,%rax,4),%r31w"); + asm volatile("sub $0x1234,%ax,%r30w"); + asm volatile("sub %r15b,%r17b,%r18b"); + asm volatile("sub %r15d,(%r8),%r18d"); + asm volatile("sub (%r15,%rax,1),%r16b,%r8b"); + asm volatile("sub (%r15,%rax,1),%r16w,%r8w"); + asm volatile("subl $0x11,(%r19,%rax,4),%r20d"); + asm volatile("xor $0x1234,%ax,%r30w"); + asm volatile("xor %r15b,%r17b,%r18b"); + asm volatile("xor %r15d,(%r8),%r18d"); + asm volatile("xor (%r15,%rax,1),%r16b,%r8b"); + asm volatile("xor (%r15,%rax,1),%r16w,%r8w"); + asm volatile("xorl $0x11,(%r19,%rax,4),%r20d"); + + /* APX suppress status flags */ + + asm volatile("{nf} add %bl,%dl,%r8b"); + asm volatile("{nf} add %dx,%ax,%r9w"); + asm volatile("{nf} add 0x123(%r8,%rax,4),%bl,%dl"); + asm volatile("{nf} add 0x123(%r8,%rax,4),%dx,%ax"); + asm volatile("{nf} or %bl,%dl,%r8b"); + asm volatile("{nf} or %dx,%ax,%r9w"); + asm volatile("{nf} or 0x123(%r8,%rax,4),%bl,%dl"); + asm volatile("{nf} or 0x123(%r8,%rax,4),%dx,%ax"); + asm volatile("{nf} and %bl,%dl,%r8b"); + asm volatile("{nf} and %dx,%ax,%r9w"); + asm volatile("{nf} and 0x123(%r8,%rax,4),%bl,%dl"); + asm volatile("{nf} and 0x123(%r8,%rax,4),%dx,%ax"); + asm volatile("{nf} shld 
$0x7b,%dx,%ax,%r9w"); + asm volatile("{nf} sub %bl,%dl,%r8b"); + asm volatile("{nf} sub %dx,%ax,%r9w"); + asm volatile("{nf} sub 0x123(%r8,%rax,4),%bl,%dl"); + asm volatile("{nf} sub 0x123(%r8,%rax,4),%dx,%ax"); + asm volatile("{nf} shrd $0x7b,%dx,%ax,%r9w"); + asm volatile("{nf} xor %bl,%dl,%r8b"); + asm volatile("{nf} xor %r31,%r31"); + asm volatile("{nf} xor 0x123(%r8,%rax,4),%bl,%dl"); + asm volatile("{nf} xor 0x123(%r8,%rax,4),%dx,%ax"); + asm volatile("{nf} imul $0xff90,%r9,%r15"); + asm volatile("{nf} imul $0x7b,%r9,%r15"); + asm volatile("{nf} xor $0x7b,%bl,%dl"); + asm volatile("{nf} xor $0x7b,%dx,%ax"); + asm volatile("{nf} popcnt %r9,%r31"); + asm volatile("{nf} shld %cl,%dx,%ax,%r9w"); + asm volatile("{nf} shrd %cl,%dx,%ax,%r9w"); + asm volatile("{nf} imul %r9,%r31,%r11"); + asm volatile("{nf} sar $0x7b,%bl,%dl"); + asm volatile("{nf} sar $0x7b,%dx,%ax"); + asm volatile("{nf} sar $1,%bl,%dl"); + asm volatile("{nf} sar $1,%dx,%ax"); + asm volatile("{nf} sar %cl,%bl,%dl"); + asm volatile("{nf} sar %cl,%dx,%ax"); + asm volatile("{nf} andn %r9,%r31,%r11"); + asm volatile("{nf} blsi %r9,%r31"); + asm volatile("{nf} tzcnt %r9,%r31"); + asm volatile("{nf} lzcnt %r9,%r31"); + asm volatile("{nf} idiv %bl"); + asm volatile("{nf} idiv %dx"); + asm volatile("{nf} dec %bl,%dl"); + asm volatile("{nf} dec %dx,%ax"); + #else /* #ifdef __x86_64__ */ /* bound r32, mem (same op code as EVEX prefix) */ @@ -4848,6 +5354,97 @@ int main(void) #endif /* #ifndef __x86_64__ */ + /* Key Locker */ + + asm volatile(" loadiwkey %xmm1, %xmm2"); + asm volatile(" encodekey128 %eax, %edx"); + asm volatile(" encodekey256 %eax, %edx"); + asm volatile(" aesenc128kl 0x77(%edx), %xmm3"); + asm volatile(" aesenc256kl 0x77(%edx), %xmm3"); + asm volatile(" aesdec128kl 0x77(%edx), %xmm3"); + asm volatile(" aesdec256kl 0x77(%edx), %xmm3"); + asm volatile(" aesencwide128kl 0x77(%edx)"); + asm volatile(" aesencwide256kl 0x77(%edx)"); + asm volatile(" aesdecwide128kl 0x77(%edx)"); + asm volatile(" aesdecwide256kl 0x77(%edx)"); + + /* Remote Atomic Operations */ + + asm volatile("aadd %ecx,(%eax)"); + asm volatile("aadd %edx,(0x12345678)"); + asm volatile("aadd %edx,0x12345678(%eax,%ecx,8)"); + + asm volatile("aand %ecx,(%eax)"); + asm volatile("aand %edx,(0x12345678)"); + asm volatile("aand %edx,0x12345678(%eax,%ecx,8)"); + + asm volatile("aor %ecx,(%eax)"); + asm volatile("aor %edx,(0x12345678)"); + asm volatile("aor %edx,0x12345678(%eax,%ecx,8)"); + + asm volatile("axor %ecx,(%eax)"); + asm volatile("axor %edx,(0x12345678)"); + asm volatile("axor %edx,0x12345678(%eax,%ecx,8)"); + + /* AVX NE Convert */ + + asm volatile("vbcstnebf162ps (%ecx),%xmm6"); + asm volatile("vbcstnesh2ps (%ecx),%xmm6"); + asm volatile("vcvtneebf162ps (%ecx),%xmm6"); + asm volatile("vcvtneeph2ps (%ecx),%xmm6"); + asm volatile("vcvtneobf162ps (%ecx),%xmm6"); + asm volatile("vcvtneoph2ps (%ecx),%xmm6"); + asm volatile("vcvtneps2bf16 %xmm1,%xmm6"); + + /* AVX VNNI INT16 */ + + asm volatile("vpdpbssd %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpbssds %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpbsud %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpbsuds %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpbuud %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpbuuds %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpwsud %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpwsuds %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpwusd %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpwusds %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpwuud %xmm1,%xmm2,%xmm3"); + asm volatile("vpdpwuuds %xmm1,%xmm2,%xmm3"); + + /* AVX IFMA */ + + asm 
volatile("vpmadd52huq %xmm1,%xmm2,%xmm3"); + asm volatile("vpmadd52luq %xmm1,%xmm2,%xmm3"); + + /* AVX SHA512 */ + + asm volatile("vsha512msg1 %xmm1,%ymm2"); + asm volatile("vsha512msg2 %ymm1,%ymm2"); + asm volatile("vsha512rnds2 %xmm1,%ymm2,%ymm3"); + + /* AVX SM3 */ + + asm volatile("vsm3msg1 %xmm1,%xmm2,%xmm3"); + asm volatile("vsm3msg2 %xmm1,%xmm2,%xmm3"); + asm volatile("vsm3rnds2 $0xa1,%xmm1,%xmm2,%xmm3"); + + /* AVX SM4 */ + + asm volatile("vsm4key4 %xmm1,%xmm2,%xmm3"); + asm volatile("vsm4rnds4 %xmm1,%xmm2,%xmm3"); + + /* Pre-fetch */ + + asm volatile("prefetch (%eax)"); + asm volatile("prefetcht0 (%eax)"); + asm volatile("prefetcht1 (%eax)"); + asm volatile("prefetcht2 (%eax)"); + asm volatile("prefetchnta (%eax)"); + + /* Non-serializing write MSR */ + + asm volatile("wrmsrns"); + /* Prediction history reset */ asm volatile("hreset $0"); -- 2.34.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: (subset) [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions 2024-05-02 10:58 [PATCH 00/10] perf intel pt: Update instruction decoder for APX and other new instructions Adrian Hunter ` (9 preceding siblings ...) 2024-05-02 10:58 ` [PATCH 10/10] perf tests: Add APX and other new instructions to x86 instruction decoder test Adrian Hunter @ 2024-06-26 3:59 ` Namhyung Kim 10 siblings, 0 replies; 18+ messages in thread From: Namhyung Kim @ 2024-06-26 3:59 UTC (permalink / raw) To: linux-kernel, Adrian Hunter Cc: Chang S. Bae, Masami Hiramatsu, Nikolay Borisov, Borislav Petkov, Ingo Molnar, H. Peter Anvin, Dave Hansen, Thomas Gleixner, x86, Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers, linux-perf-users On Thu, 02 May 2024 13:58:43 +0300, Adrian Hunter wrote: > The x86 instruction decoder is used not only for decoding kernel > instructions. It is also used by perf uprobes (user space probes) and by > perf tools Intel Processor Trace decoding. Consequently, it needs to > support instructions executed by user space also. > > It should be noted that there are 2 copies of the instruction decoder. > One for the kernel and one for tools, which is a policy to prevent > changes from breaking builds. > > [...] Applied patch 9 and 10 to perf-tools-next, thanks! Best regards, Namhyung ^ permalink raw reply [flat|nested] 18+ messages in thread
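(For readers who want to exercise the new test entries locally: the decoder test updated by patch 10 can be run from a perf build roughly as follows. This is only a sketch; the exact test description may differ between perf versions, so confirm it with 'perf test list' first.)

  # Build perf from the kernel source tree
  $ cd tools/perf && make

  # Confirm the test name on this build (the grep pattern is an assumption)
  $ ./perf test list 2>&1 | grep -i 'instruction decoder'

  # Run the matching test(s) verbosely; adjust the substring to whatever
  # the listing above shows
  $ ./perf test -v 'instruction decoder'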