qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions
@ 2013-03-31 11:02 Aurelien Jarno
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 1/7] disas/i386.c: disassemble pclmulqdq instruction Aurelien Jarno
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Aurelien Jarno @ 2013-03-31 11:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

This patch series adds the PCLMULQDQ and AES-NI instructions to the x86
emulation. Along with the SSE4.1 and SSE4.2 series, this brings the
instructions emulation to the level of a Westmere CPU.

It has been tested with the valgrind testsuite and with the kernel
autotest.

Changes v1 -> v2:
- Patch 3: Declare all constant tables as static.

Changes v2 -> v3:
- Use constant tables from aes.c.
- Fix AES instructions when source and destination registers are the
  same.

Changes v3 -> v4:
- Update dissassembler code to support these instructions.

Aurelien Jarno (7):
  disas/i386.c: disassemble pclmulqdq instruction
  target-i386: add pclmulqdq instruction
  target-i386: enable PCLMULQDQ on Westmere CPU
  disas/i386.c: disassemble aes-ni instructions
  aes: move aes.h from include/block to include/qemu
  aes: make Td[0-5] and Te[0-5] tables non static
  target-i386: add AES-NI instructions

 block/qcow.c                 |    2 +-
 block/qcow2.c                |    2 +-
 block/qcow2.h                |    2 +-
 disas/i386.c                 |   84 ++++++-
 include/block/aes.h          |   26 ---
 include/qemu/aes.h           |   45 ++++
 target-i386/cpu.c            |   19 +-
 target-i386/fpu_helper.c     |    1 +
 target-i386/ops_sse.h        |  111 +++++++++
 target-i386/ops_sse_header.h |   11 +
 target-i386/translate.c      |   10 +
 util/aes.c                   |  506 +++++++++++++++++++++---------------------
 12 files changed, 517 insertions(+), 302 deletions(-)
 delete mode 100644 include/block/aes.h
 create mode 100644 include/qemu/aes.h

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 1/7] disas/i386.c: disassemble pclmulqdq instruction
  2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
@ 2013-03-31 11:02 ` Aurelien Jarno
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 2/7] target-i386: add " Aurelien Jarno
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2013-03-31 11:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 disas/i386.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/disas/i386.c b/disas/i386.c
index 73cc06f..c52efbc 100644
--- a/disas/i386.c
+++ b/disas/i386.c
@@ -664,6 +664,7 @@ fetch_data(struct disassemble_info *info, bfd_byte *addr)
 #define PREGRP95  NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 95 } }
 #define PREGRP96  NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 96 } }
 #define PREGRP97  NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 97 } }
+#define PREGRP98  NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 98 } }
 
 
 #define X86_64_0  NULL, { { NULL, X86_64_SPECIAL }, { NULL, 0 } }
@@ -1503,7 +1504,7 @@ static const unsigned char threebyte_0x3a_uses_DATA_prefix[256] = {
   /* 10 */ 0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,0, /* 1f */
   /* 20 */ 1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0, /* 2f */
   /* 30 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* 3f */
-  /* 40 */ 1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0, /* 4f */
+  /* 40 */ 1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0, /* 4f */
   /* 50 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* 5f */
   /* 60 */ 1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0, /* 6f */
   /* 70 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* 7f */
@@ -2710,6 +2711,14 @@ static const struct dis386 prefix_user_table[][4] = {
     { "punpckldq",{ MX, EMq } },
     { "(bad)",	{ XX } },
   },
+
+  /* PREGRP98 */
+  {
+    { "(bad)",	{ XX } },
+    { "(bad)",	{ XX } },
+    { "pclmulqdq", { XM, EXx, Ib } },
+    { "(bad)",	{ XX } },
+  },
 };
 
 static const struct dis386 x86_64_table[][2] = {
@@ -3102,7 +3111,7 @@ static const struct dis386 three_byte_table[][256] = {
     { PREGRP84 },
     { PREGRP85 },
     { "(bad)", { XX } },
-    { "(bad)", { XX } },
+    { PREGRP98 },
     { "(bad)", { XX } },
     { "(bad)", { XX } },
     { "(bad)", { XX } },
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 2/7] target-i386: add pclmulqdq instruction
  2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 1/7] disas/i386.c: disassemble pclmulqdq instruction Aurelien Jarno
@ 2013-03-31 11:02 ` Aurelien Jarno
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 3/7] target-i386: enable PCLMULQDQ on Westmere CPU Aurelien Jarno
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2013-03-31 11:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/cpu.c            |   19 +++++++++----------
 target-i386/ops_sse.h        |   24 ++++++++++++++++++++++++
 target-i386/ops_sse_header.h |    5 +++++
 target-i386/translate.c      |    3 +++
 4 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 4b43759..41382c5 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -387,17 +387,16 @@ typedef struct x86_def_t {
           CPUID_PSE36 (needed for Solaris) */
           /* missing:
           CPUID_VME, CPUID_DTS, CPUID_SS, CPUID_HT, CPUID_TM, CPUID_PBE */
-#define TCG_EXT_FEATURES (CPUID_EXT_SSE3 | CPUID_EXT_MONITOR | \
-          CPUID_EXT_SSSE3 | CPUID_EXT_CX16 | CPUID_EXT_SSE41 | \
-          CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | CPUID_EXT_MOVBE | \
-          CPUID_EXT_HYPERVISOR)
+#define TCG_EXT_FEATURES (CPUID_EXT_SSE3 | CPUID_EXT_PCLMULQDQ | \
+          CPUID_EXT_MONITOR | CPUID_EXT_SSSE3 | CPUID_EXT_CX16 | \
+          CPUID_EXT_SSE41 | CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | \
+          CPUID_EXT_MOVBE | CPUID_EXT_HYPERVISOR)
           /* missing:
-          CPUID_EXT_PCLMULQDQ, CPUID_EXT_DTES64, CPUID_EXT_DSCPL,
-          CPUID_EXT_VMX, CPUID_EXT_SMX, CPUID_EXT_EST, CPUID_EXT_TM2,
-          CPUID_EXT_CID, CPUID_EXT_FMA, CPUID_EXT_XTPR, CPUID_EXT_PDCM,
-          CPUID_EXT_PCID, CPUID_EXT_DCA, CPUID_EXT_X2APIC,
-          CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AES, CPUID_EXT_XSAVE,
-          CPUID_EXT_OSXSAVE, CPUID_EXT_AVX, CPUID_EXT_F16C,
+          CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_VMX, CPUID_EXT_SMX,
+          CPUID_EXT_EST, CPUID_EXT_TM2, CPUID_EXT_CID, CPUID_EXT_FMA,
+          CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_PCID, CPUID_EXT_DCA,
+          CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AES,
+          CPUID_EXT_XSAVE, CPUID_EXT_OSXSAVE, CPUID_EXT_AVX, CPUID_EXT_F16C,
           CPUID_EXT_RDRAND */
 #define TCG_EXT2_FEATURES ((TCG_FEATURES & CPUID_EXT2_AMD_ALIASES) | \
           CPUID_EXT2_NX | CPUID_EXT2_MMXEXT | CPUID_EXT2_RDTSCP | \
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index a11dba1..2ee5b8d 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2179,6 +2179,30 @@ target_ulong helper_popcnt(CPUX86State *env, target_ulong n, uint32_t type)
     return POPCOUNT(n, 5);
 #endif
 }
+
+void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+                                    uint32_t ctrl)
+{
+    uint64_t ah, al, b, resh, resl;
+
+    ah = 0;
+    al = d->Q((ctrl & 1) != 0);
+    b = s->Q((ctrl & 16) != 0);
+    resh = resl = 0;
+
+    while (b) {
+        if (b & 1) {
+            resl ^= al;
+            resh ^= ah;
+        }
+        ah = (ah << 1) | (al >> 63);
+        al <<= 1;
+        b >>= 1;
+    }
+
+    d->Q(0) = resl;
+    d->Q(1) = resh;
+}
 #endif
 
 #undef SHIFT
diff --git a/target-i386/ops_sse_header.h b/target-i386/ops_sse_header.h
index 401eac6..2842233 100644
--- a/target-i386/ops_sse_header.h
+++ b/target-i386/ops_sse_header.h
@@ -336,6 +336,11 @@ DEF_HELPER_3(crc32, tl, i32, tl, i32)
 DEF_HELPER_3(popcnt, tl, env, tl, i32)
 #endif
 
+/* AES-NI op helpers */
+#if SHIFT == 1
+DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32)
+#endif
+
 #undef SHIFT
 #undef Reg
 #undef SUFFIX
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 7596a90..d649e99 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -3147,6 +3147,8 @@ struct SSEOpHelper_eppi {
 #define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 }
 #define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 }
 #define SSE41_SPECIAL { { NULL, SSE_SPECIAL }, CPUID_EXT_SSE41 }
+#define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \
+        CPUID_EXT_PCLMULQDQ }
 
 static const struct SSEOpHelper_epp sse_op_table6[256] = {
     [0x00] = SSSE3_OP(pshufb),
@@ -3216,6 +3218,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = {
     [0x40] = SSE41_OP(dpps),
     [0x41] = SSE41_OP(dppd),
     [0x42] = SSE41_OP(mpsadbw),
+    [0x44] = PCLMULQDQ_OP(pclmulqdq),
     [0x60] = SSE42_OP(pcmpestrm),
     [0x61] = SSE42_OP(pcmpestri),
     [0x62] = SSE42_OP(pcmpistrm),
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 3/7] target-i386: enable PCLMULQDQ on Westmere CPU
  2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 1/7] disas/i386.c: disassemble pclmulqdq instruction Aurelien Jarno
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 2/7] target-i386: add " Aurelien Jarno
@ 2013-03-31 11:02 ` Aurelien Jarno
  2013-08-07 22:12   ` Eduardo Habkost
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 4/7] disas/i386.c: disassemble aes-ni instructions Aurelien Jarno
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: Aurelien Jarno @ 2013-03-31 11:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

The PCLMULQDQ instruction has been introduced on the Westmere CPU.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-i386/cpu.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 41382c5..5941d40 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -687,7 +687,7 @@ static x86_def_t builtin_x86_defs[] = {
              CPUID_DE | CPUID_FP87,
         .ext_features = CPUID_EXT_AES | CPUID_EXT_POPCNT | CPUID_EXT_SSE42 |
              CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
-             CPUID_EXT_SSE3,
+             CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3,
         .ext2_features = CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
         .ext3_features = CPUID_EXT3_LAHF_LM,
         .xlevel = 0x8000000A,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 4/7] disas/i386.c: disassemble aes-ni instructions
  2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
                   ` (2 preceding siblings ...)
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 3/7] target-i386: enable PCLMULQDQ on Westmere CPU Aurelien Jarno
@ 2013-03-31 11:02 ` Aurelien Jarno
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 5/7] aes: move aes.h from include/block to include/qemu Aurelien Jarno
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2013-03-31 11:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 disas/i386.c |   67 ++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 61 insertions(+), 6 deletions(-)

diff --git a/disas/i386.c b/disas/i386.c
index c52efbc..04c033c 100644
--- a/disas/i386.c
+++ b/disas/i386.c
@@ -665,6 +665,12 @@ fetch_data(struct disassemble_info *info, bfd_byte *addr)
 #define PREGRP96  NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 96 } }
 #define PREGRP97  NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 97 } }
 #define PREGRP98  NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 98 } }
+#define PREGRP99  NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 99 } }
+#define PREGRP100 NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 100 } }
+#define PREGRP101 NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 101 } }
+#define PREGRP102 NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 102 } }
+#define PREGRP103 NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 103 } }
+#define PREGRP104 NULL, { { NULL, USE_PREFIX_USER_TABLE }, { NULL, 104 } }
 
 
 #define X86_64_0  NULL, { { NULL, X86_64_SPECIAL }, { NULL, 0 } }
@@ -2719,6 +2725,55 @@ static const struct dis386 prefix_user_table[][4] = {
     { "pclmulqdq", { XM, EXx, Ib } },
     { "(bad)",	{ XX } },
   },
+
+  /* PREGRP99 */
+  {
+    { "(bad)",	{ XX } },
+    { "(bad)",	{ XX } },
+    { "aesimc", { XM, EXx } },
+    { "(bad)",	{ XX } },
+  },
+
+  /* PREGRP100 */
+  {
+    { "(bad)",	{ XX } },
+    { "(bad)",	{ XX } },
+    { "aesenc", { XM, EXx } },
+    { "(bad)",	{ XX } },
+  },
+
+  /* PREGRP101 */
+  {
+    { "(bad)",	{ XX } },
+    { "(bad)",	{ XX } },
+    { "aesenclast", { XM, EXx } },
+    { "(bad)",	{ XX } },
+  },
+
+  /* PREGRP102 */
+  {
+    { "(bad)",	{ XX } },
+    { "(bad)",	{ XX } },
+    { "aesdec", { XM, EXx } },
+    { "(bad)",	{ XX } },
+  },
+
+  /* PREGRP103 */
+  {
+    { "(bad)",	{ XX } },
+    { "(bad)",	{ XX } },
+    { "aesdeclast", { XM, EXx } },
+    { "(bad)",	{ XX } },
+  },
+
+  /* PREGRP104 */
+  {
+    { "(bad)",	{ XX } },
+    { "(bad)",	{ XX } },
+    { "aeskeygenassist", { XM, EXx, Ib } },
+    { "(bad)",	{ XX } },
+  },
+
 };
 
 static const struct dis386 x86_64_table[][2] = {
@@ -2990,11 +3045,11 @@ static const struct dis386 three_byte_table[][256] = {
     { "(bad)", { XX } },
     { "(bad)", { XX } },
     { "(bad)", { XX } },
-    { "(bad)", { XX } },
-    { "(bad)", { XX } },
-    { "(bad)", { XX } },
-    { "(bad)", { XX } },
-    { "(bad)", { XX } },
+    { PREGRP99 },
+    { PREGRP100 },
+    { PREGRP101 },
+    { PREGRP102 },
+    { PREGRP103 },
     /* e0 */
     { "(bad)", { XX } },
     { "(bad)", { XX } },
@@ -3285,7 +3340,7 @@ static const struct dis386 three_byte_table[][256] = {
     { "(bad)", { XX } },
     { "(bad)", { XX } },
     { "(bad)", { XX } },
-    { "(bad)", { XX } },
+    { PREGRP104 },
     /* e0 */
     { "(bad)", { XX } },
     { "(bad)", { XX } },
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 5/7] aes: move aes.h from include/block to include/qemu
  2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
                   ` (3 preceding siblings ...)
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 4/7] disas/i386.c: disassemble aes-ni instructions Aurelien Jarno
@ 2013-03-31 11:02 ` Aurelien Jarno
  2013-04-05 12:34   ` Stefan Hajnoczi
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 6/7] aes: make Td[0-5] and Te[0-5] tables non static Aurelien Jarno
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: Aurelien Jarno @ 2013-03-31 11:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Aurelien Jarno

Move aes.h from include/block to include/qemu to show it can be reused
by other subsystems.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 block/qcow.c        |    2 +-
 block/qcow2.c       |    2 +-
 block/qcow2.h       |    2 +-
 include/block/aes.h |   26 --------------------------
 include/qemu/aes.h  |   26 ++++++++++++++++++++++++++
 util/aes.c          |    2 +-
 6 files changed, 30 insertions(+), 30 deletions(-)
 delete mode 100644 include/block/aes.h
 create mode 100644 include/qemu/aes.h

diff --git a/block/qcow.c b/block/qcow.c
index 13d396b..3278e55 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -25,7 +25,7 @@
 #include "block/block_int.h"
 #include "qemu/module.h"
 #include <zlib.h>
-#include "block/aes.h"
+#include "qemu/aes.h"
 #include "migration/migration.h"
 
 /**************************************************************/
diff --git a/block/qcow2.c b/block/qcow2.c
index 7e7d775..1d18073 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -25,7 +25,7 @@
 #include "block/block_int.h"
 #include "qemu/module.h"
 #include <zlib.h>
-#include "block/aes.h"
+#include "qemu/aes.h"
 #include "block/qcow2.h"
 #include "qemu/error-report.h"
 #include "qapi/qmp/qerror.h"
diff --git a/block/qcow2.h b/block/qcow2.h
index bf8db2a..9421843 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -25,7 +25,7 @@
 #ifndef BLOCK_QCOW2_H
 #define BLOCK_QCOW2_H
 
-#include "block/aes.h"
+#include "qemu/aes.h"
 #include "block/coroutine.h"
 
 //#define DEBUG_ALLOC
diff --git a/include/block/aes.h b/include/block/aes.h
deleted file mode 100644
index a0167eb..0000000
--- a/include/block/aes.h
+++ /dev/null
@@ -1,26 +0,0 @@
-#ifndef QEMU_AES_H
-#define QEMU_AES_H
-
-#define AES_MAXNR 14
-#define AES_BLOCK_SIZE 16
-
-struct aes_key_st {
-    uint32_t rd_key[4 *(AES_MAXNR + 1)];
-    int rounds;
-};
-typedef struct aes_key_st AES_KEY;
-
-int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
-	AES_KEY *key);
-int AES_set_decrypt_key(const unsigned char *userKey, const int bits,
-	AES_KEY *key);
-
-void AES_encrypt(const unsigned char *in, unsigned char *out,
-	const AES_KEY *key);
-void AES_decrypt(const unsigned char *in, unsigned char *out,
-	const AES_KEY *key);
-void AES_cbc_encrypt(const unsigned char *in, unsigned char *out,
-		     const unsigned long length, const AES_KEY *key,
-		     unsigned char *ivec, const int enc);
-
-#endif
diff --git a/include/qemu/aes.h b/include/qemu/aes.h
new file mode 100644
index 0000000..a0167eb
--- /dev/null
+++ b/include/qemu/aes.h
@@ -0,0 +1,26 @@
+#ifndef QEMU_AES_H
+#define QEMU_AES_H
+
+#define AES_MAXNR 14
+#define AES_BLOCK_SIZE 16
+
+struct aes_key_st {
+    uint32_t rd_key[4 *(AES_MAXNR + 1)];
+    int rounds;
+};
+typedef struct aes_key_st AES_KEY;
+
+int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
+	AES_KEY *key);
+int AES_set_decrypt_key(const unsigned char *userKey, const int bits,
+	AES_KEY *key);
+
+void AES_encrypt(const unsigned char *in, unsigned char *out,
+	const AES_KEY *key);
+void AES_decrypt(const unsigned char *in, unsigned char *out,
+	const AES_KEY *key);
+void AES_cbc_encrypt(const unsigned char *in, unsigned char *out,
+		     const unsigned long length, const AES_KEY *key,
+		     unsigned char *ivec, const int enc);
+
+#endif
diff --git a/util/aes.c b/util/aes.c
index 1da7bff..54c7b98 100644
--- a/util/aes.c
+++ b/util/aes.c
@@ -28,7 +28,7 @@
  * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 #include "qemu-common.h"
-#include "block/aes.h"
+#include "qemu/aes.h"
 
 #ifndef NDEBUG
 #define NDEBUG
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 6/7] aes: make Td[0-5] and Te[0-5] tables non static
  2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
                   ` (4 preceding siblings ...)
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 5/7] aes: move aes.h from include/block to include/qemu Aurelien Jarno
@ 2013-03-31 11:02 ` Aurelien Jarno
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 7/7] target-i386: add AES-NI instructions Aurelien Jarno
  2013-03-31 23:26 ` [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and " Richard Henderson
  7 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2013-03-31 11:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Remove static attribute to Td[0-5] and Te[0-5] tables so that they
can be used outside of aes.c. Change their type from u32 to uint32_t,
to keep the u32 udef local to aes.c. Prefix them with AES_ so that they
do not conflict with other symbols.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 include/qemu/aes.h |   19 ++
 util/aes.c         |  504 ++++++++++++++++++++++++++--------------------------
 2 files changed, 271 insertions(+), 252 deletions(-)

Note: while renaming Td/Te tables, I replaced tabs with spaces, but kept
the remaining code style to keep the unrolled loops readable.

diff --git a/include/qemu/aes.h b/include/qemu/aes.h
index a0167eb..e79c707 100644
--- a/include/qemu/aes.h
+++ b/include/qemu/aes.h
@@ -23,4 +23,23 @@ void AES_cbc_encrypt(const unsigned char *in, unsigned char *out,
 		     const unsigned long length, const AES_KEY *key,
 		     unsigned char *ivec, const int enc);
 
+/*
+AES_Te0[x] = S [x].[02, 01, 01, 03];
+AES_Te1[x] = S [x].[03, 02, 01, 01];
+AES_Te2[x] = S [x].[01, 03, 02, 01];
+AES_Te3[x] = S [x].[01, 01, 03, 02];
+AES_Te4[x] = S [x].[01, 01, 01, 01];
+
+AES_Td0[x] = Si[x].[0e, 09, 0d, 0b];
+AES_Td1[x] = Si[x].[0b, 0e, 09, 0d];
+AES_Td2[x] = Si[x].[0d, 0b, 0e, 09];
+AES_Td3[x] = Si[x].[09, 0d, 0b, 0e];
+AES_Td4[x] = Si[x].[01, 01, 01, 01];
+*/
+
+extern const uint32_t AES_Te0[256], AES_Te1[256], AES_Te2[256],
+                      AES_Te3[256], AES_Te4[256];
+extern const uint32_t AES_Td0[256], AES_Td1[256], AES_Td2[256],
+                      AES_Td3[256], AES_Td4[256];
+
 #endif
diff --git a/util/aes.c b/util/aes.c
index 54c7b98..91e97fa 100644
--- a/util/aes.c
+++ b/util/aes.c
@@ -44,20 +44,20 @@ typedef uint8_t u8;
 # define PUTU32(ct, st) { (ct)[0] = (u8)((st) >> 24); (ct)[1] = (u8)((st) >> 16); (ct)[2] = (u8)((st) >>  8); (ct)[3] = (u8)(st); }
 
 /*
-Te0[x] = S [x].[02, 01, 01, 03];
-Te1[x] = S [x].[03, 02, 01, 01];
-Te2[x] = S [x].[01, 03, 02, 01];
-Te3[x] = S [x].[01, 01, 03, 02];
-Te4[x] = S [x].[01, 01, 01, 01];
+AES_Te0[x] = S [x].[02, 01, 01, 03];
+AES_Te1[x] = S [x].[03, 02, 01, 01];
+AES_Te2[x] = S [x].[01, 03, 02, 01];
+AES_Te3[x] = S [x].[01, 01, 03, 02];
+AES_Te4[x] = S [x].[01, 01, 01, 01];
 
-Td0[x] = Si[x].[0e, 09, 0d, 0b];
-Td1[x] = Si[x].[0b, 0e, 09, 0d];
-Td2[x] = Si[x].[0d, 0b, 0e, 09];
-Td3[x] = Si[x].[09, 0d, 0b, 0e];
-Td4[x] = Si[x].[01, 01, 01, 01];
+AES_Td0[x] = Si[x].[0e, 09, 0d, 0b];
+AES_Td1[x] = Si[x].[0b, 0e, 09, 0d];
+AES_Td2[x] = Si[x].[0d, 0b, 0e, 09];
+AES_Td3[x] = Si[x].[09, 0d, 0b, 0e];
+AES_Td4[x] = Si[x].[01, 01, 01, 01];
 */
 
-static const u32 Te0[256] = {
+const uint32_t AES_Te0[256] = {
     0xc66363a5U, 0xf87c7c84U, 0xee777799U, 0xf67b7b8dU,
     0xfff2f20dU, 0xd66b6bbdU, 0xde6f6fb1U, 0x91c5c554U,
     0x60303050U, 0x02010103U, 0xce6767a9U, 0x562b2b7dU,
@@ -123,7 +123,7 @@ static const u32 Te0[256] = {
     0x824141c3U, 0x299999b0U, 0x5a2d2d77U, 0x1e0f0f11U,
     0x7bb0b0cbU, 0xa85454fcU, 0x6dbbbbd6U, 0x2c16163aU,
 };
-static const u32 Te1[256] = {
+const uint32_t AES_Te1[256] = {
     0xa5c66363U, 0x84f87c7cU, 0x99ee7777U, 0x8df67b7bU,
     0x0dfff2f2U, 0xbdd66b6bU, 0xb1de6f6fU, 0x5491c5c5U,
     0x50603030U, 0x03020101U, 0xa9ce6767U, 0x7d562b2bU,
@@ -189,7 +189,7 @@ static const u32 Te1[256] = {
     0xc3824141U, 0xb0299999U, 0x775a2d2dU, 0x111e0f0fU,
     0xcb7bb0b0U, 0xfca85454U, 0xd66dbbbbU, 0x3a2c1616U,
 };
-static const u32 Te2[256] = {
+const uint32_t AES_Te2[256] = {
     0x63a5c663U, 0x7c84f87cU, 0x7799ee77U, 0x7b8df67bU,
     0xf20dfff2U, 0x6bbdd66bU, 0x6fb1de6fU, 0xc55491c5U,
     0x30506030U, 0x01030201U, 0x67a9ce67U, 0x2b7d562bU,
@@ -255,7 +255,7 @@ static const u32 Te2[256] = {
     0x41c38241U, 0x99b02999U, 0x2d775a2dU, 0x0f111e0fU,
     0xb0cb7bb0U, 0x54fca854U, 0xbbd66dbbU, 0x163a2c16U,
 };
-static const u32 Te3[256] = {
+const uint32_t AES_Te3[256] = {
 
     0x6363a5c6U, 0x7c7c84f8U, 0x777799eeU, 0x7b7b8df6U,
     0xf2f20dffU, 0x6b6bbdd6U, 0x6f6fb1deU, 0xc5c55491U,
@@ -322,7 +322,7 @@ static const u32 Te3[256] = {
     0x4141c382U, 0x9999b029U, 0x2d2d775aU, 0x0f0f111eU,
     0xb0b0cb7bU, 0x5454fca8U, 0xbbbbd66dU, 0x16163a2cU,
 };
-static const u32 Te4[256] = {
+const uint32_t AES_Te4[256] = {
     0x63636363U, 0x7c7c7c7cU, 0x77777777U, 0x7b7b7b7bU,
     0xf2f2f2f2U, 0x6b6b6b6bU, 0x6f6f6f6fU, 0xc5c5c5c5U,
     0x30303030U, 0x01010101U, 0x67676767U, 0x2b2b2b2bU,
@@ -388,7 +388,7 @@ static const u32 Te4[256] = {
     0x41414141U, 0x99999999U, 0x2d2d2d2dU, 0x0f0f0f0fU,
     0xb0b0b0b0U, 0x54545454U, 0xbbbbbbbbU, 0x16161616U,
 };
-static const u32 Td0[256] = {
+const uint32_t AES_Td0[256] = {
     0x51f4a750U, 0x7e416553U, 0x1a17a4c3U, 0x3a275e96U,
     0x3bab6bcbU, 0x1f9d45f1U, 0xacfa58abU, 0x4be30393U,
     0x2030fa55U, 0xad766df6U, 0x88cc7691U, 0xf5024c25U,
@@ -454,7 +454,7 @@ static const u32 Td0[256] = {
     0x39a80171U, 0x080cb3deU, 0xd8b4e49cU, 0x6456c190U,
     0x7bcb8461U, 0xd532b670U, 0x486c5c74U, 0xd0b85742U,
 };
-static const u32 Td1[256] = {
+const uint32_t AES_Td1[256] = {
     0x5051f4a7U, 0x537e4165U, 0xc31a17a4U, 0x963a275eU,
     0xcb3bab6bU, 0xf11f9d45U, 0xabacfa58U, 0x934be303U,
     0x552030faU, 0xf6ad766dU, 0x9188cc76U, 0x25f5024cU,
@@ -520,7 +520,7 @@ static const u32 Td1[256] = {
     0x7139a801U, 0xde080cb3U, 0x9cd8b4e4U, 0x906456c1U,
     0x617bcb84U, 0x70d532b6U, 0x74486c5cU, 0x42d0b857U,
 };
-static const u32 Td2[256] = {
+const uint32_t AES_Td2[256] = {
     0xa75051f4U, 0x65537e41U, 0xa4c31a17U, 0x5e963a27U,
     0x6bcb3babU, 0x45f11f9dU, 0x58abacfaU, 0x03934be3U,
     0xfa552030U, 0x6df6ad76U, 0x769188ccU, 0x4c25f502U,
@@ -587,7 +587,7 @@ static const u32 Td2[256] = {
     0x017139a8U, 0xb3de080cU, 0xe49cd8b4U, 0xc1906456U,
     0x84617bcbU, 0xb670d532U, 0x5c74486cU, 0x5742d0b8U,
 };
-static const u32 Td3[256] = {
+const uint32_t AES_Td3[256] = {
     0xf4a75051U, 0x4165537eU, 0x17a4c31aU, 0x275e963aU,
     0xab6bcb3bU, 0x9d45f11fU, 0xfa58abacU, 0xe303934bU,
     0x30fa5520U, 0x766df6adU, 0xcc769188U, 0x024c25f5U,
@@ -653,7 +653,7 @@ static const u32 Td3[256] = {
     0xa8017139U, 0x0cb3de08U, 0xb4e49cd8U, 0x56c19064U,
     0xcb84617bU, 0x32b670d5U, 0x6c5c7448U, 0xb85742d0U,
 };
-static const u32 Td4[256] = {
+const uint32_t AES_Td4[256] = {
     0x52525252U, 0x09090909U, 0x6a6a6a6aU, 0xd5d5d5d5U,
     0x30303030U, 0x36363636U, 0xa5a5a5a5U, 0x38383838U,
     0xbfbfbfbfU, 0x40404040U, 0xa3a3a3a3U, 0x9e9e9e9eU,
@@ -757,10 +757,10 @@ int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
 		while (1) {
 			temp  = rk[3];
 			rk[4] = rk[0] ^
-				(Te4[(temp >> 16) & 0xff] & 0xff000000) ^
-				(Te4[(temp >>  8) & 0xff] & 0x00ff0000) ^
-				(Te4[(temp      ) & 0xff] & 0x0000ff00) ^
-				(Te4[(temp >> 24)       ] & 0x000000ff) ^
+                                (AES_Te4[(temp >> 16) & 0xff] & 0xff000000) ^
+                                (AES_Te4[(temp >>  8) & 0xff] & 0x00ff0000) ^
+                                (AES_Te4[(temp      ) & 0xff] & 0x0000ff00) ^
+                                (AES_Te4[(temp >> 24)       ] & 0x000000ff) ^
 				rcon[i];
 			rk[5] = rk[1] ^ rk[4];
 			rk[6] = rk[2] ^ rk[5];
@@ -777,10 +777,10 @@ int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
 		while (1) {
 			temp = rk[ 5];
 			rk[ 6] = rk[ 0] ^
-				(Te4[(temp >> 16) & 0xff] & 0xff000000) ^
-				(Te4[(temp >>  8) & 0xff] & 0x00ff0000) ^
-				(Te4[(temp      ) & 0xff] & 0x0000ff00) ^
-				(Te4[(temp >> 24)       ] & 0x000000ff) ^
+                                (AES_Te4[(temp >> 16) & 0xff] & 0xff000000) ^
+                                (AES_Te4[(temp >>  8) & 0xff] & 0x00ff0000) ^
+                                (AES_Te4[(temp      ) & 0xff] & 0x0000ff00) ^
+                                (AES_Te4[(temp >> 24)       ] & 0x000000ff) ^
 				rcon[i];
 			rk[ 7] = rk[ 1] ^ rk[ 6];
 			rk[ 8] = rk[ 2] ^ rk[ 7];
@@ -799,10 +799,10 @@ int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
 		while (1) {
 			temp = rk[ 7];
 			rk[ 8] = rk[ 0] ^
-				(Te4[(temp >> 16) & 0xff] & 0xff000000) ^
-				(Te4[(temp >>  8) & 0xff] & 0x00ff0000) ^
-				(Te4[(temp      ) & 0xff] & 0x0000ff00) ^
-				(Te4[(temp >> 24)       ] & 0x000000ff) ^
+                                (AES_Te4[(temp >> 16) & 0xff] & 0xff000000) ^
+                                (AES_Te4[(temp >>  8) & 0xff] & 0x00ff0000) ^
+                                (AES_Te4[(temp      ) & 0xff] & 0x0000ff00) ^
+                                (AES_Te4[(temp >> 24)       ] & 0x000000ff) ^
 				rcon[i];
 			rk[ 9] = rk[ 1] ^ rk[ 8];
 			rk[10] = rk[ 2] ^ rk[ 9];
@@ -812,10 +812,10 @@ int AES_set_encrypt_key(const unsigned char *userKey, const int bits,
 			}
 			temp = rk[11];
 			rk[12] = rk[ 4] ^
-				(Te4[(temp >> 24)       ] & 0xff000000) ^
-				(Te4[(temp >> 16) & 0xff] & 0x00ff0000) ^
-				(Te4[(temp >>  8) & 0xff] & 0x0000ff00) ^
-				(Te4[(temp      ) & 0xff] & 0x000000ff);
+                                (AES_Te4[(temp >> 24)       ] & 0xff000000) ^
+                                (AES_Te4[(temp >> 16) & 0xff] & 0x00ff0000) ^
+                                (AES_Te4[(temp >>  8) & 0xff] & 0x0000ff00) ^
+                                (AES_Te4[(temp      ) & 0xff] & 0x000000ff);
 			rk[13] = rk[ 5] ^ rk[12];
 			rk[14] = rk[ 6] ^ rk[13];
 			rk[15] = rk[ 7] ^ rk[14];
@@ -854,25 +854,25 @@ int AES_set_decrypt_key(const unsigned char *userKey, const int bits,
 	for (i = 1; i < (key->rounds); i++) {
 		rk += 4;
 		rk[0] =
-			Td0[Te4[(rk[0] >> 24)       ] & 0xff] ^
-			Td1[Te4[(rk[0] >> 16) & 0xff] & 0xff] ^
-			Td2[Te4[(rk[0] >>  8) & 0xff] & 0xff] ^
-			Td3[Te4[(rk[0]      ) & 0xff] & 0xff];
+                        AES_Td0[AES_Te4[(rk[0] >> 24)       ] & 0xff] ^
+                        AES_Td1[AES_Te4[(rk[0] >> 16) & 0xff] & 0xff] ^
+                        AES_Td2[AES_Te4[(rk[0] >>  8) & 0xff] & 0xff] ^
+                        AES_Td3[AES_Te4[(rk[0]      ) & 0xff] & 0xff];
 		rk[1] =
-			Td0[Te4[(rk[1] >> 24)       ] & 0xff] ^
-			Td1[Te4[(rk[1] >> 16) & 0xff] & 0xff] ^
-			Td2[Te4[(rk[1] >>  8) & 0xff] & 0xff] ^
-			Td3[Te4[(rk[1]      ) & 0xff] & 0xff];
+                        AES_Td0[AES_Te4[(rk[1] >> 24)       ] & 0xff] ^
+                        AES_Td1[AES_Te4[(rk[1] >> 16) & 0xff] & 0xff] ^
+                        AES_Td2[AES_Te4[(rk[1] >>  8) & 0xff] & 0xff] ^
+                        AES_Td3[AES_Te4[(rk[1]      ) & 0xff] & 0xff];
 		rk[2] =
-			Td0[Te4[(rk[2] >> 24)       ] & 0xff] ^
-			Td1[Te4[(rk[2] >> 16) & 0xff] & 0xff] ^
-			Td2[Te4[(rk[2] >>  8) & 0xff] & 0xff] ^
-			Td3[Te4[(rk[2]      ) & 0xff] & 0xff];
+                        AES_Td0[AES_Te4[(rk[2] >> 24)       ] & 0xff] ^
+                        AES_Td1[AES_Te4[(rk[2] >> 16) & 0xff] & 0xff] ^
+                        AES_Td2[AES_Te4[(rk[2] >>  8) & 0xff] & 0xff] ^
+                        AES_Td3[AES_Te4[(rk[2]      ) & 0xff] & 0xff];
 		rk[3] =
-			Td0[Te4[(rk[3] >> 24)       ] & 0xff] ^
-			Td1[Te4[(rk[3] >> 16) & 0xff] & 0xff] ^
-			Td2[Te4[(rk[3] >>  8) & 0xff] & 0xff] ^
-			Td3[Te4[(rk[3]      ) & 0xff] & 0xff];
+                        AES_Td0[AES_Te4[(rk[3] >> 24)       ] & 0xff] ^
+                        AES_Td1[AES_Te4[(rk[3] >> 16) & 0xff] & 0xff] ^
+                        AES_Td2[AES_Te4[(rk[3] >>  8) & 0xff] & 0xff] ^
+                        AES_Td3[AES_Te4[(rk[3]      ) & 0xff] & 0xff];
 	}
 	return 0;
 }
@@ -904,72 +904,72 @@ void AES_encrypt(const unsigned char *in, unsigned char *out,
 	s3 = GETU32(in + 12) ^ rk[3];
 #ifdef FULL_UNROLL
 	/* round 1: */
-   	t0 = Te0[s0 >> 24] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >>  8) & 0xff] ^ Te3[s3 & 0xff] ^ rk[ 4];
-   	t1 = Te0[s1 >> 24] ^ Te1[(s2 >> 16) & 0xff] ^ Te2[(s3 >>  8) & 0xff] ^ Te3[s0 & 0xff] ^ rk[ 5];
-   	t2 = Te0[s2 >> 24] ^ Te1[(s3 >> 16) & 0xff] ^ Te2[(s0 >>  8) & 0xff] ^ Te3[s1 & 0xff] ^ rk[ 6];
-   	t3 = Te0[s3 >> 24] ^ Te1[(s0 >> 16) & 0xff] ^ Te2[(s1 >>  8) & 0xff] ^ Te3[s2 & 0xff] ^ rk[ 7];
+        t0 = AES_Te0[s0 >> 24] ^ AES_Te1[(s1 >> 16) & 0xff] ^ AES_Te2[(s2 >>  8) & 0xff] ^ AES_Te3[s3 & 0xff] ^ rk[ 4];
+        t1 = AES_Te0[s1 >> 24] ^ AES_Te1[(s2 >> 16) & 0xff] ^ AES_Te2[(s3 >>  8) & 0xff] ^ AES_Te3[s0 & 0xff] ^ rk[ 5];
+        t2 = AES_Te0[s2 >> 24] ^ AES_Te1[(s3 >> 16) & 0xff] ^ AES_Te2[(s0 >>  8) & 0xff] ^ AES_Te3[s1 & 0xff] ^ rk[ 6];
+        t3 = AES_Te0[s3 >> 24] ^ AES_Te1[(s0 >> 16) & 0xff] ^ AES_Te2[(s1 >>  8) & 0xff] ^ AES_Te3[s2 & 0xff] ^ rk[ 7];
    	/* round 2: */
-   	s0 = Te0[t0 >> 24] ^ Te1[(t1 >> 16) & 0xff] ^ Te2[(t2 >>  8) & 0xff] ^ Te3[t3 & 0xff] ^ rk[ 8];
-   	s1 = Te0[t1 >> 24] ^ Te1[(t2 >> 16) & 0xff] ^ Te2[(t3 >>  8) & 0xff] ^ Te3[t0 & 0xff] ^ rk[ 9];
-   	s2 = Te0[t2 >> 24] ^ Te1[(t3 >> 16) & 0xff] ^ Te2[(t0 >>  8) & 0xff] ^ Te3[t1 & 0xff] ^ rk[10];
-   	s3 = Te0[t3 >> 24] ^ Te1[(t0 >> 16) & 0xff] ^ Te2[(t1 >>  8) & 0xff] ^ Te3[t2 & 0xff] ^ rk[11];
+        s0 = AES_Te0[t0 >> 24] ^ AES_Te1[(t1 >> 16) & 0xff] ^ AES_Te2[(t2 >>  8) & 0xff] ^ AES_Te3[t3 & 0xff] ^ rk[ 8];
+        s1 = AES_Te0[t1 >> 24] ^ AES_Te1[(t2 >> 16) & 0xff] ^ AES_Te2[(t3 >>  8) & 0xff] ^ AES_Te3[t0 & 0xff] ^ rk[ 9];
+        s2 = AES_Te0[t2 >> 24] ^ AES_Te1[(t3 >> 16) & 0xff] ^ AES_Te2[(t0 >>  8) & 0xff] ^ AES_Te3[t1 & 0xff] ^ rk[10];
+        s3 = AES_Te0[t3 >> 24] ^ AES_Te1[(t0 >> 16) & 0xff] ^ AES_Te2[(t1 >>  8) & 0xff] ^ AES_Te3[t2 & 0xff] ^ rk[11];
 	/* round 3: */
-   	t0 = Te0[s0 >> 24] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >>  8) & 0xff] ^ Te3[s3 & 0xff] ^ rk[12];
-   	t1 = Te0[s1 >> 24] ^ Te1[(s2 >> 16) & 0xff] ^ Te2[(s3 >>  8) & 0xff] ^ Te3[s0 & 0xff] ^ rk[13];
-   	t2 = Te0[s2 >> 24] ^ Te1[(s3 >> 16) & 0xff] ^ Te2[(s0 >>  8) & 0xff] ^ Te3[s1 & 0xff] ^ rk[14];
-   	t3 = Te0[s3 >> 24] ^ Te1[(s0 >> 16) & 0xff] ^ Te2[(s1 >>  8) & 0xff] ^ Te3[s2 & 0xff] ^ rk[15];
+        t0 = AES_Te0[s0 >> 24] ^ AES_Te1[(s1 >> 16) & 0xff] ^ AES_Te2[(s2 >>  8) & 0xff] ^ AES_Te3[s3 & 0xff] ^ rk[12];
+        t1 = AES_Te0[s1 >> 24] ^ AES_Te1[(s2 >> 16) & 0xff] ^ AES_Te2[(s3 >>  8) & 0xff] ^ AES_Te3[s0 & 0xff] ^ rk[13];
+        t2 = AES_Te0[s2 >> 24] ^ AES_Te1[(s3 >> 16) & 0xff] ^ AES_Te2[(s0 >>  8) & 0xff] ^ AES_Te3[s1 & 0xff] ^ rk[14];
+        t3 = AES_Te0[s3 >> 24] ^ AES_Te1[(s0 >> 16) & 0xff] ^ AES_Te2[(s1 >>  8) & 0xff] ^ AES_Te3[s2 & 0xff] ^ rk[15];
    	/* round 4: */
-   	s0 = Te0[t0 >> 24] ^ Te1[(t1 >> 16) & 0xff] ^ Te2[(t2 >>  8) & 0xff] ^ Te3[t3 & 0xff] ^ rk[16];
-   	s1 = Te0[t1 >> 24] ^ Te1[(t2 >> 16) & 0xff] ^ Te2[(t3 >>  8) & 0xff] ^ Te3[t0 & 0xff] ^ rk[17];
-   	s2 = Te0[t2 >> 24] ^ Te1[(t3 >> 16) & 0xff] ^ Te2[(t0 >>  8) & 0xff] ^ Te3[t1 & 0xff] ^ rk[18];
-   	s3 = Te0[t3 >> 24] ^ Te1[(t0 >> 16) & 0xff] ^ Te2[(t1 >>  8) & 0xff] ^ Te3[t2 & 0xff] ^ rk[19];
+        s0 = AES_Te0[t0 >> 24] ^ AES_Te1[(t1 >> 16) & 0xff] ^ AES_Te2[(t2 >>  8) & 0xff] ^ AES_Te3[t3 & 0xff] ^ rk[16];
+        s1 = AES_Te0[t1 >> 24] ^ AES_Te1[(t2 >> 16) & 0xff] ^ AES_Te2[(t3 >>  8) & 0xff] ^ AES_Te3[t0 & 0xff] ^ rk[17];
+        s2 = AES_Te0[t2 >> 24] ^ AES_Te1[(t3 >> 16) & 0xff] ^ AES_Te2[(t0 >>  8) & 0xff] ^ AES_Te3[t1 & 0xff] ^ rk[18];
+        s3 = AES_Te0[t3 >> 24] ^ AES_Te1[(t0 >> 16) & 0xff] ^ AES_Te2[(t1 >>  8) & 0xff] ^ AES_Te3[t2 & 0xff] ^ rk[19];
 	/* round 5: */
-   	t0 = Te0[s0 >> 24] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >>  8) & 0xff] ^ Te3[s3 & 0xff] ^ rk[20];
-   	t1 = Te0[s1 >> 24] ^ Te1[(s2 >> 16) & 0xff] ^ Te2[(s3 >>  8) & 0xff] ^ Te3[s0 & 0xff] ^ rk[21];
-   	t2 = Te0[s2 >> 24] ^ Te1[(s3 >> 16) & 0xff] ^ Te2[(s0 >>  8) & 0xff] ^ Te3[s1 & 0xff] ^ rk[22];
-   	t3 = Te0[s3 >> 24] ^ Te1[(s0 >> 16) & 0xff] ^ Te2[(s1 >>  8) & 0xff] ^ Te3[s2 & 0xff] ^ rk[23];
+        t0 = AES_Te0[s0 >> 24] ^ AES_Te1[(s1 >> 16) & 0xff] ^ AES_Te2[(s2 >>  8) & 0xff] ^ AES_Te3[s3 & 0xff] ^ rk[20];
+        t1 = AES_Te0[s1 >> 24] ^ AES_Te1[(s2 >> 16) & 0xff] ^ AES_Te2[(s3 >>  8) & 0xff] ^ AES_Te3[s0 & 0xff] ^ rk[21];
+        t2 = AES_Te0[s2 >> 24] ^ AES_Te1[(s3 >> 16) & 0xff] ^ AES_Te2[(s0 >>  8) & 0xff] ^ AES_Te3[s1 & 0xff] ^ rk[22];
+        t3 = AES_Te0[s3 >> 24] ^ AES_Te1[(s0 >> 16) & 0xff] ^ AES_Te2[(s1 >>  8) & 0xff] ^ AES_Te3[s2 & 0xff] ^ rk[23];
    	/* round 6: */
-   	s0 = Te0[t0 >> 24] ^ Te1[(t1 >> 16) & 0xff] ^ Te2[(t2 >>  8) & 0xff] ^ Te3[t3 & 0xff] ^ rk[24];
-   	s1 = Te0[t1 >> 24] ^ Te1[(t2 >> 16) & 0xff] ^ Te2[(t3 >>  8) & 0xff] ^ Te3[t0 & 0xff] ^ rk[25];
-   	s2 = Te0[t2 >> 24] ^ Te1[(t3 >> 16) & 0xff] ^ Te2[(t0 >>  8) & 0xff] ^ Te3[t1 & 0xff] ^ rk[26];
-   	s3 = Te0[t3 >> 24] ^ Te1[(t0 >> 16) & 0xff] ^ Te2[(t1 >>  8) & 0xff] ^ Te3[t2 & 0xff] ^ rk[27];
+        s0 = AES_Te0[t0 >> 24] ^ AES_Te1[(t1 >> 16) & 0xff] ^ AES_Te2[(t2 >>  8) & 0xff] ^ AES_Te3[t3 & 0xff] ^ rk[24];
+        s1 = AES_Te0[t1 >> 24] ^ AES_Te1[(t2 >> 16) & 0xff] ^ AES_Te2[(t3 >>  8) & 0xff] ^ AES_Te3[t0 & 0xff] ^ rk[25];
+        s2 = AES_Te0[t2 >> 24] ^ AES_Te1[(t3 >> 16) & 0xff] ^ AES_Te2[(t0 >>  8) & 0xff] ^ AES_Te3[t1 & 0xff] ^ rk[26];
+        s3 = AES_Te0[t3 >> 24] ^ AES_Te1[(t0 >> 16) & 0xff] ^ AES_Te2[(t1 >>  8) & 0xff] ^ AES_Te3[t2 & 0xff] ^ rk[27];
 	/* round 7: */
-   	t0 = Te0[s0 >> 24] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >>  8) & 0xff] ^ Te3[s3 & 0xff] ^ rk[28];
-   	t1 = Te0[s1 >> 24] ^ Te1[(s2 >> 16) & 0xff] ^ Te2[(s3 >>  8) & 0xff] ^ Te3[s0 & 0xff] ^ rk[29];
-   	t2 = Te0[s2 >> 24] ^ Te1[(s3 >> 16) & 0xff] ^ Te2[(s0 >>  8) & 0xff] ^ Te3[s1 & 0xff] ^ rk[30];
-   	t3 = Te0[s3 >> 24] ^ Te1[(s0 >> 16) & 0xff] ^ Te2[(s1 >>  8) & 0xff] ^ Te3[s2 & 0xff] ^ rk[31];
+        t0 = AES_Te0[s0 >> 24] ^ AES_Te1[(s1 >> 16) & 0xff] ^ AES_Te2[(s2 >>  8) & 0xff] ^ AES_Te3[s3 & 0xff] ^ rk[28];
+        t1 = AES_Te0[s1 >> 24] ^ AES_Te1[(s2 >> 16) & 0xff] ^ AES_Te2[(s3 >>  8) & 0xff] ^ AES_Te3[s0 & 0xff] ^ rk[29];
+        t2 = AES_Te0[s2 >> 24] ^ AES_Te1[(s3 >> 16) & 0xff] ^ AES_Te2[(s0 >>  8) & 0xff] ^ AES_Te3[s1 & 0xff] ^ rk[30];
+        t3 = AES_Te0[s3 >> 24] ^ AES_Te1[(s0 >> 16) & 0xff] ^ AES_Te2[(s1 >>  8) & 0xff] ^ AES_Te3[s2 & 0xff] ^ rk[31];
    	/* round 8: */
-   	s0 = Te0[t0 >> 24] ^ Te1[(t1 >> 16) & 0xff] ^ Te2[(t2 >>  8) & 0xff] ^ Te3[t3 & 0xff] ^ rk[32];
-   	s1 = Te0[t1 >> 24] ^ Te1[(t2 >> 16) & 0xff] ^ Te2[(t3 >>  8) & 0xff] ^ Te3[t0 & 0xff] ^ rk[33];
-   	s2 = Te0[t2 >> 24] ^ Te1[(t3 >> 16) & 0xff] ^ Te2[(t0 >>  8) & 0xff] ^ Te3[t1 & 0xff] ^ rk[34];
-   	s3 = Te0[t3 >> 24] ^ Te1[(t0 >> 16) & 0xff] ^ Te2[(t1 >>  8) & 0xff] ^ Te3[t2 & 0xff] ^ rk[35];
+        s0 = AES_Te0[t0 >> 24] ^ AES_Te1[(t1 >> 16) & 0xff] ^ AES_Te2[(t2 >>  8) & 0xff] ^ AES_Te3[t3 & 0xff] ^ rk[32];
+        s1 = AES_Te0[t1 >> 24] ^ AES_Te1[(t2 >> 16) & 0xff] ^ AES_Te2[(t3 >>  8) & 0xff] ^ AES_Te3[t0 & 0xff] ^ rk[33];
+        s2 = AES_Te0[t2 >> 24] ^ AES_Te1[(t3 >> 16) & 0xff] ^ AES_Te2[(t0 >>  8) & 0xff] ^ AES_Te3[t1 & 0xff] ^ rk[34];
+        s3 = AES_Te0[t3 >> 24] ^ AES_Te1[(t0 >> 16) & 0xff] ^ AES_Te2[(t1 >>  8) & 0xff] ^ AES_Te3[t2 & 0xff] ^ rk[35];
 	/* round 9: */
-   	t0 = Te0[s0 >> 24] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >>  8) & 0xff] ^ Te3[s3 & 0xff] ^ rk[36];
-   	t1 = Te0[s1 >> 24] ^ Te1[(s2 >> 16) & 0xff] ^ Te2[(s3 >>  8) & 0xff] ^ Te3[s0 & 0xff] ^ rk[37];
-   	t2 = Te0[s2 >> 24] ^ Te1[(s3 >> 16) & 0xff] ^ Te2[(s0 >>  8) & 0xff] ^ Te3[s1 & 0xff] ^ rk[38];
-   	t3 = Te0[s3 >> 24] ^ Te1[(s0 >> 16) & 0xff] ^ Te2[(s1 >>  8) & 0xff] ^ Te3[s2 & 0xff] ^ rk[39];
+        t0 = AES_Te0[s0 >> 24] ^ AES_Te1[(s1 >> 16) & 0xff] ^ AES_Te2[(s2 >>  8) & 0xff] ^ AES_Te3[s3 & 0xff] ^ rk[36];
+        t1 = AES_Te0[s1 >> 24] ^ AES_Te1[(s2 >> 16) & 0xff] ^ AES_Te2[(s3 >>  8) & 0xff] ^ AES_Te3[s0 & 0xff] ^ rk[37];
+        t2 = AES_Te0[s2 >> 24] ^ AES_Te1[(s3 >> 16) & 0xff] ^ AES_Te2[(s0 >>  8) & 0xff] ^ AES_Te3[s1 & 0xff] ^ rk[38];
+        t3 = AES_Te0[s3 >> 24] ^ AES_Te1[(s0 >> 16) & 0xff] ^ AES_Te2[(s1 >>  8) & 0xff] ^ AES_Te3[s2 & 0xff] ^ rk[39];
     if (key->rounds > 10) {
         /* round 10: */
-        s0 = Te0[t0 >> 24] ^ Te1[(t1 >> 16) & 0xff] ^ Te2[(t2 >>  8) & 0xff] ^ Te3[t3 & 0xff] ^ rk[40];
-        s1 = Te0[t1 >> 24] ^ Te1[(t2 >> 16) & 0xff] ^ Te2[(t3 >>  8) & 0xff] ^ Te3[t0 & 0xff] ^ rk[41];
-        s2 = Te0[t2 >> 24] ^ Te1[(t3 >> 16) & 0xff] ^ Te2[(t0 >>  8) & 0xff] ^ Te3[t1 & 0xff] ^ rk[42];
-        s3 = Te0[t3 >> 24] ^ Te1[(t0 >> 16) & 0xff] ^ Te2[(t1 >>  8) & 0xff] ^ Te3[t2 & 0xff] ^ rk[43];
+        s0 = AES_Te0[t0 >> 24] ^ AES_Te1[(t1 >> 16) & 0xff] ^ AES_Te2[(t2 >>  8) & 0xff] ^ AES_Te3[t3 & 0xff] ^ rk[40];
+        s1 = AES_Te0[t1 >> 24] ^ AES_Te1[(t2 >> 16) & 0xff] ^ AES_Te2[(t3 >>  8) & 0xff] ^ AES_Te3[t0 & 0xff] ^ rk[41];
+        s2 = AES_Te0[t2 >> 24] ^ AES_Te1[(t3 >> 16) & 0xff] ^ AES_Te2[(t0 >>  8) & 0xff] ^ AES_Te3[t1 & 0xff] ^ rk[42];
+        s3 = AES_Te0[t3 >> 24] ^ AES_Te1[(t0 >> 16) & 0xff] ^ AES_Te2[(t1 >>  8) & 0xff] ^ AES_Te3[t2 & 0xff] ^ rk[43];
         /* round 11: */
-        t0 = Te0[s0 >> 24] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >>  8) & 0xff] ^ Te3[s3 & 0xff] ^ rk[44];
-        t1 = Te0[s1 >> 24] ^ Te1[(s2 >> 16) & 0xff] ^ Te2[(s3 >>  8) & 0xff] ^ Te3[s0 & 0xff] ^ rk[45];
-        t2 = Te0[s2 >> 24] ^ Te1[(s3 >> 16) & 0xff] ^ Te2[(s0 >>  8) & 0xff] ^ Te3[s1 & 0xff] ^ rk[46];
-        t3 = Te0[s3 >> 24] ^ Te1[(s0 >> 16) & 0xff] ^ Te2[(s1 >>  8) & 0xff] ^ Te3[s2 & 0xff] ^ rk[47];
+        t0 = AES_Te0[s0 >> 24] ^ AES_Te1[(s1 >> 16) & 0xff] ^ AES_Te2[(s2 >>  8) & 0xff] ^ AES_Te3[s3 & 0xff] ^ rk[44];
+        t1 = AES_Te0[s1 >> 24] ^ AES_Te1[(s2 >> 16) & 0xff] ^ AES_Te2[(s3 >>  8) & 0xff] ^ AES_Te3[s0 & 0xff] ^ rk[45];
+        t2 = AES_Te0[s2 >> 24] ^ AES_Te1[(s3 >> 16) & 0xff] ^ AES_Te2[(s0 >>  8) & 0xff] ^ AES_Te3[s1 & 0xff] ^ rk[46];
+        t3 = AES_Te0[s3 >> 24] ^ AES_Te1[(s0 >> 16) & 0xff] ^ AES_Te2[(s1 >>  8) & 0xff] ^ AES_Te3[s2 & 0xff] ^ rk[47];
         if (key->rounds > 12) {
             /* round 12: */
-            s0 = Te0[t0 >> 24] ^ Te1[(t1 >> 16) & 0xff] ^ Te2[(t2 >>  8) & 0xff] ^ Te3[t3 & 0xff] ^ rk[48];
-            s1 = Te0[t1 >> 24] ^ Te1[(t2 >> 16) & 0xff] ^ Te2[(t3 >>  8) & 0xff] ^ Te3[t0 & 0xff] ^ rk[49];
-            s2 = Te0[t2 >> 24] ^ Te1[(t3 >> 16) & 0xff] ^ Te2[(t0 >>  8) & 0xff] ^ Te3[t1 & 0xff] ^ rk[50];
-            s3 = Te0[t3 >> 24] ^ Te1[(t0 >> 16) & 0xff] ^ Te2[(t1 >>  8) & 0xff] ^ Te3[t2 & 0xff] ^ rk[51];
+            s0 = AES_Te0[t0 >> 24] ^ AES_Te1[(t1 >> 16) & 0xff] ^ AES_Te2[(t2 >>  8) & 0xff] ^ AES_Te3[t3 & 0xff] ^ rk[48];
+            s1 = AES_Te0[t1 >> 24] ^ AES_Te1[(t2 >> 16) & 0xff] ^ AES_Te2[(t3 >>  8) & 0xff] ^ AES_Te3[t0 & 0xff] ^ rk[49];
+            s2 = AES_Te0[t2 >> 24] ^ AES_Te1[(t3 >> 16) & 0xff] ^ AES_Te2[(t0 >>  8) & 0xff] ^ AES_Te3[t1 & 0xff] ^ rk[50];
+            s3 = AES_Te0[t3 >> 24] ^ AES_Te1[(t0 >> 16) & 0xff] ^ AES_Te2[(t1 >>  8) & 0xff] ^ AES_Te3[t2 & 0xff] ^ rk[51];
             /* round 13: */
-            t0 = Te0[s0 >> 24] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >>  8) & 0xff] ^ Te3[s3 & 0xff] ^ rk[52];
-            t1 = Te0[s1 >> 24] ^ Te1[(s2 >> 16) & 0xff] ^ Te2[(s3 >>  8) & 0xff] ^ Te3[s0 & 0xff] ^ rk[53];
-            t2 = Te0[s2 >> 24] ^ Te1[(s3 >> 16) & 0xff] ^ Te2[(s0 >>  8) & 0xff] ^ Te3[s1 & 0xff] ^ rk[54];
-            t3 = Te0[s3 >> 24] ^ Te1[(s0 >> 16) & 0xff] ^ Te2[(s1 >>  8) & 0xff] ^ Te3[s2 & 0xff] ^ rk[55];
+            t0 = AES_Te0[s0 >> 24] ^ AES_Te1[(s1 >> 16) & 0xff] ^ AES_Te2[(s2 >>  8) & 0xff] ^ AES_Te3[s3 & 0xff] ^ rk[52];
+            t1 = AES_Te0[s1 >> 24] ^ AES_Te1[(s2 >> 16) & 0xff] ^ AES_Te2[(s3 >>  8) & 0xff] ^ AES_Te3[s0 & 0xff] ^ rk[53];
+            t2 = AES_Te0[s2 >> 24] ^ AES_Te1[(s3 >> 16) & 0xff] ^ AES_Te2[(s0 >>  8) & 0xff] ^ AES_Te3[s1 & 0xff] ^ rk[54];
+            t3 = AES_Te0[s3 >> 24] ^ AES_Te1[(s0 >> 16) & 0xff] ^ AES_Te2[(s1 >>  8) & 0xff] ^ AES_Te3[s2 & 0xff] ^ rk[55];
         }
     }
     rk += key->rounds << 2;
@@ -980,28 +980,28 @@ void AES_encrypt(const unsigned char *in, unsigned char *out,
     r = key->rounds >> 1;
     for (;;) {
         t0 =
-            Te0[(s0 >> 24)       ] ^
-            Te1[(s1 >> 16) & 0xff] ^
-            Te2[(s2 >>  8) & 0xff] ^
-            Te3[(s3      ) & 0xff] ^
+            AES_Te0[(s0 >> 24)       ] ^
+            AES_Te1[(s1 >> 16) & 0xff] ^
+            AES_Te2[(s2 >>  8) & 0xff] ^
+            AES_Te3[(s3      ) & 0xff] ^
             rk[4];
         t1 =
-            Te0[(s1 >> 24)       ] ^
-            Te1[(s2 >> 16) & 0xff] ^
-            Te2[(s3 >>  8) & 0xff] ^
-            Te3[(s0      ) & 0xff] ^
+            AES_Te0[(s1 >> 24)       ] ^
+            AES_Te1[(s2 >> 16) & 0xff] ^
+            AES_Te2[(s3 >>  8) & 0xff] ^
+            AES_Te3[(s0      ) & 0xff] ^
             rk[5];
         t2 =
-            Te0[(s2 >> 24)       ] ^
-            Te1[(s3 >> 16) & 0xff] ^
-            Te2[(s0 >>  8) & 0xff] ^
-            Te3[(s1      ) & 0xff] ^
+            AES_Te0[(s2 >> 24)       ] ^
+            AES_Te1[(s3 >> 16) & 0xff] ^
+            AES_Te2[(s0 >>  8) & 0xff] ^
+            AES_Te3[(s1      ) & 0xff] ^
             rk[6];
         t3 =
-            Te0[(s3 >> 24)       ] ^
-            Te1[(s0 >> 16) & 0xff] ^
-            Te2[(s1 >>  8) & 0xff] ^
-            Te3[(s2      ) & 0xff] ^
+            AES_Te0[(s3 >> 24)       ] ^
+            AES_Te1[(s0 >> 16) & 0xff] ^
+            AES_Te2[(s1 >>  8) & 0xff] ^
+            AES_Te3[(s2      ) & 0xff] ^
             rk[7];
 
         rk += 8;
@@ -1010,28 +1010,28 @@ void AES_encrypt(const unsigned char *in, unsigned char *out,
         }
 
         s0 =
-            Te0[(t0 >> 24)       ] ^
-            Te1[(t1 >> 16) & 0xff] ^
-            Te2[(t2 >>  8) & 0xff] ^
-            Te3[(t3      ) & 0xff] ^
+            AES_Te0[(t0 >> 24)       ] ^
+            AES_Te1[(t1 >> 16) & 0xff] ^
+            AES_Te2[(t2 >>  8) & 0xff] ^
+            AES_Te3[(t3      ) & 0xff] ^
             rk[0];
         s1 =
-            Te0[(t1 >> 24)       ] ^
-            Te1[(t2 >> 16) & 0xff] ^
-            Te2[(t3 >>  8) & 0xff] ^
-            Te3[(t0      ) & 0xff] ^
+            AES_Te0[(t1 >> 24)       ] ^
+            AES_Te1[(t2 >> 16) & 0xff] ^
+            AES_Te2[(t3 >>  8) & 0xff] ^
+            AES_Te3[(t0      ) & 0xff] ^
             rk[1];
         s2 =
-            Te0[(t2 >> 24)       ] ^
-            Te1[(t3 >> 16) & 0xff] ^
-            Te2[(t0 >>  8) & 0xff] ^
-            Te3[(t1      ) & 0xff] ^
+            AES_Te0[(t2 >> 24)       ] ^
+            AES_Te1[(t3 >> 16) & 0xff] ^
+            AES_Te2[(t0 >>  8) & 0xff] ^
+            AES_Te3[(t1      ) & 0xff] ^
             rk[2];
         s3 =
-            Te0[(t3 >> 24)       ] ^
-            Te1[(t0 >> 16) & 0xff] ^
-            Te2[(t1 >>  8) & 0xff] ^
-            Te3[(t2      ) & 0xff] ^
+            AES_Te0[(t3 >> 24)       ] ^
+            AES_Te1[(t0 >> 16) & 0xff] ^
+            AES_Te2[(t1 >>  8) & 0xff] ^
+            AES_Te3[(t2      ) & 0xff] ^
             rk[3];
     }
 #endif /* ?FULL_UNROLL */
@@ -1040,31 +1040,31 @@ void AES_encrypt(const unsigned char *in, unsigned char *out,
 	 * map cipher state to byte array block:
 	 */
 	s0 =
-		(Te4[(t0 >> 24)       ] & 0xff000000) ^
-		(Te4[(t1 >> 16) & 0xff] & 0x00ff0000) ^
-		(Te4[(t2 >>  8) & 0xff] & 0x0000ff00) ^
-		(Te4[(t3      ) & 0xff] & 0x000000ff) ^
+                (AES_Te4[(t0 >> 24)       ] & 0xff000000) ^
+                (AES_Te4[(t1 >> 16) & 0xff] & 0x00ff0000) ^
+                (AES_Te4[(t2 >>  8) & 0xff] & 0x0000ff00) ^
+                (AES_Te4[(t3      ) & 0xff] & 0x000000ff) ^
 		rk[0];
 	PUTU32(out     , s0);
 	s1 =
-		(Te4[(t1 >> 24)       ] & 0xff000000) ^
-		(Te4[(t2 >> 16) & 0xff] & 0x00ff0000) ^
-		(Te4[(t3 >>  8) & 0xff] & 0x0000ff00) ^
-		(Te4[(t0      ) & 0xff] & 0x000000ff) ^
+                (AES_Te4[(t1 >> 24)       ] & 0xff000000) ^
+                (AES_Te4[(t2 >> 16) & 0xff] & 0x00ff0000) ^
+                (AES_Te4[(t3 >>  8) & 0xff] & 0x0000ff00) ^
+                (AES_Te4[(t0      ) & 0xff] & 0x000000ff) ^
 		rk[1];
 	PUTU32(out +  4, s1);
 	s2 =
-		(Te4[(t2 >> 24)       ] & 0xff000000) ^
-		(Te4[(t3 >> 16) & 0xff] & 0x00ff0000) ^
-		(Te4[(t0 >>  8) & 0xff] & 0x0000ff00) ^
-		(Te4[(t1      ) & 0xff] & 0x000000ff) ^
+                (AES_Te4[(t2 >> 24)       ] & 0xff000000) ^
+                (AES_Te4[(t3 >> 16) & 0xff] & 0x00ff0000) ^
+                (AES_Te4[(t0 >>  8) & 0xff] & 0x0000ff00) ^
+                (AES_Te4[(t1      ) & 0xff] & 0x000000ff) ^
 		rk[2];
 	PUTU32(out +  8, s2);
 	s3 =
-		(Te4[(t3 >> 24)       ] & 0xff000000) ^
-		(Te4[(t0 >> 16) & 0xff] & 0x00ff0000) ^
-		(Te4[(t1 >>  8) & 0xff] & 0x0000ff00) ^
-		(Te4[(t2      ) & 0xff] & 0x000000ff) ^
+                (AES_Te4[(t3 >> 24)       ] & 0xff000000) ^
+                (AES_Te4[(t0 >> 16) & 0xff] & 0x00ff0000) ^
+                (AES_Te4[(t1 >>  8) & 0xff] & 0x0000ff00) ^
+                (AES_Te4[(t2      ) & 0xff] & 0x000000ff) ^
 		rk[3];
 	PUTU32(out + 12, s3);
 }
@@ -1095,72 +1095,72 @@ void AES_decrypt(const unsigned char *in, unsigned char *out,
     s3 = GETU32(in + 12) ^ rk[3];
 #ifdef FULL_UNROLL
     /* round 1: */
-    t0 = Td0[s0 >> 24] ^ Td1[(s3 >> 16) & 0xff] ^ Td2[(s2 >>  8) & 0xff] ^ Td3[s1 & 0xff] ^ rk[ 4];
-    t1 = Td0[s1 >> 24] ^ Td1[(s0 >> 16) & 0xff] ^ Td2[(s3 >>  8) & 0xff] ^ Td3[s2 & 0xff] ^ rk[ 5];
-    t2 = Td0[s2 >> 24] ^ Td1[(s1 >> 16) & 0xff] ^ Td2[(s0 >>  8) & 0xff] ^ Td3[s3 & 0xff] ^ rk[ 6];
-    t3 = Td0[s3 >> 24] ^ Td1[(s2 >> 16) & 0xff] ^ Td2[(s1 >>  8) & 0xff] ^ Td3[s0 & 0xff] ^ rk[ 7];
+    t0 = AES_Td0[s0 >> 24] ^ AES_Td1[(s3 >> 16) & 0xff] ^ AES_Td2[(s2 >>  8) & 0xff] ^ AES_Td3[s1 & 0xff] ^ rk[ 4];
+    t1 = AES_Td0[s1 >> 24] ^ AES_Td1[(s0 >> 16) & 0xff] ^ AES_Td2[(s3 >>  8) & 0xff] ^ AES_Td3[s2 & 0xff] ^ rk[ 5];
+    t2 = AES_Td0[s2 >> 24] ^ AES_Td1[(s1 >> 16) & 0xff] ^ AES_Td2[(s0 >>  8) & 0xff] ^ AES_Td3[s3 & 0xff] ^ rk[ 6];
+    t3 = AES_Td0[s3 >> 24] ^ AES_Td1[(s2 >> 16) & 0xff] ^ AES_Td2[(s1 >>  8) & 0xff] ^ AES_Td3[s0 & 0xff] ^ rk[ 7];
     /* round 2: */
-    s0 = Td0[t0 >> 24] ^ Td1[(t3 >> 16) & 0xff] ^ Td2[(t2 >>  8) & 0xff] ^ Td3[t1 & 0xff] ^ rk[ 8];
-    s1 = Td0[t1 >> 24] ^ Td1[(t0 >> 16) & 0xff] ^ Td2[(t3 >>  8) & 0xff] ^ Td3[t2 & 0xff] ^ rk[ 9];
-    s2 = Td0[t2 >> 24] ^ Td1[(t1 >> 16) & 0xff] ^ Td2[(t0 >>  8) & 0xff] ^ Td3[t3 & 0xff] ^ rk[10];
-    s3 = Td0[t3 >> 24] ^ Td1[(t2 >> 16) & 0xff] ^ Td2[(t1 >>  8) & 0xff] ^ Td3[t0 & 0xff] ^ rk[11];
+    s0 = AES_Td0[t0 >> 24] ^ AES_Td1[(t3 >> 16) & 0xff] ^ AES_Td2[(t2 >>  8) & 0xff] ^ AES_Td3[t1 & 0xff] ^ rk[ 8];
+    s1 = AES_Td0[t1 >> 24] ^ AES_Td1[(t0 >> 16) & 0xff] ^ AES_Td2[(t3 >>  8) & 0xff] ^ AES_Td3[t2 & 0xff] ^ rk[ 9];
+    s2 = AES_Td0[t2 >> 24] ^ AES_Td1[(t1 >> 16) & 0xff] ^ AES_Td2[(t0 >>  8) & 0xff] ^ AES_Td3[t3 & 0xff] ^ rk[10];
+    s3 = AES_Td0[t3 >> 24] ^ AES_Td1[(t2 >> 16) & 0xff] ^ AES_Td2[(t1 >>  8) & 0xff] ^ AES_Td3[t0 & 0xff] ^ rk[11];
     /* round 3: */
-    t0 = Td0[s0 >> 24] ^ Td1[(s3 >> 16) & 0xff] ^ Td2[(s2 >>  8) & 0xff] ^ Td3[s1 & 0xff] ^ rk[12];
-    t1 = Td0[s1 >> 24] ^ Td1[(s0 >> 16) & 0xff] ^ Td2[(s3 >>  8) & 0xff] ^ Td3[s2 & 0xff] ^ rk[13];
-    t2 = Td0[s2 >> 24] ^ Td1[(s1 >> 16) & 0xff] ^ Td2[(s0 >>  8) & 0xff] ^ Td3[s3 & 0xff] ^ rk[14];
-    t3 = Td0[s3 >> 24] ^ Td1[(s2 >> 16) & 0xff] ^ Td2[(s1 >>  8) & 0xff] ^ Td3[s0 & 0xff] ^ rk[15];
+    t0 = AES_Td0[s0 >> 24] ^ AES_Td1[(s3 >> 16) & 0xff] ^ AES_Td2[(s2 >>  8) & 0xff] ^ AES_Td3[s1 & 0xff] ^ rk[12];
+    t1 = AES_Td0[s1 >> 24] ^ AES_Td1[(s0 >> 16) & 0xff] ^ AES_Td2[(s3 >>  8) & 0xff] ^ AES_Td3[s2 & 0xff] ^ rk[13];
+    t2 = AES_Td0[s2 >> 24] ^ AES_Td1[(s1 >> 16) & 0xff] ^ AES_Td2[(s0 >>  8) & 0xff] ^ AES_Td3[s3 & 0xff] ^ rk[14];
+    t3 = AES_Td0[s3 >> 24] ^ AES_Td1[(s2 >> 16) & 0xff] ^ AES_Td2[(s1 >>  8) & 0xff] ^ AES_Td3[s0 & 0xff] ^ rk[15];
     /* round 4: */
-    s0 = Td0[t0 >> 24] ^ Td1[(t3 >> 16) & 0xff] ^ Td2[(t2 >>  8) & 0xff] ^ Td3[t1 & 0xff] ^ rk[16];
-    s1 = Td0[t1 >> 24] ^ Td1[(t0 >> 16) & 0xff] ^ Td2[(t3 >>  8) & 0xff] ^ Td3[t2 & 0xff] ^ rk[17];
-    s2 = Td0[t2 >> 24] ^ Td1[(t1 >> 16) & 0xff] ^ Td2[(t0 >>  8) & 0xff] ^ Td3[t3 & 0xff] ^ rk[18];
-    s3 = Td0[t3 >> 24] ^ Td1[(t2 >> 16) & 0xff] ^ Td2[(t1 >>  8) & 0xff] ^ Td3[t0 & 0xff] ^ rk[19];
+    s0 = AES_Td0[t0 >> 24] ^ AES_Td1[(t3 >> 16) & 0xff] ^ AES_Td2[(t2 >>  8) & 0xff] ^ AES_Td3[t1 & 0xff] ^ rk[16];
+    s1 = AES_Td0[t1 >> 24] ^ AES_Td1[(t0 >> 16) & 0xff] ^ AES_Td2[(t3 >>  8) & 0xff] ^ AES_Td3[t2 & 0xff] ^ rk[17];
+    s2 = AES_Td0[t2 >> 24] ^ AES_Td1[(t1 >> 16) & 0xff] ^ AES_Td2[(t0 >>  8) & 0xff] ^ AES_Td3[t3 & 0xff] ^ rk[18];
+    s3 = AES_Td0[t3 >> 24] ^ AES_Td1[(t2 >> 16) & 0xff] ^ AES_Td2[(t1 >>  8) & 0xff] ^ AES_Td3[t0 & 0xff] ^ rk[19];
     /* round 5: */
-    t0 = Td0[s0 >> 24] ^ Td1[(s3 >> 16) & 0xff] ^ Td2[(s2 >>  8) & 0xff] ^ Td3[s1 & 0xff] ^ rk[20];
-    t1 = Td0[s1 >> 24] ^ Td1[(s0 >> 16) & 0xff] ^ Td2[(s3 >>  8) & 0xff] ^ Td3[s2 & 0xff] ^ rk[21];
-    t2 = Td0[s2 >> 24] ^ Td1[(s1 >> 16) & 0xff] ^ Td2[(s0 >>  8) & 0xff] ^ Td3[s3 & 0xff] ^ rk[22];
-    t3 = Td0[s3 >> 24] ^ Td1[(s2 >> 16) & 0xff] ^ Td2[(s1 >>  8) & 0xff] ^ Td3[s0 & 0xff] ^ rk[23];
+    t0 = AES_Td0[s0 >> 24] ^ AES_Td1[(s3 >> 16) & 0xff] ^ AES_Td2[(s2 >>  8) & 0xff] ^ AES_Td3[s1 & 0xff] ^ rk[20];
+    t1 = AES_Td0[s1 >> 24] ^ AES_Td1[(s0 >> 16) & 0xff] ^ AES_Td2[(s3 >>  8) & 0xff] ^ AES_Td3[s2 & 0xff] ^ rk[21];
+    t2 = AES_Td0[s2 >> 24] ^ AES_Td1[(s1 >> 16) & 0xff] ^ AES_Td2[(s0 >>  8) & 0xff] ^ AES_Td3[s3 & 0xff] ^ rk[22];
+    t3 = AES_Td0[s3 >> 24] ^ AES_Td1[(s2 >> 16) & 0xff] ^ AES_Td2[(s1 >>  8) & 0xff] ^ AES_Td3[s0 & 0xff] ^ rk[23];
     /* round 6: */
-    s0 = Td0[t0 >> 24] ^ Td1[(t3 >> 16) & 0xff] ^ Td2[(t2 >>  8) & 0xff] ^ Td3[t1 & 0xff] ^ rk[24];
-    s1 = Td0[t1 >> 24] ^ Td1[(t0 >> 16) & 0xff] ^ Td2[(t3 >>  8) & 0xff] ^ Td3[t2 & 0xff] ^ rk[25];
-    s2 = Td0[t2 >> 24] ^ Td1[(t1 >> 16) & 0xff] ^ Td2[(t0 >>  8) & 0xff] ^ Td3[t3 & 0xff] ^ rk[26];
-    s3 = Td0[t3 >> 24] ^ Td1[(t2 >> 16) & 0xff] ^ Td2[(t1 >>  8) & 0xff] ^ Td3[t0 & 0xff] ^ rk[27];
+    s0 = AES_Td0[t0 >> 24] ^ AES_Td1[(t3 >> 16) & 0xff] ^ AES_Td2[(t2 >>  8) & 0xff] ^ AES_Td3[t1 & 0xff] ^ rk[24];
+    s1 = AES_Td0[t1 >> 24] ^ AES_Td1[(t0 >> 16) & 0xff] ^ AES_Td2[(t3 >>  8) & 0xff] ^ AES_Td3[t2 & 0xff] ^ rk[25];
+    s2 = AES_Td0[t2 >> 24] ^ AES_Td1[(t1 >> 16) & 0xff] ^ AES_Td2[(t0 >>  8) & 0xff] ^ AES_Td3[t3 & 0xff] ^ rk[26];
+    s3 = AES_Td0[t3 >> 24] ^ AES_Td1[(t2 >> 16) & 0xff] ^ AES_Td2[(t1 >>  8) & 0xff] ^ AES_Td3[t0 & 0xff] ^ rk[27];
     /* round 7: */
-    t0 = Td0[s0 >> 24] ^ Td1[(s3 >> 16) & 0xff] ^ Td2[(s2 >>  8) & 0xff] ^ Td3[s1 & 0xff] ^ rk[28];
-    t1 = Td0[s1 >> 24] ^ Td1[(s0 >> 16) & 0xff] ^ Td2[(s3 >>  8) & 0xff] ^ Td3[s2 & 0xff] ^ rk[29];
-    t2 = Td0[s2 >> 24] ^ Td1[(s1 >> 16) & 0xff] ^ Td2[(s0 >>  8) & 0xff] ^ Td3[s3 & 0xff] ^ rk[30];
-    t3 = Td0[s3 >> 24] ^ Td1[(s2 >> 16) & 0xff] ^ Td2[(s1 >>  8) & 0xff] ^ Td3[s0 & 0xff] ^ rk[31];
+    t0 = AES_Td0[s0 >> 24] ^ AES_Td1[(s3 >> 16) & 0xff] ^ AES_Td2[(s2 >>  8) & 0xff] ^ AES_Td3[s1 & 0xff] ^ rk[28];
+    t1 = AES_Td0[s1 >> 24] ^ AES_Td1[(s0 >> 16) & 0xff] ^ AES_Td2[(s3 >>  8) & 0xff] ^ AES_Td3[s2 & 0xff] ^ rk[29];
+    t2 = AES_Td0[s2 >> 24] ^ AES_Td1[(s1 >> 16) & 0xff] ^ AES_Td2[(s0 >>  8) & 0xff] ^ AES_Td3[s3 & 0xff] ^ rk[30];
+    t3 = AES_Td0[s3 >> 24] ^ AES_Td1[(s2 >> 16) & 0xff] ^ AES_Td2[(s1 >>  8) & 0xff] ^ AES_Td3[s0 & 0xff] ^ rk[31];
     /* round 8: */
-    s0 = Td0[t0 >> 24] ^ Td1[(t3 >> 16) & 0xff] ^ Td2[(t2 >>  8) & 0xff] ^ Td3[t1 & 0xff] ^ rk[32];
-    s1 = Td0[t1 >> 24] ^ Td1[(t0 >> 16) & 0xff] ^ Td2[(t3 >>  8) & 0xff] ^ Td3[t2 & 0xff] ^ rk[33];
-    s2 = Td0[t2 >> 24] ^ Td1[(t1 >> 16) & 0xff] ^ Td2[(t0 >>  8) & 0xff] ^ Td3[t3 & 0xff] ^ rk[34];
-    s3 = Td0[t3 >> 24] ^ Td1[(t2 >> 16) & 0xff] ^ Td2[(t1 >>  8) & 0xff] ^ Td3[t0 & 0xff] ^ rk[35];
+    s0 = AES_Td0[t0 >> 24] ^ AES_Td1[(t3 >> 16) & 0xff] ^ AES_Td2[(t2 >>  8) & 0xff] ^ AES_Td3[t1 & 0xff] ^ rk[32];
+    s1 = AES_Td0[t1 >> 24] ^ AES_Td1[(t0 >> 16) & 0xff] ^ AES_Td2[(t3 >>  8) & 0xff] ^ AES_Td3[t2 & 0xff] ^ rk[33];
+    s2 = AES_Td0[t2 >> 24] ^ AES_Td1[(t1 >> 16) & 0xff] ^ AES_Td2[(t0 >>  8) & 0xff] ^ AES_Td3[t3 & 0xff] ^ rk[34];
+    s3 = AES_Td0[t3 >> 24] ^ AES_Td1[(t2 >> 16) & 0xff] ^ AES_Td2[(t1 >>  8) & 0xff] ^ AES_Td3[t0 & 0xff] ^ rk[35];
     /* round 9: */
-    t0 = Td0[s0 >> 24] ^ Td1[(s3 >> 16) & 0xff] ^ Td2[(s2 >>  8) & 0xff] ^ Td3[s1 & 0xff] ^ rk[36];
-    t1 = Td0[s1 >> 24] ^ Td1[(s0 >> 16) & 0xff] ^ Td2[(s3 >>  8) & 0xff] ^ Td3[s2 & 0xff] ^ rk[37];
-    t2 = Td0[s2 >> 24] ^ Td1[(s1 >> 16) & 0xff] ^ Td2[(s0 >>  8) & 0xff] ^ Td3[s3 & 0xff] ^ rk[38];
-    t3 = Td0[s3 >> 24] ^ Td1[(s2 >> 16) & 0xff] ^ Td2[(s1 >>  8) & 0xff] ^ Td3[s0 & 0xff] ^ rk[39];
+    t0 = AES_Td0[s0 >> 24] ^ AES_Td1[(s3 >> 16) & 0xff] ^ AES_Td2[(s2 >>  8) & 0xff] ^ AES_Td3[s1 & 0xff] ^ rk[36];
+    t1 = AES_Td0[s1 >> 24] ^ AES_Td1[(s0 >> 16) & 0xff] ^ AES_Td2[(s3 >>  8) & 0xff] ^ AES_Td3[s2 & 0xff] ^ rk[37];
+    t2 = AES_Td0[s2 >> 24] ^ AES_Td1[(s1 >> 16) & 0xff] ^ AES_Td2[(s0 >>  8) & 0xff] ^ AES_Td3[s3 & 0xff] ^ rk[38];
+    t3 = AES_Td0[s3 >> 24] ^ AES_Td1[(s2 >> 16) & 0xff] ^ AES_Td2[(s1 >>  8) & 0xff] ^ AES_Td3[s0 & 0xff] ^ rk[39];
     if (key->rounds > 10) {
         /* round 10: */
-        s0 = Td0[t0 >> 24] ^ Td1[(t3 >> 16) & 0xff] ^ Td2[(t2 >>  8) & 0xff] ^ Td3[t1 & 0xff] ^ rk[40];
-        s1 = Td0[t1 >> 24] ^ Td1[(t0 >> 16) & 0xff] ^ Td2[(t3 >>  8) & 0xff] ^ Td3[t2 & 0xff] ^ rk[41];
-        s2 = Td0[t2 >> 24] ^ Td1[(t1 >> 16) & 0xff] ^ Td2[(t0 >>  8) & 0xff] ^ Td3[t3 & 0xff] ^ rk[42];
-        s3 = Td0[t3 >> 24] ^ Td1[(t2 >> 16) & 0xff] ^ Td2[(t1 >>  8) & 0xff] ^ Td3[t0 & 0xff] ^ rk[43];
+        s0 = AES_Td0[t0 >> 24] ^ AES_Td1[(t3 >> 16) & 0xff] ^ AES_Td2[(t2 >>  8) & 0xff] ^ AES_Td3[t1 & 0xff] ^ rk[40];
+        s1 = AES_Td0[t1 >> 24] ^ AES_Td1[(t0 >> 16) & 0xff] ^ AES_Td2[(t3 >>  8) & 0xff] ^ AES_Td3[t2 & 0xff] ^ rk[41];
+        s2 = AES_Td0[t2 >> 24] ^ AES_Td1[(t1 >> 16) & 0xff] ^ AES_Td2[(t0 >>  8) & 0xff] ^ AES_Td3[t3 & 0xff] ^ rk[42];
+        s3 = AES_Td0[t3 >> 24] ^ AES_Td1[(t2 >> 16) & 0xff] ^ AES_Td2[(t1 >>  8) & 0xff] ^ AES_Td3[t0 & 0xff] ^ rk[43];
         /* round 11: */
-        t0 = Td0[s0 >> 24] ^ Td1[(s3 >> 16) & 0xff] ^ Td2[(s2 >>  8) & 0xff] ^ Td3[s1 & 0xff] ^ rk[44];
-        t1 = Td0[s1 >> 24] ^ Td1[(s0 >> 16) & 0xff] ^ Td2[(s3 >>  8) & 0xff] ^ Td3[s2 & 0xff] ^ rk[45];
-        t2 = Td0[s2 >> 24] ^ Td1[(s1 >> 16) & 0xff] ^ Td2[(s0 >>  8) & 0xff] ^ Td3[s3 & 0xff] ^ rk[46];
-        t3 = Td0[s3 >> 24] ^ Td1[(s2 >> 16) & 0xff] ^ Td2[(s1 >>  8) & 0xff] ^ Td3[s0 & 0xff] ^ rk[47];
+        t0 = AES_Td0[s0 >> 24] ^ AES_Td1[(s3 >> 16) & 0xff] ^ AES_Td2[(s2 >>  8) & 0xff] ^ AES_Td3[s1 & 0xff] ^ rk[44];
+        t1 = AES_Td0[s1 >> 24] ^ AES_Td1[(s0 >> 16) & 0xff] ^ AES_Td2[(s3 >>  8) & 0xff] ^ AES_Td3[s2 & 0xff] ^ rk[45];
+        t2 = AES_Td0[s2 >> 24] ^ AES_Td1[(s1 >> 16) & 0xff] ^ AES_Td2[(s0 >>  8) & 0xff] ^ AES_Td3[s3 & 0xff] ^ rk[46];
+        t3 = AES_Td0[s3 >> 24] ^ AES_Td1[(s2 >> 16) & 0xff] ^ AES_Td2[(s1 >>  8) & 0xff] ^ AES_Td3[s0 & 0xff] ^ rk[47];
         if (key->rounds > 12) {
             /* round 12: */
-            s0 = Td0[t0 >> 24] ^ Td1[(t3 >> 16) & 0xff] ^ Td2[(t2 >>  8) & 0xff] ^ Td3[t1 & 0xff] ^ rk[48];
-            s1 = Td0[t1 >> 24] ^ Td1[(t0 >> 16) & 0xff] ^ Td2[(t3 >>  8) & 0xff] ^ Td3[t2 & 0xff] ^ rk[49];
-            s2 = Td0[t2 >> 24] ^ Td1[(t1 >> 16) & 0xff] ^ Td2[(t0 >>  8) & 0xff] ^ Td3[t3 & 0xff] ^ rk[50];
-            s3 = Td0[t3 >> 24] ^ Td1[(t2 >> 16) & 0xff] ^ Td2[(t1 >>  8) & 0xff] ^ Td3[t0 & 0xff] ^ rk[51];
+            s0 = AES_Td0[t0 >> 24] ^ AES_Td1[(t3 >> 16) & 0xff] ^ AES_Td2[(t2 >>  8) & 0xff] ^ AES_Td3[t1 & 0xff] ^ rk[48];
+            s1 = AES_Td0[t1 >> 24] ^ AES_Td1[(t0 >> 16) & 0xff] ^ AES_Td2[(t3 >>  8) & 0xff] ^ AES_Td3[t2 & 0xff] ^ rk[49];
+            s2 = AES_Td0[t2 >> 24] ^ AES_Td1[(t1 >> 16) & 0xff] ^ AES_Td2[(t0 >>  8) & 0xff] ^ AES_Td3[t3 & 0xff] ^ rk[50];
+            s3 = AES_Td0[t3 >> 24] ^ AES_Td1[(t2 >> 16) & 0xff] ^ AES_Td2[(t1 >>  8) & 0xff] ^ AES_Td3[t0 & 0xff] ^ rk[51];
             /* round 13: */
-            t0 = Td0[s0 >> 24] ^ Td1[(s3 >> 16) & 0xff] ^ Td2[(s2 >>  8) & 0xff] ^ Td3[s1 & 0xff] ^ rk[52];
-            t1 = Td0[s1 >> 24] ^ Td1[(s0 >> 16) & 0xff] ^ Td2[(s3 >>  8) & 0xff] ^ Td3[s2 & 0xff] ^ rk[53];
-            t2 = Td0[s2 >> 24] ^ Td1[(s1 >> 16) & 0xff] ^ Td2[(s0 >>  8) & 0xff] ^ Td3[s3 & 0xff] ^ rk[54];
-            t3 = Td0[s3 >> 24] ^ Td1[(s2 >> 16) & 0xff] ^ Td2[(s1 >>  8) & 0xff] ^ Td3[s0 & 0xff] ^ rk[55];
+            t0 = AES_Td0[s0 >> 24] ^ AES_Td1[(s3 >> 16) & 0xff] ^ AES_Td2[(s2 >>  8) & 0xff] ^ AES_Td3[s1 & 0xff] ^ rk[52];
+            t1 = AES_Td0[s1 >> 24] ^ AES_Td1[(s0 >> 16) & 0xff] ^ AES_Td2[(s3 >>  8) & 0xff] ^ AES_Td3[s2 & 0xff] ^ rk[53];
+            t2 = AES_Td0[s2 >> 24] ^ AES_Td1[(s1 >> 16) & 0xff] ^ AES_Td2[(s0 >>  8) & 0xff] ^ AES_Td3[s3 & 0xff] ^ rk[54];
+            t3 = AES_Td0[s3 >> 24] ^ AES_Td1[(s2 >> 16) & 0xff] ^ AES_Td2[(s1 >>  8) & 0xff] ^ AES_Td3[s0 & 0xff] ^ rk[55];
         }
     }
 	rk += key->rounds << 2;
@@ -1171,28 +1171,28 @@ void AES_decrypt(const unsigned char *in, unsigned char *out,
     r = key->rounds >> 1;
     for (;;) {
         t0 =
-            Td0[(s0 >> 24)       ] ^
-            Td1[(s3 >> 16) & 0xff] ^
-            Td2[(s2 >>  8) & 0xff] ^
-            Td3[(s1      ) & 0xff] ^
+            AES_Td0[(s0 >> 24)       ] ^
+            AES_Td1[(s3 >> 16) & 0xff] ^
+            AES_Td2[(s2 >>  8) & 0xff] ^
+            AES_Td3[(s1      ) & 0xff] ^
             rk[4];
         t1 =
-            Td0[(s1 >> 24)       ] ^
-            Td1[(s0 >> 16) & 0xff] ^
-            Td2[(s3 >>  8) & 0xff] ^
-            Td3[(s2      ) & 0xff] ^
+            AES_Td0[(s1 >> 24)       ] ^
+            AES_Td1[(s0 >> 16) & 0xff] ^
+            AES_Td2[(s3 >>  8) & 0xff] ^
+            AES_Td3[(s2      ) & 0xff] ^
             rk[5];
         t2 =
-            Td0[(s2 >> 24)       ] ^
-            Td1[(s1 >> 16) & 0xff] ^
-            Td2[(s0 >>  8) & 0xff] ^
-            Td3[(s3      ) & 0xff] ^
+            AES_Td0[(s2 >> 24)       ] ^
+            AES_Td1[(s1 >> 16) & 0xff] ^
+            AES_Td2[(s0 >>  8) & 0xff] ^
+            AES_Td3[(s3      ) & 0xff] ^
             rk[6];
         t3 =
-            Td0[(s3 >> 24)       ] ^
-            Td1[(s2 >> 16) & 0xff] ^
-            Td2[(s1 >>  8) & 0xff] ^
-            Td3[(s0      ) & 0xff] ^
+            AES_Td0[(s3 >> 24)       ] ^
+            AES_Td1[(s2 >> 16) & 0xff] ^
+            AES_Td2[(s1 >>  8) & 0xff] ^
+            AES_Td3[(s0      ) & 0xff] ^
             rk[7];
 
         rk += 8;
@@ -1201,28 +1201,28 @@ void AES_decrypt(const unsigned char *in, unsigned char *out,
         }
 
         s0 =
-            Td0[(t0 >> 24)       ] ^
-            Td1[(t3 >> 16) & 0xff] ^
-            Td2[(t2 >>  8) & 0xff] ^
-            Td3[(t1      ) & 0xff] ^
+            AES_Td0[(t0 >> 24)       ] ^
+            AES_Td1[(t3 >> 16) & 0xff] ^
+            AES_Td2[(t2 >>  8) & 0xff] ^
+            AES_Td3[(t1      ) & 0xff] ^
             rk[0];
         s1 =
-            Td0[(t1 >> 24)       ] ^
-            Td1[(t0 >> 16) & 0xff] ^
-            Td2[(t3 >>  8) & 0xff] ^
-            Td3[(t2      ) & 0xff] ^
+            AES_Td0[(t1 >> 24)       ] ^
+            AES_Td1[(t0 >> 16) & 0xff] ^
+            AES_Td2[(t3 >>  8) & 0xff] ^
+            AES_Td3[(t2      ) & 0xff] ^
             rk[1];
         s2 =
-            Td0[(t2 >> 24)       ] ^
-            Td1[(t1 >> 16) & 0xff] ^
-            Td2[(t0 >>  8) & 0xff] ^
-            Td3[(t3      ) & 0xff] ^
+            AES_Td0[(t2 >> 24)       ] ^
+            AES_Td1[(t1 >> 16) & 0xff] ^
+            AES_Td2[(t0 >>  8) & 0xff] ^
+            AES_Td3[(t3      ) & 0xff] ^
             rk[2];
         s3 =
-            Td0[(t3 >> 24)       ] ^
-            Td1[(t2 >> 16) & 0xff] ^
-            Td2[(t1 >>  8) & 0xff] ^
-            Td3[(t0      ) & 0xff] ^
+            AES_Td0[(t3 >> 24)       ] ^
+            AES_Td1[(t2 >> 16) & 0xff] ^
+            AES_Td2[(t1 >>  8) & 0xff] ^
+            AES_Td3[(t0      ) & 0xff] ^
             rk[3];
     }
 #endif /* ?FULL_UNROLL */
@@ -1231,31 +1231,31 @@ void AES_decrypt(const unsigned char *in, unsigned char *out,
 	 * map cipher state to byte array block:
 	 */
    	s0 =
-   		(Td4[(t0 >> 24)       ] & 0xff000000) ^
-   		(Td4[(t3 >> 16) & 0xff] & 0x00ff0000) ^
-   		(Td4[(t2 >>  8) & 0xff] & 0x0000ff00) ^
-   		(Td4[(t1      ) & 0xff] & 0x000000ff) ^
+                (AES_Td4[(t0 >> 24)       ] & 0xff000000) ^
+                (AES_Td4[(t3 >> 16) & 0xff] & 0x00ff0000) ^
+                (AES_Td4[(t2 >>  8) & 0xff] & 0x0000ff00) ^
+                (AES_Td4[(t1      ) & 0xff] & 0x000000ff) ^
    		rk[0];
 	PUTU32(out     , s0);
    	s1 =
-   		(Td4[(t1 >> 24)       ] & 0xff000000) ^
-   		(Td4[(t0 >> 16) & 0xff] & 0x00ff0000) ^
-   		(Td4[(t3 >>  8) & 0xff] & 0x0000ff00) ^
-   		(Td4[(t2      ) & 0xff] & 0x000000ff) ^
+                (AES_Td4[(t1 >> 24)       ] & 0xff000000) ^
+                (AES_Td4[(t0 >> 16) & 0xff] & 0x00ff0000) ^
+                (AES_Td4[(t3 >>  8) & 0xff] & 0x0000ff00) ^
+                (AES_Td4[(t2      ) & 0xff] & 0x000000ff) ^
    		rk[1];
 	PUTU32(out +  4, s1);
    	s2 =
-   		(Td4[(t2 >> 24)       ] & 0xff000000) ^
-   		(Td4[(t1 >> 16) & 0xff] & 0x00ff0000) ^
-   		(Td4[(t0 >>  8) & 0xff] & 0x0000ff00) ^
-   		(Td4[(t3      ) & 0xff] & 0x000000ff) ^
+                (AES_Td4[(t2 >> 24)       ] & 0xff000000) ^
+                (AES_Td4[(t1 >> 16) & 0xff] & 0x00ff0000) ^
+                (AES_Td4[(t0 >>  8) & 0xff] & 0x0000ff00) ^
+                (AES_Td4[(t3      ) & 0xff] & 0x000000ff) ^
    		rk[2];
 	PUTU32(out +  8, s2);
    	s3 =
-   		(Td4[(t3 >> 24)       ] & 0xff000000) ^
-   		(Td4[(t2 >> 16) & 0xff] & 0x00ff0000) ^
-   		(Td4[(t1 >>  8) & 0xff] & 0x0000ff00) ^
-   		(Td4[(t0      ) & 0xff] & 0x000000ff) ^
+                (AES_Td4[(t3 >> 24)       ] & 0xff000000) ^
+                (AES_Td4[(t2 >> 16) & 0xff] & 0x00ff0000) ^
+                (AES_Td4[(t1 >>  8) & 0xff] & 0x0000ff00) ^
+                (AES_Td4[(t0      ) & 0xff] & 0x000000ff) ^
    		rk[3];
 	PUTU32(out + 12, s3);
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 7/7] target-i386: add AES-NI instructions
  2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
                   ` (5 preceding siblings ...)
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 6/7] aes: make Td[0-5] and Te[0-5] tables non static Aurelien Jarno
@ 2013-03-31 11:02 ` Aurelien Jarno
  2013-03-31 23:26 ` [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and " Richard Henderson
  7 siblings, 0 replies; 11+ messages in thread
From: Aurelien Jarno @ 2013-03-31 11:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Aurelien Jarno

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 disas/i386.c                 |    4 +-
 target-i386/cpu.c            |    6 +--
 target-i386/fpu_helper.c     |    1 +
 target-i386/ops_sse.h        |   87 ++++++++++++++++++++++++++++++++++++++++++
 target-i386/ops_sse_header.h |    6 +++
 target-i386/translate.c      |    7 ++++
 6 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/disas/i386.c b/disas/i386.c
index 04c033c..47f1f2e 100644
--- a/disas/i386.c
+++ b/disas/i386.c
@@ -1447,7 +1447,7 @@ static const unsigned char threebyte_0x38_uses_DATA_prefix[256] = {
   /* a0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* af */
   /* b0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* bf */
   /* c0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* cf */
-  /* d0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* df */
+  /* d0 */ 0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1, /* df */
   /* e0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* ef */
   /* f0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* ff */
   /*       -------------------------------        */
@@ -1519,7 +1519,7 @@ static const unsigned char threebyte_0x3a_uses_DATA_prefix[256] = {
   /* a0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* af */
   /* b0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* bf */
   /* c0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* cf */
-  /* d0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* df */
+  /* d0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1, /* df */
   /* e0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* ef */
   /* f0 */ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* ff */
   /*       -------------------------------        */
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 5941d40..321d945 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -390,13 +390,13 @@ typedef struct x86_def_t {
 #define TCG_EXT_FEATURES (CPUID_EXT_SSE3 | CPUID_EXT_PCLMULQDQ | \
           CPUID_EXT_MONITOR | CPUID_EXT_SSSE3 | CPUID_EXT_CX16 | \
           CPUID_EXT_SSE41 | CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | \
-          CPUID_EXT_MOVBE | CPUID_EXT_HYPERVISOR)
+          CPUID_EXT_MOVBE | CPUID_EXT_AES | CPUID_EXT_HYPERVISOR)
           /* missing:
           CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_VMX, CPUID_EXT_SMX,
           CPUID_EXT_EST, CPUID_EXT_TM2, CPUID_EXT_CID, CPUID_EXT_FMA,
           CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_PCID, CPUID_EXT_DCA,
-          CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AES,
-          CPUID_EXT_XSAVE, CPUID_EXT_OSXSAVE, CPUID_EXT_AVX, CPUID_EXT_F16C,
+          CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_XSAVE,
+          CPUID_EXT_OSXSAVE, CPUID_EXT_AVX, CPUID_EXT_F16C,
           CPUID_EXT_RDRAND */
 #define TCG_EXT2_FEATURES ((TCG_FEATURES & CPUID_EXT2_AMD_ALIASES) | \
           CPUID_EXT2_NX | CPUID_EXT2_MMXEXT | CPUID_EXT2_RDTSCP | \
diff --git a/target-i386/fpu_helper.c b/target-i386/fpu_helper.c
index 29a8fb6..c0427fe 100644
--- a/target-i386/fpu_helper.c
+++ b/target-i386/fpu_helper.c
@@ -20,6 +20,7 @@
 #include <math.h>
 #include "cpu.h"
 #include "helper.h"
+#include "qemu/aes.h"
 #include "qemu/host-utils.h"
 
 #if !defined(CONFIG_USER_ONLY)
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 2ee5b8d..eb24b5f 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -2203,6 +2203,93 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     d->Q(0) = resl;
     d->Q(1) = resh;
 }
+
+/* AES-NI op helpers */
+static const uint8_t aes_shifts[16] = {
+    0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, 1, 6, 11
+};
+
+static const uint8_t aes_ishifts[16] = {
+    0, 13, 10, 7, 4, 1, 14, 11, 8, 5, 2, 15, 12, 9, 6, 3
+};
+
+void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    int i;
+    Reg st = *d;
+    Reg rk = *s;
+
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(aes_ishifts[4*i+0])] ^
+                                    AES_Td1[st.B(aes_ishifts[4*i+1])] ^
+                                    AES_Td2[st.B(aes_ishifts[4*i+2])] ^
+                                    AES_Td3[st.B(aes_ishifts[4*i+3])]);
+    }
+}
+
+void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    int i;
+    Reg st = *d;
+    Reg rk = *s;
+
+    for (i = 0; i < 16; i++) {
+        d->B(i) = rk.B(i) ^ (AES_Td4[st.B(aes_ishifts[i])] & 0xff);
+    }
+}
+
+void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    int i;
+    Reg st = *d;
+    Reg rk = *s;
+
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(aes_shifts[4*i+0])] ^
+                                    AES_Te1[st.B(aes_shifts[4*i+1])] ^
+                                    AES_Te2[st.B(aes_shifts[4*i+2])] ^
+                                    AES_Te3[st.B(aes_shifts[4*i+3])]);
+    }
+}
+
+void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    int i;
+    Reg st = *d;
+    Reg rk = *s;
+
+    for (i = 0; i < 16; i++) {
+        d->B(i) = rk.B(i) ^ (AES_Te4[st.B(aes_shifts[i])] & 0xff);
+    }
+
+}
+
+void glue(helper_aesimc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    int i;
+    Reg tmp = *s;
+
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i) = bswap32(AES_Td0[AES_Te4[tmp.B(4*i+0)] & 0xff] ^
+                          AES_Td1[AES_Te4[tmp.B(4*i+1)] & 0xff] ^
+                          AES_Td2[AES_Te4[tmp.B(4*i+2)] & 0xff] ^
+                          AES_Td3[AES_Te4[tmp.B(4*i+3)] & 0xff]);
+    }
+}
+
+void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+                                          uint32_t ctrl)
+{
+    int i;
+    Reg tmp = *s;
+
+    for (i = 0 ; i < 4 ; i++) {
+        d->B(i) = AES_Te4[tmp.B(i + 4)] & 0xff;
+        d->B(i + 8) = AES_Te4[tmp.B(i + 12)] & 0xff;
+    }
+    d->L(1) = (d->L(0) << 24 | d->L(0) >> 8) ^ ctrl;
+    d->L(3) = (d->L(2) << 24 | d->L(2) >> 8) ^ ctrl;
+}
 #endif
 
 #undef SHIFT
diff --git a/target-i386/ops_sse_header.h b/target-i386/ops_sse_header.h
index 2842233..a68c7cc 100644
--- a/target-i386/ops_sse_header.h
+++ b/target-i386/ops_sse_header.h
@@ -338,6 +338,12 @@ DEF_HELPER_3(popcnt, tl, env, tl, i32)
 
 /* AES-NI op helpers */
 #if SHIFT == 1
+DEF_HELPER_3(glue(aesdec, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(aesdeclast, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(aesenc, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(aesenclast, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(aesimc, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32)
 DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32)
 #endif
 
diff --git a/target-i386/translate.c b/target-i386/translate.c
index d649e99..233f24f 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -3149,6 +3149,7 @@ struct SSEOpHelper_eppi {
 #define SSE41_SPECIAL { { NULL, SSE_SPECIAL }, CPUID_EXT_SSE41 }
 #define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \
         CPUID_EXT_PCLMULQDQ }
+#define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES }
 
 static const struct SSEOpHelper_epp sse_op_table6[256] = {
     [0x00] = SSSE3_OP(pshufb),
@@ -3197,6 +3198,11 @@ static const struct SSEOpHelper_epp sse_op_table6[256] = {
     [0x3f] = SSE41_OP(pmaxud),
     [0x40] = SSE41_OP(pmulld),
     [0x41] = SSE41_OP(phminposuw),
+    [0xdb] = AESNI_OP(aesimc),
+    [0xdc] = AESNI_OP(aesenc),
+    [0xdd] = AESNI_OP(aesenclast),
+    [0xde] = AESNI_OP(aesdec),
+    [0xdf] = AESNI_OP(aesdeclast),
 };
 
 static const struct SSEOpHelper_eppi sse_op_table7[256] = {
@@ -3223,6 +3229,7 @@ static const struct SSEOpHelper_eppi sse_op_table7[256] = {
     [0x61] = SSE42_OP(pcmpestri),
     [0x62] = SSE42_OP(pcmpistrm),
     [0x63] = SSE42_OP(pcmpistri),
+    [0xdf] = AESNI_OP(aeskeygenassist),
 };
 
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions
  2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
                   ` (6 preceding siblings ...)
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 7/7] target-i386: add AES-NI instructions Aurelien Jarno
@ 2013-03-31 23:26 ` Richard Henderson
  7 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2013-03-31 23:26 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel

On 2013-03-31 04:02, Aurelien Jarno wrote:
> Changes v3 -> v4:
> - Update dissassembler code to support these instructions.

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/7] aes: move aes.h from include/block to include/qemu
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 5/7] aes: move aes.h from include/block to include/qemu Aurelien Jarno
@ 2013-04-05 12:34   ` Stefan Hajnoczi
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Hajnoczi @ 2013-04-05 12:34 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Kevin Wolf, qemu-devel

On Sun, Mar 31, 2013 at 01:02:24PM +0200, Aurelien Jarno wrote:
> Move aes.h from include/block to include/qemu to show it can be reused
> by other subsystems.
> 
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
> Reviewed-by: Richard Henderson <rth@twiddle.net>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  block/qcow.c        |    2 +-
>  block/qcow2.c       |    2 +-
>  block/qcow2.h       |    2 +-
>  include/block/aes.h |   26 --------------------------
>  include/qemu/aes.h  |   26 ++++++++++++++++++++++++++
>  util/aes.c          |    2 +-
>  6 files changed, 30 insertions(+), 30 deletions(-)
>  delete mode 100644 include/block/aes.h
>  create mode 100644 include/qemu/aes.h

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v4 3/7] target-i386: enable PCLMULQDQ on Westmere CPU
  2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 3/7] target-i386: enable PCLMULQDQ on Westmere CPU Aurelien Jarno
@ 2013-08-07 22:12   ` Eduardo Habkost
  0 siblings, 0 replies; 11+ messages in thread
From: Eduardo Habkost @ 2013-08-07 22:12 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, Andreas Färber

On Sun, Mar 31, 2013 at 01:02:22PM +0200, Aurelien Jarno wrote:
> The PCLMULQDQ instruction has been introduced on the Westmere CPU.
> 
> Reviewed-by: Richard Henderson <rth@twiddle.net>
> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-i386/cpu.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 41382c5..5941d40 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -687,7 +687,7 @@ static x86_def_t builtin_x86_defs[] = {
>               CPUID_DE | CPUID_FP87,
>          .ext_features = CPUID_EXT_AES | CPUID_EXT_POPCNT | CPUID_EXT_SSE42 |
>               CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
> -             CPUID_EXT_SSE3,
> +             CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3,

I have just noticed that this patch entered v1.5.0 without any code to
keep pc-1.4 and older machine-types compatible. I will try to send a
patch in time to get this fixed on QEMU v1.6.0.


>          .ext2_features = CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
>          .ext3_features = CPUID_EXT3_LAHF_LM,
>          .xlevel = 0x8000000A,
> -- 
> 1.7.10.4
> 
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2013-08-07 22:12 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-31 11:02 [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and AES-NI instructions Aurelien Jarno
2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 1/7] disas/i386.c: disassemble pclmulqdq instruction Aurelien Jarno
2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 2/7] target-i386: add " Aurelien Jarno
2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 3/7] target-i386: enable PCLMULQDQ on Westmere CPU Aurelien Jarno
2013-08-07 22:12   ` Eduardo Habkost
2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 4/7] disas/i386.c: disassemble aes-ni instructions Aurelien Jarno
2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 5/7] aes: move aes.h from include/block to include/qemu Aurelien Jarno
2013-04-05 12:34   ` Stefan Hajnoczi
2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 6/7] aes: make Td[0-5] and Te[0-5] tables non static Aurelien Jarno
2013-03-31 11:02 ` [Qemu-devel] [PATCH v4 7/7] target-i386: add AES-NI instructions Aurelien Jarno
2013-03-31 23:26 ` [Qemu-devel] [PATCH v4 0/7] target-i386: add PCLMULQDQ and " Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).